Spike
Resolution: Done
Major
BU Product Work
5
OCPSTRAT-296 - Openshift Kubelet: Pod Lifecycle Event Generator (PLEG)
OCPNODE Sprint 242 (Blue)
Enabling the Evented PLEG feature gate via the Machine Config Operator results in pods going into the "CrashLoopBackOff" or "Error" state.
MCO PR: https://github.com/openshift/machine-config-operator/pull/3917/files
The pods go into the CrashLoopBackOff state because duplicate containers are created and started within the same pod, and these duplicates then race to acquire the same resources (ports).
For example, a "bind: address already in use" error is observed on many pods:
ci-ln-09tlpi2-72292-flgxf-master-2.log:264184:Sep 18 15:39:10.886241 ci-ln-09tlpi2-72292-flgxf-master-2 kubenswrapper[2308]: time="2023-09-18T15:39:01Z" level=fatal msg="failed to create listener: failed to listen on 0.0.0.0:5443: listen tcp 0.0.0.0:5443: bind: address already in use"
ci-ln-09tlpi2-72292-flgxf-master-2.log:264236:Sep 18 15:39:11.154052 ci-ln-09tlpi2-72292-flgxf-master-2 kubenswrapper[2308]: F0918 15:38:21.629025 1 cmd.go:56] failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use
ci-ln-09tlpi2-72292-flgxf-master-2.log:265167:Sep 18 15:39:21.155012 ci-ln-09tlpi2-72292-flgxf-master-2 kubenswrapper[2308]: F0918 15:38:23.445192 1 standalone_apiserver.go:120] listen tcp 0.0.0.0:8443: bind: address already in use
ci-ln-09tlpi2-72292-flgxf-master-2.log:265182:Sep 18 15:39:21.155012 ci-ln-09tlpi2-72292-flgxf-master-2 kubenswrapper[2308]: E0918 15:38:54.680976 1 run.go:74] "command failed" err="failed to run groups: failed to listen on secure address: listen tcp :8443: bind: address already in use"
The above issue needs to be root-caused.
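To make the symptom concrete, here is a minimal, standalone Go sketch (not kubelet or CRI-O code; the port number is taken from the log excerpt above) of what happens when a duplicate container is started in the same pod network namespace: the second process that tries to bind an already-bound port fails with "bind: address already in use", exits, and the pod ends up in CrashLoopBackOff.

package main

import (
	"fmt"
	"net"
)

func main() {
	const addr = "0.0.0.0:5443" // port taken from the example log lines above

	// First container instance binds the port successfully.
	first, err := net.Listen("tcp", addr)
	if err != nil {
		fmt.Println("first listener failed:", err)
		return
	}
	defer first.Close()

	// Duplicate container instance races for the same port and fails,
	// so its process exits and the pod goes into CrashLoopBackOff.
	if _, err := net.Listen("tcp", addr); err != nil {
		fmt.Println("duplicate listener failed:", err)
		// prints: ... bind: address already in use
	}
}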
clones: OCPNODE-1704 Investigate e2e test failures due to enabling evented pleg (Closed)
is cloned by: OCPNODE-1845 [UPSTREAM] Fix Evented PLEG issue in Kubelet (Closed)