  1. OpenShift Node
  2. OCPNODE-1818

Debug the failures while enabling EventedPLEG featuregate in Kubelet


    • Type: Spike
    • Resolution: Done
    • Priority: Major
    • BU Product Work
    • 5
    • OCPSTRAT-296 - Openshift Kubelet: Pod Lifecycle Event Generator (PLEG)
    • OCPNODE Sprint 242 (Blue)

      Enabling the EventedPLEG feature gate via the Machine Config Operator (MCO) results in pods going into the "CrashLoopBackOff" or "Error" state.
      MCO Branch: https://github.com/openshift/machine-config-operator/pull/3917/files 

      The pods go into the CrashLoopBackOff state because duplicate containers are created and started within the same pod, and these duplicates then race to acquire the same resources (ports).
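
The port race described above can be reproduced in isolation. The following is a minimal sketch, not code from the kubelet or from any affected workload: it assumes two processes (here, two sockets in one process) try to listen on the same TCP port without `SO_REUSEADDR` or `SO_REUSEPORT`, which is the situation a duplicate container would be in. The helper names `try_bind` and `second_bind_errno` are hypothetical.

```python
import errno
import socket
from typing import Optional


def try_bind(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Bind and listen on host:port, returning the listening socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def second_bind_errno() -> Optional[int]:
    """Start one listener, then try a second one on the same port.

    Returns the errno raised by the second bind, or None if it succeeded.
    """
    first = try_bind(0)  # port 0 lets the OS pick a free port
    port = first.getsockname()[1]
    try:
        second = try_bind(port)
    except OSError as exc:
        return exc.errno
    else:
        second.close()
        return None
    finally:
        first.close()


if __name__ == "__main__":
    err = second_bind_errno()
    # On Linux the second listener is expected to fail with EADDRINUSE.
    print("second bind failed with EADDRINUSE:", err == errno.EADDRINUSE)
```

The second listener plays the role of the duplicate container: whichever copy starts later fails at bind time and exits, which would then surface as a crashing container.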

      Example: the "bind: address already in use" error is observed on many pods.

      ci-ln-09tlpi2-72292-flgxf-master-2.log:264184:Sep 18 15:39:10.886241 ci-ln-09tlpi2-72292-flgxf-master-2 kubenswrapper[2308]:         time="2023-09-18T15:39:01Z" level=fatal msg="failed to create listener: failed to listen on 0.0.0.0:5443: listen tcp 0.0.0.0:5443: bind: address already in use"
      ci-ln-09tlpi2-72292-flgxf-master-2.log:264236:Sep 18 15:39:11.154052 ci-ln-09tlpi2-72292-flgxf-master-2 kubenswrapper[2308]:         F0918 15:38:21.629025       1 cmd.go:56] failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use
      ci-ln-09tlpi2-72292-flgxf-master-2.log:265167:Sep 18 15:39:21.155012 ci-ln-09tlpi2-72292-flgxf-master-2 kubenswrapper[2308]:         F0918 15:38:23.445192       1 standalone_apiserver.go:120] listen tcp 0.0.0.0:8443: bind: address already in use
      ci-ln-09tlpi2-72292-flgxf-master-2.log:265182:Sep 18 15:39:21.155012 ci-ln-09tlpi2-72292-flgxf-master-2 kubenswrapper[2308]:         E0918 15:38:54.680976       1 run.go:74] "command failed" err="failed to run groups: failed to listen on secure address: listen tcp :8443: bind: address already in use"
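
To see which ports are affected across a large node log, lines like the ones above can be tallied by port. This is a rough sketch under the assumption that every conflict is reported with the `listen tcp <host>:<port>: bind: address already in use` wording seen in these excerpts; the regular expression, the function name `count_conflicting_ports`, and the `SAMPLE` list (message tails copied from the excerpts above) are illustrative only.

```python
import re
from collections import Counter
from typing import Iterable

# Matches both "listen tcp 0.0.0.0:5443: ..." and "listen tcp :8443: ...".
BIND_ERROR = re.compile(
    r"listen tcp (?P<host>[^\s:]*):(?P<port>\d+): bind: address already in use"
)


def count_conflicting_ports(lines: Iterable[str]) -> Counter:
    """Count 'address already in use' errors per TCP port."""
    ports: Counter = Counter()
    for line in lines:
        match = BIND_ERROR.search(line)
        if match:
            ports[int(match.group("port"))] += 1
    return ports


# Message tails taken from the log excerpts quoted in this issue.
SAMPLE = [
    'level=fatal msg="failed to create listener: failed to listen on 0.0.0.0:5443: listen tcp 0.0.0.0:5443: bind: address already in use"',
    'cmd.go:56] failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use',
    'standalone_apiserver.go:120] listen tcp 0.0.0.0:8443: bind: address already in use',
    'run.go:74] "command failed" err="failed to run groups: failed to listen on secure address: listen tcp :8443: bind: address already in use"',
]

if __name__ == "__main__":
    for port, hits in sorted(count_conflicting_ports(SAMPLE).items()):
        print(port, hits)
```

Against the full attached log, the same function could be fed the open file object line by line, since it only needs an iterable of strings.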

      The root cause of the above issue needs to be identified.

        1. ci-ln-09tlpi2-72292-flgxf-master-2.log
          68.08 MB
        2. crio.ign
          5 kB
        3. image_config.yaml
          0.1 kB

              svanka@redhat.com Sai Ramesh Vanka