OpenShift Bugs / OCPBUGS-36867

Static pod controller pods sometimes fail to start [etcd]


    • Bug
    • Resolution: Won't Do
    • Undefined
    • None
    • 4.17, 4.18
    • Etcd
    • Quality / Stability / Reliability
    • False
    • 5
    • Important
    • No
    • None
    • Rejected
    • ETCD Sprint 259, ETCD Sprint 260, ETCD Sprint 261, ETCD Sprint 262, ETCD Sprint 263
    • 5

      deads reported in this thread that the static pod controller sometimes deploys pods that do not show up within a reasonable timeframe, which occasionally causes this test to fail (source job):

      [sig-node] static pods should start after being created 
      
      {  static pod lifecycle failure - static pod: "etcd" in namespace: "openshift-etcd" for revision: 7 on node: "ci-op-h9zjcc96-51425-8gcc2-master-0" didn't show up, waited: 3m0s}
      

      David suspects that this actually happens far more often than the test failures indicate; however, this test should be a good resource for finding affected runs.

      The test details indicate that this fails up to 10% of the time on some job variants. The most commonly affected component appears to be kube-controller-manager, but apiserver and etcd both appear at times. Use the test details link to find more job runs.
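
      When scanning many job runs for this failure, the message quoted above can be broken into fields mechanically. The following is a minimal sketch, assuming every failure message follows the format of the single example in this bug; the function name and the regular expression are illustrative and are not taken from the test's source.

```python
import re
from typing import Optional

# Pattern inferred from the one failure message quoted in this bug.
# Assumption: other messages emitted by the test use the same layout.
FAILURE_RE = re.compile(
    r'static pod: "(?P<pod>[^"]+)" in namespace: "(?P<namespace>[^"]+)" '
    r'for revision: (?P<revision>\d+) on node: "(?P<node>[^"]+)" '
    r"didn't show up, waited: (?P<waited>[0-9a-z.]+)"
)


def parse_static_pod_failure(message: str) -> Optional[dict]:
    """Return the fields of a static pod lifecycle failure, or None if no match."""
    match = FAILURE_RE.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    # The revision is numeric in the message; the wait time stays a raw string.
    fields["revision"] = int(fields["revision"])
    return fields
```

      Grouping the parsed results by the pod field would be one way to check the observation that kube-controller-manager is affected most often.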

      Slack thread has more details from both deads@redhat.com and tjungblu@redhat.com.

      The suspicion is that fixing this could improve install times and reliability.

        1. image-2024-08-02-12-31-04-963.png (86 kB)
        2. image-2024-08-02-12-32-01-567.png (78 kB)
        3. image-2024-08-02-12-38-20-669.png (80 kB)
        4. image-2024-08-02-12-39-20-189.png (81 kB)
        5. image-2024-08-02-12-46-28-665.png (125 kB)
        6. image-2024-08-02-12-46-51-229.png (82 kB)
        7. HA 4.17.png (291 kB)
        8. SNO 4.17.png (247 kB)
        9. HA 4.18.png (278 kB)
        10. SNO 4.18.png (298 kB)

              rhn-coreos-htariq Haseeb Tariq
              rhn-engineering-dgoodwin Devan Goodwin
              None
              None
              Ge Liu Ge Liu
              None