OpenShift Bugs / OCPBUGS-17475

OCP 4.12.21 upgrade - ovnkube-master container in crash-loop, cannot maintain leader elect

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Normal
    • None
    • 4.12
    • apiserver-auth
    • Moderate
    • No
    • 3
    • SDN Sprint 240, SDN Sprint 241
    • 2
    • Rejected
    • False
    • Customer Escalated

      Description of problem:

      Issue: the ovnkube-master pods are in CrashLoopBackOff and force a new leader election every few seconds to minutes.
      The cluster is degraded and operators are impacted.
      
      kube-apiserver is reporting a requeuing error, indicating it is likely flooded.
      
      openshift-multus has a scheduler pod down that cannot bind an IP, with error: "error adding container to network ovn-kubernetes cni request failed with status 400" --> https://access.redhat.com/solutions/6646561 [restarting pods/databases does not mitigate the issue]
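For triage, the crash-loop symptom described above can be confirmed from the pod list. The following is a minimal sketch, not a procedure taken from this ticket: it assumes the pods live in the `openshift-ovn-kubernetes` namespace, that their names start with `ovnkube-master`, and that the `oc` CLI is logged in to the affected cluster. The helper names `find_crashlooping` and `fetch_pods` are hypothetical.

```python
import json
import subprocess


def find_crashlooping(pod_list, name_prefix="ovnkube-master"):
    """Return (pod, container, restart_count) for containers in CrashLoopBackOff.

    `pod_list` is the parsed JSON of a Kubernetes PodList. The name prefix is
    an assumption about how the ovnkube-master pods are named.
    """
    hits = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "")
        if not pod_name.startswith(name_prefix):
            continue
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for cs in statuses:
            # A crash-looping container sits in the "waiting" state
            # with reason "CrashLoopBackOff".
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                hits.append((pod_name, cs.get("name"), cs.get("restartCount", 0)))
    return hits


def fetch_pods(namespace="openshift-ovn-kubernetes"):
    """Fetch the PodList via `oc`; the namespace is an assumption."""
    result = subprocess.run(
        ["oc", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `find_crashlooping(fetch_pods())` against the cluster would list each affected container with its restart count, which gives a rough measure of how often the pods are cycling. It does not explain why leader election is being lost.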
      
      

      Version-Release number of selected component (if applicable):

      4.12.21

      How reproducible:

      customer-specific

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

       

      Expected results:

      The cluster upgrade should not stall or crash-loop the ovnkube-master pods.

      Additional info:

      The next (internal) update will include logs, sample data, and the troubleshooting performed so far. Impact is high for the customer: production is down.

            slaznick@redhat.com Stanislav Láznička
            rhn-support-wrussell Will Russell
            Anurag Saxena
            Jaime Caamaño Ruiz, Tim Rozet