OCPBUGS-4183

Upgrades from 4.11.9 to latest 4.12.x Nightly builds do not succeed

    • Important
    • None
    • Approved
    • False

      This is a clone of issue OCPBUGS-2532. The following is the description of the original issue:

      Description of problem:

      Upgrades from OCP 4.11.9 to the latest OCP 4.12 nightly builds, including 4.12.0-ec.4, fail. When the upgrade fails, typically two operators are never upgraded (all others do upgrade to the targeted 4.12.x release):
      
      dns                                        4.11.9                                     True        True          False      11h     DNS "default" reports Progressing=True: "Have 4 available DNS pods, want 5."...
      machine-config                             4.11.9                                     True        False         False      14h
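
To spot lagging operators quickly during an upgrade, the operator list can be filtered by version. The following is a minimal sketch, not part of the original report: `lagging_operators` is a hypothetical helper name, and it assumes the standard `oc get co` column order (NAME first, VERSION second).

```shell
# Hypothetical helper: print the name and version of every cluster operator
# whose VERSION column differs from the target release.
# Reads `oc get co` output on stdin; the target version is the first argument.
lagging_operators() {
  awk -v target="$1" '$1 != "NAME" && NF >= 2 && $2 != target { print $1, $2 }'
}

# Usage against a live cluster (target version chosen for illustration):
#   oc get co | lagging_operators 4.12.0-ec.4
```

With the output shown above, this would list only the dns and machine-config operators.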
      
      The dns.operator details show that only 4 of the 5 expected DNS pods are available:
      # oc describe dns.operator/default
      ...
      Status:
        Cluster Domain:  cluster.local
        Cluster IP:      172.30.0.10
        Conditions:
          Last Transition Time:  2022-10-18T03:21:44Z
          Message:               Enough DNS pods are available, and the DNS service has a cluster IP address.
          Reason:                AsExpected
          Status:                False
          Type:                  Degraded
          Last Transition Time:  2022-10-18T03:21:44Z
          Message:               Have 4 available DNS pods, want 5.
          Reason:                Reconciling
          Status:                True
          Type:                  Progressing
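
The Progressing message above can also be read programmatically, which may help when watching many upgrade attempts. This is a sketch under assumptions: `missing_dns_pods` is a hypothetical helper, and it relies on the message keeping the "Have N available DNS pods, want M." wording seen in this report.

```shell
# Hypothetical helper: given the Progressing message on stdin, print how many
# DNS pods are still missing (want minus have). Prints nothing if the message
# does not match the "Have N available DNS pods, want M." pattern.
missing_dns_pods() {
  sed -n 's/.*Have \([0-9][0-9]*\) available DNS pods, want \([0-9][0-9]*\).*/\1 \2/p' |
    while read -r have want; do
      echo $((want - have))
    done
}

# Usage against a live cluster:
#   oc get dns.operator/default \
#     -o jsonpath='{.status.conditions[?(@.type=="Progressing")].message}' |
#     missing_dns_pods
# To see which node lacks a ready DNS pod:
#   oc get pods -n openshift-dns -o wide
```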
      
      The machine config pools (mcp) report that all machines are updated:
      # oc get mcp
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      master   rendered-master-87fd457ffdaf49d75e62b532c22a9f1d   True      False      False      3              3                   3                     0                      14h
      worker   rendered-worker-7fc68009b1facf8724cd952cb08435ff   True      False      False      2              2                   2                     0                      14h
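
The same pool check can be scripted so that a stuck pool is flagged without reading the table by eye. A minimal sketch, assuming the `oc get mcp` column order shown above; `pools_not_updated` is a hypothetical helper name.

```shell
# Hypothetical helper: read `oc get mcp` output on stdin and print the name of
# every pool that is not fully updated or has degraded machines.
# Columns assumed: NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT
#                  READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
pools_not_updated() {
  awk '$1 != "NAME" && NF >= 9 && ($3 != "True" || $6 != $8 || $9 != 0) { print $1 }'
}

# Usage against a live cluster:
#   oc get mcp | pools_not_updated
```

For the output in this report the helper prints nothing, which matches the observation that the pools look healthy while the upgrade is stuck.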
      
      We have performed this upgrade many times with the same configuration. It occasionally succeeds, but most attempts fail, which suggests a timing issue.
      
      As a current workaround, recycling the control plane nodes allows the upgrade to complete successfully.
      
      A must-gather log is attached for review.

      Version-Release number of selected component (if applicable):

      Tested upgrading to all the following releases:
      4.12.0-ec.4
      4.12.0-0.nightly-s390x-2022-10-10-005931
      4.12.0-0.nightly-s390x-2022-10-15-144437

      How reproducible:

      Moderate to consistent

      Steps to Reproduce:

      1. Start with a working OCP 4.11.9 Cluster.
      2. Perform an upgrade to latest OCP 4.12.x nightly build.
      3. Monitor the upgrade status:
         # oc get clusterversion
         -> reports the percentage complete and that it is waiting on dns, which never finishes.
         # oc get co
         -> the dns and machine-config operators remain at 4.11.9
      4. The upgrade never completes.

      Actual results:

      Upgrade will never complete.

      Expected results:

      Upgrade to the targeted release succeeds.

      Additional info:

      This upgrade issue occurs on both connected and disconnected clusters.

       

            dwinship@redhat.com Dan Winship
            openshift-crt-jira-prow OpenShift Prow Bot
            Florian Leber Florian Leber
            Florian Leber, Holger Wolf, Kyle Moser (Inactive)