-
Bug
-
Resolution: Done
-
Critical
-
4.12
-
Important
-
None
-
Approved
-
False
-
This is a clone of issue OCPBUGS-2532. The following is the description of the original issue:
—
Description of problem:
Upgrades from OCP 4.11.9 to the latest OCP 4.12 nightly builds, including 4.12.0-ec.4, fail. When the upgrade fails, there are typically two operators that never get upgraded (all others do upgrade to the targeted 4.12.x release):

    dns              4.11.9   True   True    False   11h   DNS "default" reports Progressing=True: "Have 4 available DNS pods, want 5."...
    machine-config   4.11.9   True   False   False   14h

The dns.operator details show that only 4 of the 5 wanted DNS pods are available:

    # oc describe dns.operator/default
    ...
    Status:
      Cluster Domain:  cluster.local
      Cluster IP:      172.30.0.10
      Conditions:
        Last Transition Time:  2022-10-18T03:21:44Z
        Message:               Enough DNS pods are available, and the DNS service has a cluster IP address.
        Reason:                AsExpected
        Status:                False
        Type:                  Degraded
        Last Transition Time:  2022-10-18T03:21:44Z
        Message:               Have 4 available DNS pods, want 5.
        Reason:                Reconciling
        Status:                True
        Type:                  Progressing

The mcp reports that everything is good:

    # oc get mcp
    NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
    master   rendered-master-87fd457ffdaf49d75e62b532c22a9f1d   True      False      False      3              3                   3                     0                      14h
    worker   rendered-worker-7fc68009b1facf8724cd952cb08435ff   True      False      False      2              2                   2                     0                      14h

We have performed a large number of these upgrades using the same configuration. While the upgrade sometimes succeeds, a large number of the attempts fail, so this seems to be a timing issue.

As a current workaround, recycling the control plane nodes allows the upgrade to complete successfully.

A must-gather log is attached for review.
Version-Release number of selected component (if applicable):
Tested upgrading to all of the following releases:
    4.12.0-ec.4
    4.12.0-0.nightly-s390x-2022-10-10-005931
    4.12.0-0.nightly-s390x-2022-10-15-144437
How reproducible:
Moderately to consistently reproducible.
Steps to Reproduce:
1. Start with a working OCP 4.11.9 cluster.
2. Perform an upgrade to the latest OCP 4.12.x nightly build.
3. Monitor the upgrade status:
    # oc get clusterversion --> will state % complete and waiting on dns, which never finishes.
    # oc get co --> the dns and machine-config operators will remain at 4.11.9
4. The upgrade will never complete.
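For anyone reproducing this, the check in step 3 can be scripted. The following is a minimal sketch, not part of the original report: `lagging_operators` is a hypothetical helper name. It assumes the input is the output of `oc get co --no-headers`, with the operator name in the first column and its version in the second, and that the target release string is passed as the only argument.

```shell
# Hypothetical helper (not from the original report): print the cluster
# operators whose VERSION column differs from the target release.
# Reads `oc get co --no-headers` style output on stdin.
# Assumption: column 1 is the operator name and column 2 is its version.
lagging_operators() {
  target="$1"
  awk -v target="$target" 'NF >= 2 && $2 != target { print $1 " " $2 }'
}

# Usage against a live cluster (assumes `oc` is already logged in):
#   oc get co --no-headers | lagging_operators 4.12.0-ec.4
```

On a cluster stuck in the state described above, this would be expected to list only dns and machine-config at 4.11.9. An operator that reports an empty version would shift the columns, so treat the output as a quick filter rather than a definitive check.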
Actual results:
Upgrade will never complete.
Expected results:
Upgrade to the targeted release succeeds.
Additional info:
This upgrade issue occurs for both Connected and Disconnected Clusters.