-
Bug
-
Resolution: Done
-
Major
-
None
-
4.13
Description of problem:
Prow jobs upgrading from 4.9 to 4.16 are failing when they upgrade from 4.12 to 4.13. Nodes become NotReady when MCO tries to apply the new 4.13 configuration to the MCPs.
The failing job is: periodic-ci-openshift-openshift-tests-private-release-4.16-amd64-nightly-4.16-upgrade-from-stable-4.9-azure-ipi-f28
We have reproduced the issue and we found an ordering cycle error in the journal log:
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 systemd-journald.service[838]: Runtime Journal (/run/log/journal/960b04f10e4f44d98453ce5faae27e84) is 8.0M, max 641.9M, 633.9M free.
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: machine-config-daemon-pull.service: Found ordering cycle on network-online.target/start
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: machine-config-daemon-pull.service: Found dependency on node-valid-hostname.service/start
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: machine-config-daemon-pull.service: Found dependency on ovs-configuration.service/start
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: machine-config-daemon-pull.service: Found dependency on firstboot-osupdate.target/start
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: machine-config-daemon-pull.service: Found dependency on machine-config-daemon-firstboot.service/start
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: machine-config-daemon-pull.service: Found dependency on machine-config-daemon-pull.service/start
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: machine-config-daemon-pull.service: Job network-online.target/start deleted to break ordering cycle starting with machine-config-daemon-pull.service/start
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: Queued start job for default target Graphical Interface.
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: systemd-journald.service: unit configures an IP firewall, but the local system does not support BPF/cgroup firewalling.
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: (This warning is only shown for the first unit using IP firewalling.)
Wed 2024-07-24 21:12:17 UTC ci-op-g94jvswm-cc71e-998q8-master-2 init.scope[1]: systemd-journald.service: Deactivated successfully.
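For triage, the chain of units that systemd reports in these messages can be pulled out of a saved journal dump. The following is a minimal sketch, not part of the original report: the function name `extract_ordering_cycles` and the regular expression are hypothetical helpers, and the code assumes the message wording shown in the log above ("Found ordering cycle on" followed by "Found dependency on" lines).

```python
import re

# Matches systemd messages of the form seen in the journal excerpt:
#   "<unit>: Found ordering cycle on <dep>/<job>"
#   "<unit>: Found dependency on <dep>/<job>"
# Assumption: the wording matches the log lines quoted in this ticket.
CYCLE_RE = re.compile(
    r"(?P<unit>\S+): Found (?P<kind>ordering cycle|dependency) on "
    r"(?P<dep>\S+)/(?P<job>\w+)"
)


def extract_ordering_cycles(journal_lines):
    """Return one list of unit names per reported ordering cycle.

    Each list starts with the unit systemd was examining, followed by the
    units named in the messages, in the order systemd printed them.
    """
    cycles = []
    current = None
    for line in journal_lines:
        match = CYCLE_RE.search(line)
        if match is None:
            continue
        if match.group("kind") == "ordering cycle":
            # A "Found ordering cycle" message opens a new report.
            current = [match.group("unit")]
            cycles.append(current)
        if current is not None:
            current.append(match.group("dep"))
    return cycles
```

Fed the journal lines above, this returns a single chain that begins and ends with machine-config-daemon-pull.service and passes through network-online.target, node-valid-hostname.service, ovs-configuration.service, firstboot-osupdate.target and machine-config-daemon-firstboot.service. The sketch only restates what systemd printed; it does not determine which unit file introduced the cycle.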
Version-Release number of selected component (if applicable):
Using IPI on Azure, these are the versions involved in the current issue when upgrading from 4.9 to 4.13:
version: 4.13.0-0.nightly-2024-07-23-154444
version: 4.12.0-0.nightly-2024-07-23-230744
version: 4.11.59
version: 4.10.67
version: 4.9.59
How reproducible:
Always
Steps to Reproduce:
1. Upgrade an IPI cluster on Azure from 4.9 to 4.13. In theory, upgrading from 4.12 to 4.13 should be enough, but we reproduced it by following the whole upgrade path.
Actual results:
Nodes become NotReady:
$ oc get nodes
NAME                                                 STATUS                        ROLES    AGE     VERSION
ci-op-g94jvswm-cc71e-998q8-master-0                  Ready                         master   6h14m   v1.25.16+306a47e
ci-op-g94jvswm-cc71e-998q8-master-1                  Ready                         master   6h13m   v1.25.16+306a47e
ci-op-g94jvswm-cc71e-998q8-master-2                  NotReady,SchedulingDisabled   master   6h13m   v1.25.16+306a47e
ci-op-g94jvswm-cc71e-998q8-worker-centralus1-c7ngb   NotReady,SchedulingDisabled   worker   6h2m    v1.25.16+306a47e
ci-op-g94jvswm-cc71e-998q8-worker-centralus2-2ppf6   Ready                         worker   6h4m    v1.25.16+306a47e
ci-op-g94jvswm-cc71e-998q8-worker-centralus3-nqshj   Ready                         worker   6h6m    v1.25.16+306a47e
On the NotReady nodes we can see the ordering cycle error mentioned in the description of this ticket.
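When checking a reproduction, the affected nodes can be picked out of saved `oc get nodes` output. This is a minimal sketch under stated assumptions: the helper name `not_ready_nodes` is hypothetical, and the code assumes the default column layout shown above (node name first, STATUS second, with conditions separated by commas).

```python
def not_ready_nodes(oc_get_nodes_output):
    """Return the names of nodes whose STATUS column includes NotReady.

    Assumes the default `oc get nodes` layout: whitespace-separated columns
    with the node name first and the status second.
    """
    nodes = []
    for line in oc_get_nodes_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] == "NAME":
            # Skip blank or truncated lines and the header row.
            continue
        name, status = fields[0], fields[1]
        # STATUS can combine several conditions, e.g. "NotReady,SchedulingDisabled".
        if "NotReady" in status.split(","):
            nodes.append(name)
    return nodes
```

With the output from this ticket, the function returns ci-op-g94jvswm-cc71e-998q8-master-2 and ci-op-g94jvswm-cc71e-998q8-worker-centralus1-c7ngb, the two nodes whose journals should then be inspected for the ordering cycle message.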
Expected results:
No ordering cycle error should occur, and the upgrade should complete without problems.
Additional info:
- blocks
-
OCPBUGS-38370 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
-
OCPBUGS-38371 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
-
OCPBUGS-38372 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
-
OCPBUGS-38373 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
-
OCPBUGS-38374 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
- is blocked by
-
MCO-1257 Impact statement request for OCPBUGS-37534 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
- is cloned by
-
OCPBUGS-38370 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
-
OCPBUGS-38371 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
-
OCPBUGS-38372 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
-
OCPBUGS-38373 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
-
OCPBUGS-38374 4.12 -> 4.13 upgrade using IPI on Azure does not work
- Closed
- links to
-
RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update