-
Bug
-
Resolution: Cannot Reproduce
-
Major
-
None
-
4.14, 4.14.z
-
No
-
Rejected
-
False
-
Description of problem:
During a loaded upgrade from 4.14.0-ec.4 to 4.14.0-0.nightly-2023-09-02-132842, the upgrade fails with the message "wait has exceeded 40 minutes for these operators: network". DNS pods are stuck in Terminating, and multus/ovnkube pods are in CrashLoopBackOff.
Version-Release number of selected component (if applicable):
4.14
How reproducible:
100%
Steps to Reproduce:
1. Create a cluster.
2. Scale the cluster to 120 nodes.
3. Load the cluster using the steps below:
   git clone https://github.com/cloud-bulldozer/e2e-benchmarking.git
   cd e2e-benchmarking/workloads/kube-burner-ocp-wrapper
   export WORKLOAD=cluster-density-v2
   export ITERATIONS=1080
   ./run.sh
4. Upgrade the cluster to the 4.14 nightly:
   oc adm upgrade --to-image registry.ci.openshift.org/ocp/release:4.14.0-0.nightly-2023-09-02-132842 --force --allow-explicit-upgrade
Actual results:
The network cluster operator reports errors, and the upgrade cannot continue or finish.
Expected results:
All cluster operators are updated and the cluster reaches the nightly version without issues.
Additional info:
oc get co:
                    NAME                                      VERSION                             AVAILABLE  PROGRESSING  DEGRADED  SINCE  MESSAGE
09-06 13:10:10.574  authentication                            4.14.0-0.nightly-2023-09-02-132842  True       False        False     107m
09-06 13:10:10.574  baremetal                                 4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
09-06 13:10:10.574  cloud-controller-manager                  4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h39m
09-06 13:10:10.574  cloud-credential                          4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h39m
09-06 13:10:10.574  cluster-autoscaler                        4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
09-06 13:10:10.574  config-operator                           4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
09-06 13:10:10.574  console                                   4.14.0-0.nightly-2023-09-02-132842  True       False        False     107m
09-06 13:10:10.574  control-plane-machine-set                 4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h33m
09-06 13:10:10.574  csi-snapshot-controller                   4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
09-06 13:10:10.574  dns                                       4.14.0-ec.4                         True       True         False     4h36m  DNS "default" reports Progressing=True: "Have 114 available DNS pods, want 127.
09-06 13:10:10.574  Have 110 up-to-date DNS pods, want 127."...
09-06 13:10:10.574  etcd                                      4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h34m
09-06 13:10:10.574  image-registry                            4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h29m
09-06 13:10:10.574  ingress                                   4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h30m
09-06 13:10:10.574  insights                                  4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h30m
09-06 13:10:10.574  kube-apiserver                            4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h31m
09-06 13:10:10.574  kube-controller-manager                   4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h33m
09-06 13:10:10.574  kube-scheduler                            4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h33m
09-06 13:10:10.575  kube-storage-version-migrator             4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
09-06 13:10:10.575  machine-api                               4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h32m
09-06 13:10:10.575  machine-approver                          4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
09-06 13:10:10.575  machine-config                            4.14.0-ec.4                         True       False        False     4h34m
09-06 13:10:10.575  marketplace                               4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
09-06 13:10:10.575  monitoring                                4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h29m
09-06 13:10:10.575  network                                   4.14.0-0.nightly-2023-09-02-132842  True       True         True      4h38m  DaemonSet "/openshift-multus/multus" rollout is not making progress - pod multus-fktlr is in CrashLoopBackOff State...
09-06 13:10:10.575  node-tuning                               4.14.0-0.nightly-2023-09-02-132842  True       False        False     124m
09-06 13:10:10.575  openshift-apiserver                       4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h29m
09-06 13:10:10.575  openshift-controller-manager              4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h32m
09-06 13:10:10.575  openshift-samples                         4.14.0-0.nightly-2023-09-02-132842  True       False        False     126m
09-06 13:10:10.575  operator-lifecycle-manager                4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
09-06 13:10:10.575  operator-lifecycle-manager-catalog        4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
09-06 13:10:10.575  operator-lifecycle-manager-packageserver  4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h30m
09-06 13:10:10.575  service-ca                                4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
09-06 13:10:10.575  storage                                   4.14.0-0.nightly-2023-09-02-132842  True       False        False     4h36m
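To pick out the operators that have not settled from a listing like the one above, a small filter may help. The sketch below is not part of the original report: the function name unsettled_operators is hypothetical, and it assumes raw `oc get co` text on stdin, without the CI log timestamps that prefix each row in the capture above. It prints every operator that is not at the given target version, or that is unavailable, progressing, or degraded.

```shell
# Hypothetical helper (not from the original report).
# Reads plain `oc get co` output on stdin; $1 is the expected target version.
unsettled_operators() {
  target="$1"
  awk -v target="$target" '
    $1 == "NAME" { next }                    # skip the header row
    $3 != "True" && $3 != "False" { next }   # skip wrapped message lines and blanks
    $2 != target || $3 != "True" || $4 != "False" || $5 != "False" {
      print $1, $2, "available=" $3, "progressing=" $4, "degraded=" $5
    }
  '
}
```

Possible usage against a live cluster: oc get co | unsettled_operators 4.14.0-0.nightly-2023-09-02-132842. For the state captured in this report, such a filter would be expected to flag dns and machine-config (still on 4.14.0-ec.4) and network (progressing and degraded).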