- Bug
- Resolution: Duplicate
- Normal
- None
- 4.13
- No
- False
Description of problem:
The machine-config operator is in a Degraded state after upgrading to a 4.13 nightly build.
Version-Release number of selected component (if applicable):
Installation: 4.11.36
First upgrade: 4.12.13
Second upgrade: 4.13.0-0.nightly-2023-04-18-005127
How reproducible:
Failed on 1 of 1 attempts.
Steps to Reproduce:
1. Install 4.11.36 (AWS - OVN - Customer VPC). Jenkins: https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/197104/
2. Scale up to 25 worker nodes (https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/e2e-benchmarking-multibranch-pipeline/job/cluster-workers-scaling/2636/)
3. Add 25 RHEL8 nodes (https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp4-rhel-scaleup/19587/)
4. Load the cluster with projects (4x50 worker nodes) (https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/e2e-benchmarking-multibranch-pipeline/job/kube-burner/2502/)
5. Upgrade to version 4.12.13 (https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/scale-ci/job/e2e-benchmarking-multibranch-pipeline/job/upgrade/595/)
Actual results:
One node is in NotReady state:
  ip-10-0-52-46.us-east-2.compute.internal   NotReady,SchedulingDisabled   worker   8h   v1.24.12+ceaf338   10.0.52.46   <none>   Red Hat Enterprise Linux 8.4 (Ootpa)   4.18.0-425.19.2.el8_7.x86_64   cri-o://1.24.5-2.rhaos4.11.gitb007cb6.el8
Event:
  Warning   ErrorReconcilingNode   8h (x9 over 8h)   controlplane   [k8s.ovn.org/node-chassis-id annotation not found for node ip-10-0-52-46.us-east-2.compute.internal, macAddress annotation not found for node "ip-10-0-52-46.us-east-2.compute.internal" , k8s.ovn.org/l3-gateway-config annotation not found for node "ip-10-0-52-46.us-east-2.compute.internal"]
The machine-config operator is in a degraded state:
  - lastTransitionTime: '2023-04-18T21:23:39Z'
    message: 'Failed to resync 4.13.0-0.nightly-2023-04-18-005127 because: failed to apply machine config daemon manifests: error during waitForDaemonsetRollout: [timed out waiting for the condition, daemonset machine-config-daemon is not ready. status: (desired: 57, updated: 57, ready: 56, unavailable: 1)]'
    reason: MachineConfigDaemonFailed
    status: 'True'
    type: Degraded
Expected results:
All nodes should be available after upgrade.
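Triage sketch (not part of the original report): the ErrorReconcilingNode event above names two annotation keys that were not found on the node. A minimal Python sketch like the one below could flag other nodes with the same gap, assuming the input is the parsed JSON from `oc get nodes -o json`. The function names are hypothetical, and the MAC address annotation is left out because the event does not give its full key.

```python
# Annotation keys copied verbatim from the ErrorReconcilingNode event text.
# The MAC address annotation is omitted: the event does not state its key.
REQUIRED_OVN_ANNOTATIONS = (
    "k8s.ovn.org/node-chassis-id",
    "k8s.ovn.org/l3-gateway-config",
)


def missing_ovn_annotations(node):
    """Return the required OVN annotation keys absent from one Node object.

    `node` is a dict in the shape of a Kubernetes Node rendered as JSON.
    """
    annotations = node.get("metadata", {}).get("annotations") or {}
    return [key for key in REQUIRED_OVN_ANNOTATIONS if key not in annotations]


def nodes_missing_annotations(node_list):
    """Map node name -> missing keys, only for nodes lacking at least one key.

    `node_list` is a dict in the shape of a Kubernetes NodeList (has "items").
    """
    report = {}
    for node in node_list.get("items", []):
        missing = missing_ovn_annotations(node)
        if missing:
            report[node["metadata"]["name"]] = missing
    return report
```

One possible use is to save the output of `oc get nodes -o json` to a file, load it with `json.load`, and pass the result to `nodes_missing_annotations`. An empty result would suggest the annotation gap is limited to nodes not present in the listing, or that the cause lies elsewhere.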