-
Bug
-
Resolution: Done
-
Major
-
None
-
4.13.z
-
+
-
Important
-
No
-
MCO Sprint 250, MCO Sprint 251
-
2
-
False
Description of problem:
While upgrading a cluster from 4.13.29 to 4.14.10, the upgrade gets stuck at the machine-config operator. The operator is in a degraded state because the ControllerConfig never completes, i.e. waitForControllerConfigToBeCompleted fails. The machine-config-controller pod repeatedly logs "Malformed Cert, not syncing", which suggests a malformed certificate.
Version-Release number of selected component (if applicable):
4.13.29
How reproducible:
Install a vSphere IPI 4.13.29 cluster and upgrade the cluster to 4.14.10
Steps to Reproduce:
1. Install a 4.13.29 cluster on vSphere using IPI
2. Upgrade the cluster to 4.14.10
3. Upgrade gets stuck at machine config operator
Actual results:
# Degraded Operator
machine-config 4.13.29 False True True

# Logs from Machine-config-operator pod -
2024-02-13T16:48:05.886202369Z I0213 16:48:05.886168 1 event.go:298] Event(v1.ObjectReference{Kind:"", Namespace:"openshift-machine-config-operator", Name:"machine-config", UID:"59e99d5a-4e8b-451d-9775-415c2665166f", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'MachineConfigControllerFailed' Cluster not available for [{operator 4.13.29}]: error during waitForControllerConfigToBeCompleted: [context deadline exceeded, controllerconfig is not completed: ControllerConfig has not completed: completed(false) running(true) failing(false)]

# Logs from machine-config-controller pod -
2024-02-13T16:54:29.235597933Z I0213 16:54:29.235592 1 template_controller.go:500] Malformed Cert, not syncing
2024-02-13T16:54:29.235668239Z I0213 16:54:29.235621 1 template_controller.go:500] Malformed Cert, not syncing
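To put a number on how often the controller emits the message, the pod log can be scanned for it. The following is only a minimal sketch: `countMessage` is a hypothetical helper, and the sample input is the two controller log lines quoted above. It assumes the log has already been saved as text, for example from a must-gather.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countMessage returns the number of lines in logs that contain msg.
// Hypothetical helper for triage; not part of any OpenShift tooling.
// Note: bufio.Scanner stops at lines longer than its default 64 KiB buffer.
func countMessage(logs, msg string) int {
	n := 0
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		if strings.Contains(sc.Text(), msg) {
			n++
		}
	}
	return n
}

func main() {
	// The two machine-config-controller lines quoted in this report.
	logs := `2024-02-13T16:54:29.235597933Z I0213 16:54:29.235592 1 template_controller.go:500] Malformed Cert, not syncing
2024-02-13T16:54:29.235668239Z I0213 16:54:29.235621 1 template_controller.go:500] Malformed Cert, not syncing`

	fmt.Println(countMessage(logs, "Malformed Cert, not syncing"))
}
```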
Expected results:
Cluster should upgrade successfully to 4.14.10
Additional info:
- To mitigate the issue, we tried deleting `controllerconfigs.machineconfiguration.openshift.io` as described in the KCS article (https://access.redhat.com/solutions/5098731), but the issue still persists.
- The cluster was installed on version 4.9.11 using the vSphere IPI approach with the OVNKubernetes CNI.