-
Bug
-
Resolution: Obsolete
-
Major
-
None
-
4.13
-
No
-
MCO Sprint 242
-
1
-
Rejected
-
False
-
Description of problem:
While attempting to install 2430 SNO clusters at 4.13.0-rc.5, 17 failed because the machine-config operator was in a degraded state. The operator appears to have failed to find a specific rendered machine-config. Another 6 clusters failed with this error while the etcd operator was also unavailable (https://issues.redhat.com/browse/OCPBUGS-12475), though the two issues appear to be entirely separate.
Version-Release number of selected component (if applicable):
Hub OCP - 4.12.10
SNO OCP - 4.13.0-rc.5
ACM - 2.8.0-DOWNSTREAM-2023-04-17-13-54-41
How reproducible:
Roughly 0.5% to 1% of installs hit this issue (17 of 2430 is about 0.7%), making it the second most common failure for SNO installs with 4.13.
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info:
Clusterversion and clusteroperator output from 3 affected machines:
vm00064
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
clusterversion.config.openshift.io/version False False 46h Error while reconciling 4.13.0-rc.5: the cluster operator machine-config is degraded

NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
clusteroperator.config.openshift.io/authentication 4.13.0-rc.5 True False False 27h
clusteroperator.config.openshift.io/baremetal 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/cloud-controller-manager 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/cloud-credential 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/cluster-autoscaler 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/config-operator 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/console 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/control-plane-machine-set 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/csi-snapshot-controller 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/dns 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/etcd 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/image-registry 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/ingress 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/insights 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/kube-apiserver 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/kube-controller-manager 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/kube-scheduler 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/kube-storage-version-migrator 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/machine-api 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/machine-approver 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/machine-config True True True 46h Unable to apply 4.13.0-rc.5: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, error pool master is not ready, retrying. Status: (pool degraded: true total: 1, ready 0, updated: 0, unavailable: 1)]
clusteroperator.config.openshift.io/marketplace 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/monitoring 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/network 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/node-tuning 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/openshift-apiserver 4.13.0-rc.5 True False False 22h
clusteroperator.config.openshift.io/openshift-controller-manager 4.13.0-rc.5 True False False 22h
clusteroperator.config.openshift.io/openshift-samples 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/operator-lifecycle-manager 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/operator-lifecycle-manager-catalog 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/operator-lifecycle-manager-packageserver 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/service-ca 4.13.0-rc.5 True False False 46h
clusteroperator.config.openshift.io/storage 4.13.0-rc.5 True False False 46h

vm00208
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
clusterversion.config.openshift.io/version False False 47h Error while reconciling 4.13.0-rc.5: the cluster operator machine-config is degraded

NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
clusteroperator.config.openshift.io/authentication 4.13.0-rc.5 True False False 28h
clusteroperator.config.openshift.io/baremetal 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/cloud-controller-manager 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/cloud-credential 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/cluster-autoscaler 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/config-operator 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/console 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/control-plane-machine-set 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/csi-snapshot-controller 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/dns 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/etcd 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/image-registry 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/ingress 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/insights 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/kube-apiserver 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/kube-controller-manager 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/kube-scheduler 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/kube-storage-version-migrator 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/machine-api 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/machine-approver 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/machine-config True True True 47h Unable to apply 4.13.0-rc.5: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, error pool master is not ready, retrying. Status: (pool degraded: true total: 1, ready 0, updated: 0, unavailable: 1)]
clusteroperator.config.openshift.io/marketplace 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/monitoring 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/network 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/node-tuning 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/openshift-apiserver 4.13.0-rc.5 True False False 23h
clusteroperator.config.openshift.io/openshift-controller-manager 4.13.0-rc.5 True False False 28h
clusteroperator.config.openshift.io/openshift-samples 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/operator-lifecycle-manager 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/operator-lifecycle-manager-catalog 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/operator-lifecycle-manager-packageserver 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/service-ca 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/storage 4.13.0-rc.5 True False False 47h

vm00244
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
clusterversion.config.openshift.io/version False False 47h Error while reconciling 4.13.0-rc.5: the cluster operator machine-config is degraded

NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
clusteroperator.config.openshift.io/authentication 4.13.0-rc.5 True False False 23h
clusteroperator.config.openshift.io/baremetal 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/cloud-controller-manager 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/cloud-credential 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/cluster-autoscaler 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/config-operator 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/console 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/control-plane-machine-set 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/csi-snapshot-controller 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/dns 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/etcd 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/image-registry 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/ingress 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/insights 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/kube-apiserver 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/kube-controller-manager 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/kube-scheduler 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/kube-storage-version-migrator 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/machine-api 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/machine-approver 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/machine-config True True True 47h Unable to apply 4.13.0-rc.5: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, error pool master is not ready, retrying. Status: (pool degraded: true total: 1, ready 0, updated: 0, unavailable: 1)]
clusteroperator.config.openshift.io/marketplace 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/monitoring 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/network 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/node-tuning 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/openshift-apiserver 4.13.0-rc.5 True False False 23h
clusteroperator.config.openshift.io/openshift-controller-manager 4.13.0-rc.5 True False False 28h
clusteroperator.config.openshift.io/openshift-samples 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/operator-lifecycle-manager 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/operator-lifecycle-manager-catalog 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/operator-lifecycle-manager-packageserver 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/service-ca 4.13.0-rc.5 True False False 47h
clusteroperator.config.openshift.io/storage 4.13.0-rc.5 True False False 47h
Log snippets from 3 of the machines too:
vm00064
I0425 18:55:16.235557 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool master
I0425 18:55:16.323538 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool worker
I0425 18:55:16.414195 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool master
I0425 18:55:16.538787 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool worker
I0425 18:57:05.426173 1 render_controller.go:569] BaseOSContainerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c0f2488c7d26dfe44966e870a8306e500639be45ff22a5b799192f01e2a84479
I0425 18:57:05.426232 1 render_controller.go:569] BaseOSContainerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c0f2488c7d26dfe44966e870a8306e500639be45ff22a5b799192f01e2a84479
I0425 18:57:05.426291 1 status.go:108] Degraded Machine: vm00064 and Degraded Reason: machineconfig.machineconfiguration.openshift.io "rendered-master-02da82f63661253957f9008d0fb5d5ac" not found
I0425 19:03:51.423738 1 template_controller.go:137] Re-syncing ControllerConfig due to secret pull-secret change
I0425 19:11:37.513258 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool worker
I0425 19:11:37.611012 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool master

vm00208
I0425 18:50:22.727969 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool worker
I0425 18:53:38.267324 1 template_controller.go:137] Re-syncing ControllerConfig due to secret pull-secret change
I0425 19:00:16.668388 1 status.go:108] Degraded Machine: vm00208 and Degraded Reason: machineconfig.machineconfiguration.openshift.io "rendered-master-de56dd6f50db030b1a30ad6d7eb85128" not found
I0425 19:00:16.668415 1 render_controller.go:569] BaseOSContainerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c0f2488c7d26dfe44966e870a8306e500639be45ff22a5b799192f01e2a84479
I0425 19:00:16.668389 1 render_controller.go:569] BaseOSContainerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c0f2488c7d26dfe44966e870a8306e500639be45ff22a5b799192f01e2a84479
I0425 19:02:31.643352 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool master
I0425 19:02:31.751885 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool worker
I0425 19:13:57.352102 1 template_controller.go:137] Re-syncing ControllerConfig due to secret pull-secret change
I0425 19:17:16.139579 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool master
I0425 19:17:16.237173 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool worker

vm00244
I0425 18:53:54.586699 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool worker
I0425 18:53:54.689809 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool master
I0425 18:58:25.221205 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool worker
I0425 18:58:25.409914 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool master
I0425 18:59:47.286889 1 render_controller.go:569] BaseOSContainerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c0f2488c7d26dfe44966e870a8306e500639be45ff22a5b799192f01e2a84479
I0425 18:59:47.286897 1 render_controller.go:569] BaseOSContainerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c0f2488c7d26dfe44966e870a8306e500639be45ff22a5b799192f01e2a84479
I0425 18:59:47.538265 1 status.go:108] Degraded Machine: vm00244 and Degraded Reason: machineconfig.machineconfiguration.openshift.io "rendered-master-16aaaf522c743020f7ca48697606c0a8" not found
I0425 19:11:09.740626 1 template_controller.go:137] Re-syncing ControllerConfig due to secret pull-secret change
I0425 19:16:09.597589 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool worker
I0425 19:16:09.775052 1 container_runtime_config_controller.go:888] Applied ImageConfig cluster on MachineConfigPool master
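
For triaging a larger batch of failed clusters, the "Degraded Machine ... not found" lines in the logs above could be pulled out with a small script. This is only a rough sketch, assuming the controller log text has already been collected into a string; the function name `missing_rendered_configs` and the regular expression are illustrative and not part of any OpenShift tooling.

```python
import re

# Matches the status.go "Degraded Machine" line shown in the log snippets.
# \s+ before the quoted name tolerates a line wrap in pasted logs.
DEGRADED_RE = re.compile(
    r"Degraded Machine: (?P<machine>\S+) and Degraded Reason: "
    r"machineconfig\.machineconfiguration\.openshift\.io\s+"
    r'"(?P<config>rendered-[^"]+)" not found'
)


def missing_rendered_configs(log_text):
    """Return (machine, rendered config name) pairs in order of appearance."""
    return [(m.group("machine"), m.group("config"))
            for m in DEGRADED_RE.finditer(log_text)]


# Sample taken from the vm00064 snippet in this report.
sample = (
    "I0425 18:57:05.426291 1 status.go:108] Degraded Machine: vm00064 and "
    "Degraded Reason: machineconfig.machineconfiguration.openshift.io "
    '"rendered-master-02da82f63661253957f9008d0fb5d5ac" not found\n'
    "I0425 19:03:51.423738 1 template_controller.go:137] Re-syncing "
    "ControllerConfig due to secret pull-secret change\n"
)

for machine, config in missing_rendered_configs(sample):
    print(machine, config)
```

The extracted names can then be compared against the rendered machine-configs that actually exist on each cluster.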