Description of problem:
The olm ClusterOperator (CO) is Degraded.
jiazha-mac:~ jiazha$ omg get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          1h7m    Unable to apply 4.19.0-0.nightly-multi-2025-02-26-050012: the cluster operator olm is not available
jiazha-mac:~ jiazha$ omg get co olm -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  ...
spec: {}
status:
  conditions:
  - lastTransitionTime: '2025-02-26T16:25:34Z'
    message: 'CatalogdDeploymentCatalogdControllerManagerDegraded: Deployment was progressing too long OperatorcontrollerDeploymentOperatorControllerControllerManagerDegraded: Deployment was progressing too long'
    reason: CatalogdDeploymentCatalogdControllerManager_SyncError::OperatorcontrollerDeploymentOperatorControllerControllerManager_SyncError
    status: 'True'
    type: Degraded
  - lastTransitionTime: '2025-02-26T16:08:34Z'
    message: 'CatalogdDeploymentCatalogdControllerManagerProgressing: Waiting for Deployment to deploy pods OperatorcontrollerDeploymentOperatorControllerControllerManagerProgressing: Waiting for Deployment to deploy pods'
    reason: CatalogdDeploymentCatalogdControllerManager_Deploying::OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying
    status: 'True'
    type: Progressing
  - lastTransitionTime: '2025-02-26T16:08:34Z'
    message: 'CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment'
    reason: CatalogdDeploymentCatalogdControllerManager_Deploying::OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying
    status: 'False'
    type: Available
However, the `catalogd` and `operator-controller` deployments were both available and fully rolled out at that time.
jiazha-mac:~ jiazha$ omg get deploy
NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
catalogd-controller-manager   1/1     1            1           1h1m
jiazha-mac:~ jiazha$ omg get deploy -n openshift-operator-controller
NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
operator-controller-controller-manager   1/1     1            1           1h1m
jiazha-mac:~ jiazha$ omg get deploy catalogd-controller-manager -o yaml
apiVersion: apps/v1
kind: Deployment
...
status:
  availableReplicas: '1'
  conditions:
  - lastTransitionTime: '2025-02-26T16:24:35Z'
    lastUpdateTime: '2025-02-26T16:24:35Z'
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: 'True'
    type: Available
  - lastTransitionTime: '2025-02-26T16:22:42Z'
    lastUpdateTime: '2025-02-26T16:24:35Z'
    message: ReplicaSet "catalogd-controller-manager-7f855d8d48" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: 'True'
    type: Progressing
  observedGeneration: '1'
  readyReplicas: '1'
  replicas: '1'
  updatedReplicas: '1'
jiazha-mac:~ jiazha$ omg get deploy -n openshift-operator-controller operator-controller-controller-manager -o yaml
apiVersion: apps/v1
kind: Deployment
...
status:
  availableReplicas: '1'
  conditions:
  - lastTransitionTime: '2025-02-26T16:23:49Z'
    lastUpdateTime: '2025-02-26T16:23:49Z'
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: 'True'
    type: Available
  - lastTransitionTime: '2025-02-26T16:22:54Z'
    lastUpdateTime: '2025-02-26T16:23:49Z'
    message: ReplicaSet "operator-controller-controller-manager-57f648fb64" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: 'True'
    type: Progressing
  observedGeneration: '1'
  readyReplicas: '1'
  replicas: '1'
  updatedReplicas: '1'
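To make the mismatch concrete, here is a minimal Python sketch that applies a common "rollout complete" check to the `catalogd-controller-manager` status shown above. The helper names `condition` and `is_rolled_out` are hypothetical, and the criteria are an assumption about what a healthy Deployment usually looks like; this is not the logic cluster-olm-operator actually uses. Replica counts are converted with `int()` because omg prints them as quoted strings.

```python
def condition(status, cond_type):
    """Return the condition dict with the given type, or None if absent."""
    for cond in status.get("conditions", []):
        if cond.get("type") == cond_type:
            return cond
    return None


def is_rolled_out(status):
    """Assumed health check: Available, finished progressing, all replicas updated."""
    available = condition(status, "Available")
    progressing = condition(status, "Progressing")
    if available is None or progressing is None:
        return False
    replicas = int(status.get("replicas", 0))
    return (
        available.get("status") == "True"
        and progressing.get("status") == "True"
        and progressing.get("reason") == "NewReplicaSetAvailable"
        and int(status.get("updatedReplicas", 0)) == replicas
        and int(status.get("availableReplicas", 0)) == replicas
    )


# Status copied from the catalogd-controller-manager output in this report.
catalogd_status = {
    "availableReplicas": "1",
    "conditions": [
        {"type": "Available", "status": "True",
         "reason": "MinimumReplicasAvailable"},
        {"type": "Progressing", "status": "True",
         "reason": "NewReplicaSetAvailable"},
    ],
    "observedGeneration": "1",
    "readyReplicas": "1",
    "replicas": "1",
    "updatedReplicas": "1",
}

print(is_rolled_out(catalogd_status))  # prints True
```

Under these assumed criteria the Deployment counts as rolled out, which is at odds with the "Deployment was progressing too long" message on the ClusterOperator.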
Version-Release number of selected component (if applicable):
4.19.0-0.nightly-multi-2025-02-26-050012
How reproducible:
Not always
Steps to Reproduce:
This issue was encountered twice.
Actual results:
CO olm Degraded.
Expected results:
CO olm available.
Additional info:
jiazha-mac:~ jiazha$ omg project openshift-cluster-olm-operator
Now using project openshift-cluster-olm-operator
jiazha-mac:~ jiazha$ omg get pods
NAME                                    READY   STATUS    RESTARTS   AGE
cluster-olm-operator-5c6b8c4959-swxtt   0/1     Running   0          38m
jiazha-mac:~ jiazha$ omg logs cluster-olm-operator-5c6b8c4959-swxtt -c cluster-olm-operator
2025-02-26T16:31:52.648371813Z I0226 16:31:52.643085 1 cmd.go:253] Using service-serving-cert provided certificates
2025-02-26T16:31:52.648662533Z I0226 16:31:52.648619 1 leaderelection.go:121] The leader election gives 4 retries and allows for 30s of clock skew. The kube-apiserver downtime tolerance is 78s. Worst non-graceful lease acquisition is 2m43s. Worst graceful lease acquisition is {26s}.
...
2025-02-26T16:32:05.467351366Z E0226 16:32:05.467298 1 base_controller.go:279] "Unhandled Error" err="CatalogdDeploymentCatalogdControllerManager reconciliation failed: Deployment was progressing too long"
2025-02-26T16:32:06.059681614Z I0226 16:32:06.059629 1 builder.go:224] "ProxyHook updating environment" logger="builder" deployment="operator-controller-controller-manager"
2025-02-26T16:32:06.059769494Z I0226 16:32:06.059758 1 featuregates_hook.go:33] "updating environment" logger="feature_gates_hook" deployment="operator-controller-controller-manager"
2025-02-26T16:32:06.066149493Z E0226 16:32:06.066095 1 base_controller.go:279] "Unhandled Error" err="OperatorcontrollerDeploymentOperatorControllerControllerManager reconciliation failed: Deployment was progressing too long"
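As a rough illustration of the timing, the sketch below computes how long after each Deployment became Available the operator was still logging "Deployment was progressing too long". The timestamps are copied from the Deployment conditions and the operator log in this report, with the log timestamps truncated to whole seconds. The helper `parse_ts` and the variable names are hypothetical and used only for this calculation.

```python
from datetime import datetime, timezone


def parse_ts(ts):
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' timestamp into a UTC datetime."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# lastTransitionTime of the Available condition on each Deployment.
available_at = {
    "catalogd-controller-manager": parse_ts("2025-02-26T16:24:35Z"),
    "operator-controller-controller-manager": parse_ts("2025-02-26T16:23:49Z"),
}

# Time of the "reconciliation failed" log entry for each Deployment
# (sub-second part of the log timestamp dropped).
error_at = {
    "catalogd-controller-manager": parse_ts("2025-02-26T16:32:05Z"),
    "operator-controller-controller-manager": parse_ts("2025-02-26T16:32:06Z"),
}

# Seconds between the Deployment becoming Available and the error being logged.
lag_seconds = {
    name: int((error_at[name] - available_at[name]).total_seconds())
    for name in available_at
}

for name, lag in lag_seconds.items():
    print(f"{name}: error logged {lag}s after Available")
```

By this calculation the errors were logged roughly seven to eight minutes after both Deployments had reported Available, so the operator was still reporting a stalled rollout against Deployments that had already finished rolling out.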
Links to:
RHEA-2024:11038 OpenShift Container Platform 4.19.z bug fix update