Description of problem:
CO olm Degraded.
jiazha-mac:~ jiazha$ omg get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version False True 1h7m Unable to apply 4.19.0-0.nightly-multi-2025-02-26-050012: the cluster operator olm is not available
jiazha-mac:~ jiazha$ omg get co olm -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
...
spec: {}
status:
conditions:
- lastTransitionTime: '2025-02-26T16:25:34Z'
message: 'CatalogdDeploymentCatalogdControllerManagerDegraded: Deployment was
progressing too long
OperatorcontrollerDeploymentOperatorControllerControllerManagerDegraded: Deployment
was progressing too long'
reason: CatalogdDeploymentCatalogdControllerManager_SyncError::OperatorcontrollerDeploymentOperatorControllerControllerManager_SyncError
status: 'True'
type: Degraded
- lastTransitionTime: '2025-02-26T16:08:34Z'
message: 'CatalogdDeploymentCatalogdControllerManagerProgressing: Waiting for
Deployment to deploy pods
OperatorcontrollerDeploymentOperatorControllerControllerManagerProgressing:
Waiting for Deployment to deploy pods'
reason: CatalogdDeploymentCatalogdControllerManager_Deploying::OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying
status: 'True'
type: Progressing
- lastTransitionTime: '2025-02-26T16:08:34Z'
message: 'CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment
OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting
for Deployment'
reason: CatalogdDeploymentCatalogdControllerManager_Deploying::OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying
status: 'False'
type: Available
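For triage, it can help to know how long the operator reported Progressing before it flipped to Degraded. The following is a minimal sketch that computes that gap from the `lastTransitionTime` values shown above; the helper names (`parse_ts`, `seconds_between`) are illustrative and not part of any OpenShift tooling.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse a UTC timestamp such as '2025-02-26T16:25:34Z'."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def seconds_between(earlier: str, later: str) -> float:
    """Return the number of seconds from `earlier` to `later`."""
    return (parse_ts(later) - parse_ts(earlier)).total_seconds()


# lastTransitionTime values copied from the ClusterOperator status above.
conditions = {
    "Degraded": "2025-02-26T16:25:34Z",
    "Progressing": "2025-02-26T16:08:34Z",
    "Available": "2025-02-26T16:08:34Z",
}

gap = seconds_between(conditions["Progressing"], conditions["Degraded"])
print(f"Degraded was set {gap / 60:.0f} minutes after Progressing")
```

With the timestamps above, the gap is 17 minutes.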
However, the `catalogd` and `operator-controller` deployments were both healthy at that time.
jiazha-mac:~ jiazha$ omg get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
catalogd-controller-manager 1/1 1 1 1h1m
jiazha-mac:~ jiazha$ omg get deploy -n openshift-operator-controller
NAME READY UP-TO-DATE AVAILABLE AGE
operator-controller-controller-manager 1/1 1 1 1h1m
jiazha-mac:~ jiazha$ omg get deploy catalogd-controller-manager -o yaml
apiVersion: apps/v1
kind: Deployment
...
status:
availableReplicas: '1'
conditions:
- lastTransitionTime: '2025-02-26T16:24:35Z'
lastUpdateTime: '2025-02-26T16:24:35Z'
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: 'True'
type: Available
- lastTransitionTime: '2025-02-26T16:22:42Z'
lastUpdateTime: '2025-02-26T16:24:35Z'
message: ReplicaSet "catalogd-controller-manager-7f855d8d48" has successfully progressed.
reason: NewReplicaSetAvailable
status: 'True'
type: Progressing
observedGeneration: '1'
readyReplicas: '1'
replicas: '1'
updatedReplicas: '1'
jiazha-mac:~ jiazha$ omg get deploy -n openshift-operator-controller operator-controller-controller-manager -o yaml
apiVersion: apps/v1
kind: Deployment
...
status:
availableReplicas: '1'
conditions:
- lastTransitionTime: '2025-02-26T16:23:49Z'
lastUpdateTime: '2025-02-26T16:23:49Z'
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: 'True'
type: Available
- lastTransitionTime: '2025-02-26T16:22:54Z'
lastUpdateTime: '2025-02-26T16:23:49Z'
message: ReplicaSet "operator-controller-controller-manager-57f648fb64" has successfully progressed.
reason: NewReplicaSetAvailable
status: 'True'
type: Progressing
observedGeneration: '1'
readyReplicas: '1'
replicas: '1'
updatedReplicas: '1'
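To make the "deployments were healthy" claim checkable, here is a rough sketch of a rollout check applied to the catalogd Deployment status shown above. `rollout_complete` is a hypothetical helper written for this report; it is not the check that cluster-olm-operator performs, and the criteria (Available=True, Progressing=True with reason NewReplicaSetAvailable, matching replica counts) are an assumption about what "rolled out" should mean here.

```python
def rollout_complete(status: dict) -> bool:
    """Return True when a Deployment status looks fully rolled out.

    Hypothetical helper for illustration; not the operator's actual logic.
    """
    conds = {c["type"]: c for c in status.get("conditions", [])}
    available = conds.get("Available", {}).get("status") == "True"
    progressing = conds.get("Progressing", {})
    progressed = (
        progressing.get("status") == "True"
        and progressing.get("reason") == "NewReplicaSetAvailable"
    )
    # Integer fields appear as quoted strings in the output above,
    # so normalise them with int() before comparing.
    replicas = int(status.get("replicas", 0))
    counts_match = all(
        int(status.get(field, 0)) == replicas
        for field in ("updatedReplicas", "readyReplicas", "availableReplicas")
    )
    return available and progressed and counts_match


# Status fields copied from the catalogd-controller-manager output above.
catalogd_status = {
    "availableReplicas": "1",
    "conditions": [
        {"type": "Available", "status": "True", "reason": "MinimumReplicasAvailable"},
        {"type": "Progressing", "status": "True", "reason": "NewReplicaSetAvailable"},
    ],
    "observedGeneration": "1",
    "readyReplicas": "1",
    "replicas": "1",
    "updatedReplicas": "1",
}

print(rollout_complete(catalogd_status))
```

Under these assumed criteria the catalogd Deployment counts as rolled out, which is at odds with the "Deployment was progressing too long" message on the ClusterOperator.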
Version-Release number of selected component (if applicable):
4.19.0-0.nightly-multi-2025-02-26-050012
How reproducible:
Not always
Steps to Reproduce:
Encountered this issue twice; no reproduction steps were recorded.
Actual results:
CO olm Degraded.
Expected results:
CO olm is Available.
Additional info:
jiazha-mac:~ jiazha$ omg project openshift-cluster-olm-operator
Now using project openshift-cluster-olm-operator
jiazha-mac:~ jiazha$ omg get pods
NAME READY STATUS RESTARTS AGE
cluster-olm-operator-5c6b8c4959-swxtt 0/1 Running 0 38m
jiazha-mac:~ jiazha$ omg logs cluster-olm-operator-5c6b8c4959-swxtt -c cluster-olm-operator
2025-02-26T16:31:52.648371813Z I0226 16:31:52.643085 1 cmd.go:253] Using service-serving-cert provided certificates
2025-02-26T16:31:52.648662533Z I0226 16:31:52.648619 1 leaderelection.go:121] The leader election gives 4 retries and allows for 30s of clock skew. The kube-apiserver downtime tolerance is 78s. Worst non-graceful lease acquisition is 2m43s. Worst graceful lease acquisition is {26s}.
...
2025-02-26T16:32:05.467351366Z E0226 16:32:05.467298 1 base_controller.go:279] "Unhandled Error" err="CatalogdDeploymentCatalogdControllerManager reconciliation failed: Deployment was progressing too long"
2025-02-26T16:32:06.059681614Z I0226 16:32:06.059629 1 builder.go:224] "ProxyHook updating environment" logger="builder" deployment="operator-controller-controller-manager"
2025-02-26T16:32:06.059769494Z I0226 16:32:06.059758 1 featuregates_hook.go:33] "updating environment" logger="feature_gates_hook" deployment="operator-controller-controller-manager"
2025-02-26T16:32:06.066149493Z E0226 16:32:06.066095 1 base_controller.go:279] "Unhandled Error" err="OperatorcontrollerDeploymentOperatorControllerControllerManager reconciliation failed: Deployment was progressing too long"
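When scanning a longer operator log for the same symptom, a small filter can pull out just the failing controllers. The sketch below assumes the `err="<controller> reconciliation failed: <reason>"` shape seen in the two error lines above; the pattern and the `failures` helper are illustrative only.

```python
import re

# Matches the err="<controller> reconciliation failed: <reason>" fragment
# as it appears in the error lines above.
PATTERN = re.compile(
    r'err="(?P<controller>\S+) reconciliation failed: (?P<reason>[^"]+)"'
)


def failures(lines):
    """Return (controller, reason) pairs for every reconciliation failure."""
    found = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            found.append((match.group("controller"), match.group("reason")))
    return found


# Lines copied from the cluster-olm-operator log above.
sample = [
    '2025-02-26T16:32:05.467351366Z E0226 16:32:05.467298 1 base_controller.go:279] "Unhandled Error" err="CatalogdDeploymentCatalogdControllerManager reconciliation failed: Deployment was progressing too long"',
    '2025-02-26T16:32:06.059681614Z I0226 16:32:06.059629 1 builder.go:224] "ProxyHook updating environment" logger="builder" deployment="operator-controller-controller-manager"',
    '2025-02-26T16:32:06.066149493Z E0226 16:32:06.066095 1 base_controller.go:279] "Unhandled Error" err="OperatorcontrollerDeploymentOperatorControllerControllerManager reconciliation failed: Deployment was progressing too long"',
]

for controller, reason in failures(sample):
    print(f"{controller}: {reason}")
```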
Links to:
RHEA-2024:11038 (OpenShift Container Platform 4.19.z bug fix update)