OpenShift Bugs / OCPBUGS-38006

[capi] sometimes cluster-capi-operator pod stuck in CrashLoopBackOff on osp


    • Priority: Critical
    • Sprint: ShiftStack Sprint 259, ShiftStack Sprint 260
    • Release Note: Release Note Not Required
    • Status: In Progress

      Description of problem:

    The cluster-capi-operator pod sometimes gets stuck in CrashLoopBackOff on OpenStack (osp).

      Version-Release number of selected component (if applicable):

      4.17.0-0.nightly-2024-08-01-213905    

      How reproducible:

          Sometimes

      Steps to Reproduce:

    1. Create an OpenStack cluster with the TechPreviewNoUpgrade feature set.
    2. Check the status of the cluster-capi-operator pod.

      Actual results:

The cluster-capi-operator pod cycles between Running and CrashLoopBackOff:
      $ oc get po                               
      cluster-capi-operator-74dfcfcb9d-7gk98          0/1     CrashLoopBackOff   6 (2m54s ago)   41m
      
      $ oc get po         
      cluster-capi-operator-74dfcfcb9d-7gk98          1/1     Running   7 (7m52s ago)   46m
      
      $ oc get po                                                               
      cluster-capi-operator-74dfcfcb9d-7gk98          0/1     CrashLoopBackOff   7 (2m24s ago)   50m
      
      E0806 03:44:00.584669       1 kind.go:66] "kind must be registered to the Scheme" err="no kind is registered for the type v1alpha7.OpenStackCluster in scheme \"github.com/openshift/cluster-capi-operator/cmd/cluster-capi-operator/main.go:86\"" logger="controller-runtime.source.EventHandler"
      E0806 03:44:00.685539       1 controller.go:203] "Could not wait for Cache to sync" err="failed to wait for clusteroperator caches to sync: timed out waiting for cache to be synced for Kind *v1alpha7.OpenStackCluster" controller="clusteroperator" controllerGroup="config.openshift.io" controllerKind="ClusterOperator"
      I0806 03:44:00.685610       1 internal.go:516] "Stopping and waiting for non leader election runnables"
      I0806 03:44:00.685620       1 internal.go:520] "Stopping and waiting for leader election runnables"
      I0806 03:44:00.685646       1 controller.go:240] "Shutdown signal received, waiting for all workers to finish" controller="secret" controllerGroup="" controllerKind="Secret"
      I0806 03:44:00.685706       1 controller.go:240] "Shutdown signal received, waiting for all workers to finish" controller="cluster" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster"
      I0806 03:44:00.685712       1 controller.go:242] "All workers finished" controller="cluster" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster"
      I0806 03:44:00.685717       1 controller.go:240] "Shutdown signal received, waiting for all workers to finish" controller="secret" controllerGroup="" controllerKind="Secret"
      I0806 03:44:00.685722       1 controller.go:242] "All workers finished" controller="secret" controllerGroup="" controllerKind="Secret"
      I0806 03:44:00.685718       1 controller.go:242] "All workers finished" controller="secret" controllerGroup="" controllerKind="Secret"
      I0806 03:44:00.685720       1 controller.go:240] "Shutdown signal received, waiting for all workers to finish" controller="clusteroperator" controllerGroup="config.openshift.io" controllerKind="ClusterOperator"
      I0806 03:44:00.685823       1 recorder_in_memory.go:80] &Event{ObjectMeta:{dummy.17e906d425f7b2e1  dummy    0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},InvolvedObject:ObjectReference{Kind:Pod,Namespace:dummy,Name:dummy,UID:,APIVersion:v1,ResourceVersion:,FieldPath:,},Reason:CustomResourceDefinitionUpdateFailed,Message:Failed to update CustomResourceDefinition.apiextensions.k8s.io/openstackclusters.infrastructure.cluster.x-k8s.io: Put "https://172.30.0.1:443/apis/apiextensions.k8s.io/v1/customresourcedefinitions/openstackclusters.infrastructure.cluster.x-k8s.io": context canceled,Source:EventSource{Component:cluster-capi-operator-capi-installer-apply-client,Host:,},FirstTimestamp:2024-08-06 03:44:00.685748961 +0000 UTC m=+302.946052179,LastTimestamp:2024-08-06 03:44:00.685748961 +0000 UTC m=+302.946052179,Count:1,Type:Warning,EventTime:0001-01-01 00:00:00 +0000 UTC,Series:nil,Action:,Related:nil,ReportingController:,ReportingInstance:,}
      I0806 03:44:00.719743       1 capi_installer_controller.go:309] "CAPI Installer Controller is Degraded" logger="CapiInstallerController" controller="clusteroperator" controllerGroup="config.openshift.io" controllerKind="ClusterOperator" ClusterOperator="cluster-api" namespace="" name="cluster-api" reconcileID="6fa96361-4dc2-4865-b1b3-f92378c002cc"
      E0806 03:44:00.719942       1 controller.go:329] "Reconciler error" err="error during reconcile: failed to set conditions for CAPI Installer controller: failed to sync status: failed to update cluster operator status: client rate limiter Wait returned an error: context canceled" controller="clusteroperator" controllerGroup="config.openshift.io" controllerKind="ClusterOperator" ClusterOperator="cluster-api" namespace="" name="cluster-api" reconcileID="6fa96361-4dc2-4865-b1b3-f92378c002cc"
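The first error line above points at the likely root cause: controller-runtime refuses to start an event source for v1alpha7.OpenStackCluster because that kind was never added to the operator's runtime scheme. The clusteroperator controller's cache can therefore never report "synced", the wait times out, the manager shuts down, and the kubelet restarts the pod until it lands in CrashLoopBackOff. A minimal, self-contained sketch of that failure mode (the Scheme type here is a toy stand-in for controller-runtime's runtime.Scheme, and the GVK strings are illustrative, not the operator's actual code):

```go
package main

import (
	"errors"
	"fmt"
)

// Scheme is a toy stand-in for controller-runtime's runtime.Scheme:
// it only remembers which group/version/kind strings were registered.
type Scheme struct {
	kinds map[string]bool
}

func NewScheme() *Scheme { return &Scheme{kinds: map[string]bool{}} }

func (s *Scheme) Register(gvk string) { s.kinds[gvk] = true }

// StartInformer mirrors the failure in the log: watching a kind that
// was never registered to the scheme errors out immediately, so the
// cache can never report "synced" and the manager gives up and exits.
func (s *Scheme) StartInformer(gvk string) error {
	if !s.kinds[gvk] {
		return errors.New("no kind is registered for the type " + gvk)
	}
	return nil
}

func main() {
	s := NewScheme()
	// Suppose only a newer API version is registered at startup...
	s.Register("infrastructure.cluster.x-k8s.io/v1beta1, Kind=OpenStackCluster")

	// ...while a watch is established against the older one.
	err := s.StartInformer("infrastructure.cluster.x-k8s.io/v1alpha7, Kind=OpenStackCluster")
	fmt.Println(err)
}
```

In the real operator, the fix would go in one of two directions: ensure every API version it watches (here v1alpha7) is added to the scheme before the manager starts, or stop watching the stale version; which one applies depends on the versions the operator is meant to support.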

      Expected results:

    The cluster-capi-operator pod stays in Running status.

      Additional info:

          

            Martin André (maandre@redhat.com)
            Zhaohua Sun (rhn-support-zhsun)
            Itshak Brown