Bug
Resolution: Unresolved
Undefined
None
4.22
Summary
Since Feb 12, the capi-controllers and capi-operator pods crash-loop on every run of periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-ipv6-techpreview. This causes a cascade of failures:
- ClusterOperatorDegraded alert fires for ~1.5h
- Pathological Back-off restarting events detected
- Cluster stability checks fail
- Operators reported as not on the cluster version
The job has failed 18 consecutive times since Feb 12, with 0 passes. The last passing run was Feb 10.
Although the symptoms match OCPBUGS-74161, this is NOT a recurrence of that bug's cause: the offending change (cluster-api-provider-metal3#57) was reverted on Jan 21, and no new PRs have merged in that repo since. The root cause is different.
Error Output
All failures show the same pattern:
32 events happened too frequently
event [namespace/openshift-cluster-api pod/capi-controllers-* - Back-off restarting failed container]
event [namespace/openshift-cluster-api-operator pod/capi-operator-* - Back-off restarting failed container]
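When triaging many runs, it can help to pull the namespace and pod out of the pathological-event line mechanically. The following is a minimal sketch, not the CI tooling's own parser: the regular expression and the helper names `parse_events` and `reported_count` are hypothetical, and the pattern assumes each event is rendered exactly as in the line quoted above.

```python
import re

# Event line copied from this report; pod names are wildcarded there as well.
EVENT_LINE = (
    "32 events happened too frequently "
    "event [namespace/openshift-cluster-api pod/capi-controllers-* "
    "- Back-off restarting failed container] "
    "event [namespace/openshift-cluster-api-operator pod/capi-operator-* "
    "- Back-off restarting failed container]"
)

# Hypothetical pattern: assumes the shape
# "event [namespace/<ns> pod/<pod> - <message>]".
EVENT_RE = re.compile(
    r"event \[namespace/(?P<namespace>\S+) pod/(?P<pod>\S+) - (?P<message>[^\]]+)\]"
)


def parse_events(line):
    """Return one dict (namespace, pod, message) per event found in the line."""
    return [m.groupdict() for m in EVENT_RE.finditer(line)]


def reported_count(line):
    """Return the leading event count, or None if the line has no such prefix."""
    m = re.match(r"(\d+) events happened too frequently", line)
    return int(m.group(1)) if m else None


if __name__ == "__main__":
    print(reported_count(EVENT_LINE))
    for event in parse_events(EVENT_LINE):
        print(event["namespace"], event["pod"], event["message"])
```

Run against the line above, this yields two events, one per affected namespace, which matches the two crash-looping pods named in the summary.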
ClusterOperatorDegraded alert:
ClusterOperatorDegraded was at or above info for at least 1h29m on metal/amd64/ovn/ha (maxAllowed=1s): firing for 1h29m
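To put the alert message in perspective, the firing duration can be compared with the allowed maximum. This is a small sketch under stated assumptions: `parse_duration_seconds` is a hypothetical helper that only understands whole-number `h`, `m`, and `s` components as they appear in the message above (it does not handle units such as `ms`).

```python
import re

# Seconds per unit; only the units seen in the alert message are supported.
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration_seconds(text):
    """Convert a duration such as '1h29m' or '1s' to seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    return sum(int(value) * UNIT_SECONDS[unit] for value, unit in parts)


if __name__ == "__main__":
    firing = parse_duration_seconds("1h29m")   # duration from the alert message
    allowed = parse_duration_seconds("1s")     # maxAllowed from the alert message
    print(firing, allowed, firing / 3600)
```

The firing duration works out to 5340 seconds (89 minutes), consistent with the roughly 1.5h figure in the summary and far above the 1s allowance.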
Affected Variant
- Platform: metal
- Network: ovn (IPv6)
- FeatureSet: techpreview
- Topology: ha
- Architecture: amd64
- Installer: ipi
- Upgrade: none
Regression Details
| Regression ID | Test Name | Max Failures |
|---|---|---|
| 35504 | pathological event: excessive Back-off restarting | 17 |
| 35506 | verify the cluster readiness and stability | 23 |
| 35503 | operators on the cluster version | 17 |
| 35500 | ClusterOperatorDegraded alert | 18 |
| 35482 | container restarts in ns/openshift-cluster-api | open |
| 35502 | container restarts in ns/openshift-cluster-api-operator | open |
Timeline
- Last passing run: 2026-02-10T20:21:33Z
- First failing run: 2026-02-12T10:58:08Z
- First failing payload: 4.22.0-0.nightly-2026-02-12-105448
- Regression opened in Component Readiness: 2026-02-14 to 2026-02-15
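The two run timestamps bound the window in which the regression was introduced. As a quick sketch of that arithmetic (the variable names are illustrative, and the trailing `Z` is written as `+00:00` so that `datetime.fromisoformat` also accepts it on Python versions before 3.11):

```python
from datetime import datetime

# Timestamps from the timeline above, expressed with an explicit UTC offset.
last_pass = datetime.fromisoformat("2026-02-10T20:21:33+00:00")
first_fail = datetime.fromisoformat("2026-02-12T10:58:08+00:00")

# Any change responsible for the crash-loop landed within this window.
window = first_fail - last_pass

if __name__ == "__main__":
    print(window)                          # 1 day, 14:36:35
    print(window.total_seconds() / 3600)   # window length in hours
```

The window is about 38.6 hours, so the search can be narrowed to payloads built between the last passing run and 4.22.0-0.nightly-2026-02-12-105448.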
Payload Analysis
Payload 4.22.0-0.nightly-2026-02-12-105448 contained 9 new PRs, none of which directly modifies cluster-api-provider-metal3. No new PRs have been merged in openshift/cluster-api-provider-metal3 since the Jan 21 revert (PR #60).
Related
- OCPBUGS-74161 - Previous occurrence of these symptoms, caused by cluster-api-provider-metal3#57 (since reverted). The current failure has a different root cause.
- Sample failed run with 32 pathological events