Bug | Resolution: Done | Normal | 4.19.0 | Quality / Stability / Reliability | Critical | Rejected | Metal Platform 270
Component Readiness has found a potential regression in the following test:
periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade
I analyzed 5 jobs; all had issues with nodes not coming up and with the machine API not working properly (level=error msg=Cluster operator cluster-autoscaler Degraded is True with MissingDependency: machine-api not ready).
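For reference, a minimal inspection sketch (mine, not taken from the job artifacts) of how one might confirm these symptoms against a live cluster with the Python kubernetes client; it assumes kubeconfig access and uses the standard OpenShift resource groups (config.openshift.io clusteroperators, machine.openshift.io machines):
{code:python}
# Sketch: list Degraded ClusterOperators and Machine phases to confirm the
# "machine-api not ready" symptom. Assumes a reachable cluster and kubeconfig.
from kubernetes import client, config

config.load_kube_config()
custom = client.CustomObjectsApi()

# ClusterOperators are cluster-scoped objects in config.openshift.io/v1.
operators = custom.list_cluster_custom_object("config.openshift.io", "v1", "clusteroperators")
for co in operators["items"]:
    for cond in co.get("status", {}).get("conditions", []):
        if cond.get("type") == "Degraded" and cond.get("status") == "True":
            print(co["metadata"]["name"], "Degraded:", cond.get("message", ""))

# Machines live in machine.openshift.io/v1beta1 under openshift-machine-api.
machines = custom.list_namespaced_custom_object(
    "machine.openshift.io", "v1beta1", "openshift-machine-api", "machines")
for m in machines["items"]:
    status = m.get("status", {})
    print(m["metadata"]["name"], status.get("phase"), status.get("errorReason", ""))
{code}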
1. https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade/1915807119978795008:
Two workers were "red" without clear failures, but all three working masters had the message
"Drain operation currently blocked by: [{Name:EtcdQuorumOperator Owner:clusteroperator/etcd}]", and CSRs failed to get approved (see the inspection sketch after this list).
2. https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade/1915775736111697920
One machine failed (red) with InsufficientResources ("No available BareMetalHost found"):
name: worker-user-data-managed
status:
  conditions:
  - lastTransitionTime: "2025-04-25T15:17:33Z"
    status: "True"
    type: Drainable
  - lastTransitionTime: "2025-04-25T15:17:33Z"
    message: Instance has not been created
    reason: InstanceNotCreated
    severity: Warning
    status: "False"
    type: InstanceExists
  - lastTransitionTime: "2025-04-25T15:17:33Z"
    status: "True"
    type: Terminable
  errorMessage: No available BareMetalHost found
  errorReason: InsufficientResources
  lastUpdated: "2025-04-25T15:17:33Z"
  phase: Provisioning
3. https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade/1915379017452621824
This run had node issues: six CSRs failed to get approved. Two workers were "red" without clear failures, but all three working masters had the same message as example 1:
message: 'Drain operation currently blocked by: [{Name:EtcdQuorumOperator Owner:clusteroperator/etcd}]'
reason: HookPresent
severity: Warning
status: "False"
type: Drainable
4. This job had one of two workers fail, with all masters showing the same "Drain operation currently blocked" error as examples 1 and 3, and the same CSR issue: 2 CSRs never got approved.
5. This was the first job (Apr 23 11:45:12 UTC-4) that had the worker failure while masters could not drain, along with unapproved CSR issues like examples 1, 3, and 4.
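For the unapproved-CSR and "No available BareMetalHost found" symptoms in the examples above, a similar sketch (again my own, with the same kubeconfig assumption) that lists pending CSRs and BareMetalHost provisioning states; BareMetalHosts are the metal3.io/v1alpha1 custom resources in openshift-machine-api:
{code:python}
# Sketch: list pending CSRs and BareMetalHost states to check the
# "CSRs never approved" and "No available BareMetalHost found" symptoms.
from kubernetes import client, config

config.load_kube_config()

# A CSR with no status conditions has been neither approved nor denied.
for csr in client.CertificatesV1Api().list_certificate_signing_request().items:
    if not csr.status.conditions:
        print("pending CSR:", csr.metadata.name, csr.spec.username)

# BareMetalHosts: provisioning state plus which Machine (if any) consumes them.
bmhs = client.CustomObjectsApi().list_namespaced_custom_object(
    "metal3.io", "v1alpha1", "openshift-machine-api", "baremetalhosts")
for bmh in bmhs["items"]:
    consumer = (bmh.get("spec", {}).get("consumerRef") or {}).get("name")
    print(bmh["metadata"]["name"],
          bmh.get("status", {}).get("provisioning", {}).get("state"),
          "consumer:", consumer)
{code}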
Significant regression detected.
Fisher's Exact probability of a regression: 100.00%.
Test pass rate dropped from 98.42% to 84.38%.
Sample (being evaluated) Release: 4.19
Start Time: 2025-04-18T00:00:00Z
End Time: 2025-04-25T20:00:00Z
Success Rate: 84.38%
Successes: 54
Failures: 10
Flakes: 0
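For context on the numbers above: the regression call comes from a one-sided Fisher's exact test over the pass/fail counts of the sample and basis releases. The basis counts are not included in this report, so the sketch below plugs in hypothetical basis counts (560 passes, 9 failures, which reproduce the quoted 98.42% rate); it illustrates the statistic, not Component Readiness' exact implementation:
{code:python}
# Sketch: one-sided Fisher's exact test comparing the 4.19 sample above to a
# hypothetical basis with a ~98.42% pass rate (basis counts are assumed,
# not taken from this report).
from scipy.stats import fisher_exact

sample_pass, sample_fail = 54, 10   # from the report above
basis_pass, basis_fail = 560, 9     # hypothetical, 560/569 ~= 98.42%

# alternative="less": is the sample's pass/fail odds ratio lower than the basis'?
_, p_value = fisher_exact([[sample_pass, sample_fail],
                           [basis_pass, basis_fail]],
                          alternative="less")
print(f"p-value: {p_value:.6f}")
print(f"probability of a regression (1 - p): {(1 - p_value) * 100:.2f}%")
{code}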