Type: Bug
Resolution: Done-Errata
Priority: Normal
Fix Version/s: 4.16.0
Affects Version/s: 4.15, 4.16
Component/s: Bare Metal Hardware Provisioning / cluster-baremetal-operator
Labels:
- component-regression
- triaged

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
5
Severity:
Moderate
Regression:
No

Target Backport Versions:
None
Target Version:

4.16.0
Release Blocker:
None
Sprint:
Metal Platform 248, Metal Platform 249, Metal Platform 250
sprint_count:
3

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

I noticed this today while looking at Component Readiness. A ~5% decrease in stability may seem minor, but regressions like this certainly add up. This test passed 713 times in a row on 4.14. You can see today's failure here.

Details below:

-------

Component Readiness has found a potential regression in [sig-cluster-lifecycle] pathological event should not see excessive Back-off restarting failed containers.

Probability of significant regression: 99.96%

Sample (being evaluated) Release: 4.15
Start Time: 2024-01-17T00:00:00Z
End Time: 2024-01-23T23:59:59Z
Success Rate: 94.83%
Successes: 55
Failures: 3
Flakes: 0

Base (historical) Release: 4.14
Start Time: 2023-10-04T00:00:00Z
End Time: 2023-10-31T23:59:59Z
Success Rate: 100.00%
Successes: 713
Failures: 0
Flakes: 4
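
For readers who want to sanity-check the figures above, here is a minimal sketch in Python. It is not taken from Sippy's code. It assumes that flakes count as passes when computing the success rate, and that the "probability of significant regression" is one minus the p-value of a one-sided Fisher exact test on the pass/fail counts; the report does not state either of these. The function names are hypothetical.

```python
from math import comb


def success_rate(successes, failures, flakes=0):
    """Fraction of runs that passed. Assumption: flakes count as passes."""
    total = successes + failures + flakes
    return (successes + flakes) / total


def fisher_one_sided(sample_fail, sample_total, base_fail, base_total):
    """One-sided Fisher exact p-value (hypergeometric upper tail).

    Probability that at least `sample_fail` of all observed failures fall
    in the sample, if sample and base share the same true failure rate.
    """
    total = sample_total + base_total
    fails = sample_fail + base_fail
    denom = comb(total, sample_total)
    upper = min(fails, sample_total)
    tail = sum(
        comb(fails, k) * comb(total - fails, sample_total - k)
        for k in range(sample_fail, upper + 1)
    )
    return tail / denom


# Counts copied from the report above.
sample_rate = success_rate(successes=55, failures=3, flakes=0)
base_rate = success_rate(successes=713, failures=0, flakes=4)

# Sample: 58 runs, 3 failures. Base: 717 runs (713 passes + 4 flakes), 0 failures.
p_value = fisher_one_sided(sample_fail=3, sample_total=58, base_fail=0, base_total=717)

print(f"Sample success rate: {sample_rate:.2%}")
print(f"Base success rate:   {base_rate:.2%}")
print(f"1 - p (one-sided Fisher): {1 - p_value:.2%}")
```

Under these assumptions the sketch gives 94.83% and 100.00% for the two success rates and roughly 99.96% for 1 - p, in line with the reported values. Leaving the four flakes out of the base total changes the p-value only slightly.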

View the test details report at https://sippy.dptools.openshift.org/sippy-ng/component_readiness/test_details?arch=amd64&arch=amd64&baseEndTime=2023-10-31%2023%3A59%3A59&baseRelease=4.14&baseStartTime=2023-10-04%2000%3A00%3A00&capability=Other&component=Unknown&confidence=95&environment=ovn%20upgrade-minor%20amd64%20gcp%20rt&excludeArches=arm64%2Cheterogeneous%2Cppc64le%2Cs390x&excludeClouds=openstack%2Cibmcloud%2Clibvirt%2Covirt%2Cunknown&excludeVariants=hypershift%2Cosd%2Cmicroshift%2Ctechpreview%2Csingle-node%2Cassisted%2Ccompact&groupBy=cloud%2Carch%2Cnetwork&ignoreDisruption=true&ignoreMissing=false&minFail=3&network=ovn&network=ovn&pity=5&platform=gcp&platform=gcp&sampleEndTime=2024-01-23%2023%3A59%3A59&sampleRelease=4.15&sampleStartTime=2024-01-17%2000%3A00%3A00&testId=openshift-tests-upgrade%3A37f1600d4f8d75c47fc5f575025068d2&testName=%5Bsig-cluster-lifecycle%5D%20pathological%20event%20should%20not%20see%20excessive%20Back-off%20restarting%20failed%20containers&upgrade=upgrade-minor&upgrade=upgrade-minor&variant=rt&variant=rt

blocks

OCPBUGS-29787 excessive Back-off restarting failed containers

Closed

duplicates

OCPBUGS-25766 Cluster Baremetal operator should use a leader lock

Closed

OCPBUGS-29153 Cluster Baremetal operator should use a leader lock

Closed

is cloned by

OCPBUGS-29787 excessive Back-off restarting failed containers

Closed

links to

openshift/cluster-baremetal-operator#403: OCPBUGS-27760: Update the leader election durations to be tolerant

RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update


Assignee: Jacob Anders

Reporter: Brenton Leanhardt

QA Contact: Jad Haj Yahya

Need Info From: None

Votes: 0

Watchers: 10

Created: 2024/01/23 2:14 PM

Updated: 2025/07/23 11:58 PM

Resolved: 2024/06/27 11:37 AM
