Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.16
Component/s: Etcd
Labels:
- olmv0

Activity Type:
None
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
Rejected
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

Several Azure-hosted OpenShift clusters consistently trigger the CsvAbnormalFailedOver2Min alert after upgrading from 4.14 to 4.16, even though all ClusterServiceVersions (CSVs) and dependent operator resources are healthy. The alert appears to be a false positive: investigation shows no actual downtime or unavailability. OLM pod logs report ComponentUnhealthy warnings for deployments that are in fact fully available.

Version-Release number of selected component (if applicable):

- OpenShift Container Platform: 4.16.x
- Azure-hosted clusters

How reproducible:

No

Steps to Reproduce:

    1.
    2.
    3.

Actual results:

CsvAbnormalFailedOver2Min alert fires even though:

- CSV and operator instances are in Succeeded phase
- All pods are available, running, and stable
- No real operator failure detected
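
To help confirm the first point on an affected cluster, here is a minimal sketch (not part of the original report) that lists every CSV whose phase is not Succeeded. It assumes the `oc` CLI is logged in to the cluster and that each CSV exposes its phase at `status.phase`; the helper names `unhealthy_csvs` and `fetch_csvs` are hypothetical and chosen for illustration only.

```python
import json
import subprocess
from typing import Any


def unhealthy_csvs(csv_list: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Return (namespace, name, phase) for every CSV whose phase is not Succeeded.

    `csv_list` is the parsed JSON of `oc get csv -A -o json`.
    A CSV with no status.phase is reported with phase "Unknown".
    """
    result = []
    for item in csv_list.get("items", []):
        meta = item.get("metadata", {})
        phase = item.get("status", {}).get("phase", "Unknown")
        if phase != "Succeeded":
            result.append((meta.get("namespace", ""), meta.get("name", ""), phase))
    return result


def fetch_csvs() -> dict[str, Any]:
    """Fetch all CSVs across namespaces (assumes a logged-in `oc` CLI)."""
    completed = subprocess.run(
        ["oc", "get", "csv", "-A", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    problems = unhealthy_csvs(fetch_csvs())
    if not problems:
        print("All CSVs report phase Succeeded")
    for namespace, name, phase in problems:
        print(f"{namespace}/{name}: {phase}")
```

If this reports no unhealthy CSVs while the alert is still firing, that would be consistent with the false-positive behavior described here.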

Expected results:

- The alert should only fire for actual failed deployments or unavailable operators
- No alert should trigger when the CSV is fully available and healthy

Additional info:

- Issue is observed only on Azure-hosted clusters after a minor-version upgrade from 4.14 → 4.16
- Must-gather will be attached for engineering review
- No impact noticed on workloads, operator availability, or performance
- Appears to be caused by transient conditions incorrectly interpreted as persistent failures by OLM alert logic

Assignee: Dean West

Reporter: Harshal Thakare

Need Info From: None

Contributors: None

QA Contact: Jian Zhang

Doc Contact: None

Votes: 0

Watchers: 4

Created: 2025/12/04 6:13 PM

Updated: 2025/12/15 1:05 PM
