Type: Bug
Resolution: Done-Errata
Priority: Blocker
Fix Version/s: CNV v4.18.1
Affects Version/s: None
Component/s: CNV Install, Upgrade and Operators
Labels:
None

Activity Type:
Quality / Stability / Reliability
Story Points:
0.42
Blocked:
False
Blocked Reason:
None
Ready:
False
Component Fix Version(s):
CNV v4.99.0.rhel9-2064, CNV v4.18.0.rhel9-635
Git Pull Request:
https://github.com/kubevirt/hyperconverged-cluster-operator/pull/3242, https://github.com/kubevirt/monitoring/pull/277, https://github.com/kubevirt/hyperconverged-cluster-operator/pull/3251
Market:

Sprint:
CNV I/U Operators Sprint 264, CNV I/U Operators Sprint 266

Regression:
None

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Description of problem:

While triaging a failure of the kubemacpooldown test, I found that the test fails because the metric kubevirt_hco_system_health_status reports the value 3. Because the alert is critical, the metric should report 2. The value 3 does not match the design of the metric, is not mentioned in the documentation, and cannot be presented by the UI, because the only valid values are 0, 1, and 2.

The metric should report only healthy (0), warning (1), or error (2).
The metric is used to assess the Operator health, so the bug is also highly visible in the UI, in the Operator health status on the OCP Overview page.

Version-Release number of selected component (if applicable):

How reproducible:

100%

Steps to Reproduce:

1. Scale down the "cluster-network-addons-operator" deployment to zero (otherwise, it will revert the changes to kubemacpool-mac-controller-manager):

oc -n openshift-cnv scale deployment cluster-network-addons-operator --replicas=0 

2. Scale down the "kubemacpool-mac-controller-manager" deployment to zero:

oc -n openshift-cnv scale deployment kubemacpool-mac-controller-manager --replicas=0

Observe the alert:
Check that severity=critical.
Check that operator_health_impact=critical.

3. Check the value of the kubevirt_hyperconverged_operator_health_status metric.
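The check in step 3 could be automated along the following lines. This is only a minimal sketch, not the project's test code: it assumes the metric is read through the Prometheus HTTP API instant-query endpoint, whose JSON response is passed in as text. The helper names and the sample payload are hypothetical and only illustrate the out-of-range value described in this report.

```python
import json

# Value range stated in this report: healthy (0), warning (1), error (2).
ALLOWED_HEALTH_VALUES = {0: "healthy", 1: "warning", 2: "error"}


def extract_sample_values(response_text):
    """Return the numeric sample values from a Prometheus instant-query response."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    # Each vector sample has the form
    # {"metric": {...}, "value": [timestamp, "value as string"]}.
    return [float(sample["value"][1]) for sample in payload["data"]["result"]]


def out_of_range_values(values):
    """Return the values that fall outside the documented 0/1/2 range."""
    return [v for v in values if v not in ALLOWED_HEALTH_VALUES]


# Hypothetical sample payload (not captured from a real cluster) that mimics
# the behavior described in this bug: the metric reports 3.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "kubevirt_hyperconverged_operator_health_status"},
                "value": [1733825100, "3"],
            }
        ],
    },
})

if __name__ == "__main__":
    bad = out_of_range_values(extract_sample_values(SAMPLE_RESPONSE))
    print("out-of-range values:", bad)
```

An empty list from out_of_range_values means every reported sample is within the documented range; with the bug present, the list would contain the unexpected value.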

Actual results:

Expected results:

Additional info:

This bug affects 4.18

links to

RHEA-2025:146395 OpenShift Virtualization 4.18.1 Images

mentioned on

Merge request - Updated US source to: aacb2e5 Fix hco system health metric values (#3242)

Assignee: Aviv Litman

Reporter: Ohad Revah

QA Contact: Ohad Revah

Votes: 0

Watchers: 6

Created: 2024/12/10 10:05 AM

Updated: 2025/08/02 8:33 PM

Resolved: 2025/03/25 10:36 AM
