Type: Bug
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.14.z
Component/s: File Integrity Operator
Labels: None
Severity: Critical
Regression: None
Blocked: False
Blocked Reason: None
Target Version: 4.18

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:
PX Review Complete:

Description of problem:

The metric file_integrity_operator_node_failed is reset when the file-integrity-operator pod restarts, which causes an already firing NodeHasIntegrityFailure alert to disappear.

Version-Release number of selected component (if applicable):

File Integrity Operator 1.3.4

How reproducible:

100%

Steps to Reproduce:

1. Install the File Integrity Operator and select the "Enable Operator Recommended Monitoring" checkbox.
2. Create a FileIntegrity custom resource and wait for the pods to initialize successfully.
3. Simulate a failure condition and wait for the NodeHasIntegrityFailure alert to fire.
4. Restart the file-integrity-operator pod.
5. After a couple of minutes, the NodeHasIntegrityFailure alert clears itself, but "oc get fileintegritynodestatuses" still reports a failure for the node on which the failure condition was simulated.

Actual results:

The metric is reset after the operator pod restarts, so the alert clears even though the node status still reports a failure.

Expected results:

After a restart, the operator should check the status of the existing fileintegritynodestatuses objects and trigger the NodeHasIntegrityFailure alert accordingly.
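
The expected behavior can be illustrated with a minimal sketch. It is not the operator's actual code (the operator is not written in Python), and it makes no claim about how the linked pull request fixes the issue. The function name `rebuild_node_failed_gauge` is hypothetical, and the shape of each status object (a `nodeName` field and a `lastResult.condition` field set to `Failed` on an integrity failure) is an assumption made for illustration. The idea is only that, on startup, the gauge values are derived from the persisted node statuses instead of starting empty.

```python
from typing import Dict, Iterable, Mapping


def rebuild_node_failed_gauge(statuses: Iterable[Mapping]) -> Dict[str, int]:
    """Derive per-node gauge values from persisted node statuses.

    Hypothetical helper: each status is assumed to look like
    {"nodeName": "...", "lastResult": {"condition": "Failed"}}.
    Returns a mapping of node name to 1 (failed) or 0 (not failed).
    """
    gauge: Dict[str, int] = {}
    for status in statuses:
        node = status.get("nodeName")
        if not node:
            # Skip objects that do not identify a node.
            continue
        # Tolerate a missing or null lastResult (assumed field name).
        condition = (status.get("lastResult") or {}).get("condition")
        gauge[node] = 1 if condition == "Failed" else 0
    return gauge


if __name__ == "__main__":
    # Made-up sample data standing in for existing fileintegritynodestatuses.
    existing = [
        {"nodeName": "worker-0", "lastResult": {"condition": "Failed"}},
        {"nodeName": "worker-1", "lastResult": {"condition": "Succeeded"}},
    ]
    print(rebuild_node_failed_gauge(existing))
```

With values rebuilt this way at startup, a node that was already failing before the restart would keep a gauge value of 1, so an alert based on that metric would not clear on its own.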

Additional info:

is duplicated by

OCPBUGS-33111 NodeHasIntegrityFailure alert not reported by the File Integrity operator

Closed

links to

KCS

openshift/file-integrity-operator#584: OCPBUGS-42807: Preserve metrics when pod crashes

Assignee: Vincent Shen

Reporter: Dhruv Gautam

QA Contact: Xiaojie Yuan

Votes: 0

Watchers: 6

Created: 2024/10/07 12:51 PM

Updated: 2025/01/06 4:52 PM
