-
Bug
-
Resolution: Unresolved
-
Undefined
-
None
-
4.14.z
-
None
Description of problem:
The metric file_integrity_operator_node_failed is getting reset after restart of file-integrity-operator because of which existing alert NodeHasIntegrityFailure disappears.
Version-Release number of selected component (if applicable):
File Integrity Operator 1.3.4
How reproducible:
100%
Steps to Reproduce:
1. Install File Integrity Operator and check "Enable Operator Recommended Monitoring" checkbox. 2. Create FileIntegrity custom resource and wait for the pods to initialize fine. 3. Simulate a failure condition and wait for NodeHasIntegrityFailure alert to stream. 4. Restart file-integrity-operator pod. 5. After couple of minutes the alert NodeHasIntegrityFailure will clear itself but the "oc get fileintegritynodestatuses" reports failure for the node on which failure condition was simulated.
Actual results:
The metric is getting reset after operator pod restart.
Expected results:
The operator should check the status of existing fileintegritynodestatuses and trigger NodeHasIntegrityFailure alert accordingly.
Additional info:
- is duplicated by
-
OCPBUGS-33111 NodeHasIntegrityFailure alert not reported by the File Integrity operator
- Closed
- links to