Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-42807

Metric file_integrity_operator_node_failed of file integrity operator reset after pod restart

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Undefined Undefined
    • None
    • 4.14.z
    • None
    • Critical
    • None
    • False
    • Hide

      None

      Show
      None

      Description of problem:

      The metric file_integrity_operator_node_failed is getting reset after restart of file-integrity-operator because of which existing alert NodeHasIntegrityFailure disappears.

      Version-Release number of selected component (if applicable):

      File Integrity Operator 1.3.4

      How reproducible:

      100%

      Steps to Reproduce:

      1. Install File Integrity Operator and check "Enable Operator Recommended Monitoring" checkbox.
      2. Create FileIntegrity custom resource and wait for the pods to initialize fine.
      3. Simulate a failure condition and wait for NodeHasIntegrityFailure alert to stream.
      4. Restart file-integrity-operator pod.
      5. After couple of minutes the alert NodeHasIntegrityFailure will clear itself but the "oc get fileintegritynodestatuses" reports failure for the node on which failure condition was simulated.

      Actual results:

      The metric is getting reset after operator pod restart.

      Expected results:

      The operator should check the status of existing fileintegritynodestatuses and trigger NodeHasIntegrityFailure alert accordingly.

      Additional info:

          

              wenshen@redhat.com Vincent Shen
              rhn-support-dgautam Dhruv Gautam
              Xiaojie Yuan Xiaojie Yuan
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Created:
                Updated: