-
Bug
-
Resolution: Duplicate
-
Undefined
-
None
-
4.13.0
-
None
-
Important
-
No
-
False
-
Problem:
NodeHasIntegrityFailure not firing due to the file-integrity-operator not exposing the FileIntegrity-related metrics to Prometheus
~~~
$ oc get prometheusrule file-integrity -o yaml -n openshift-file-integrity
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
creationTimestamp: "2024-04-05T02:38:22Z"
generation: 1
name: file-integrity
namespace: openshift-file-integrity
resourceVersion: "44540"
uid: 4acc4710-3e4d-40b1-b3f7-336d200eab32
spec:
groups:
- name: node-failed
rules:
alert: NodeHasIntegrityFailure
annotations:
description: Node {{ $labels.node }} has an integrity check status of Failed
for more than 1 second.
summary: Node {{ $labels.node }} has a file integrity failure
expr: file_integrity_operator_node_failed {node=~".+"}
on(node) kube_node_info
>0
for: 1s
labels:
namespace: openshift-file-integrity
severity: warning
~~~~
Fix to report the ` NodeHasIntegrityFailure` alert:
Add `cluster monitoring` to `fileintegrity` namespace so that FileIntegrity-related metrics is exposed to Prometheus.
~~~
$ oc label namespace openshift-file-integrity
openshift.io/cluster-monitoring=true
namespace/openshift-file-integrity labeled
~~~
By default, the cluster monitoring should be added to the `file integrity` namespace so that FileIntegrity-related metricsĀ is exposed to Prometheus.
Documentation Link :
https://github.com/openshift/file-integrity-operator
- duplicates
-
OCPBUGS-42807 Metric file_integrity_operator_node_failed of file integrity operator reset after pod restart
- MODIFIED