- Feature Request
- Resolution: Done
- Normal
- openshift-4.13
1. Proposed title of this feature request
Add PrometheusRule to report alert when node-exporter is not reachable
2. What is the nature and description of the request?
RHOCP 3.11 had an alert named NodeExporterDown that made sure node metrics were being scraped. An equivalent alert is requested for RHOCP 4.
3. Why does the customer need this? (List the business requirements here)
The customer uses "oc adm top nodes" or the web console to view metrics. Sometimes these metrics are not available, and the customer has no indication of why they disappeared.
There have also been instances where new nodes added to the cluster do not report node metrics. This alert will help make sure that metrics are being scraped.
4. List any affected packages or components.
RHOCP 4
Additional Info:
The rule below should do the job. It can be added to the PrometheusRule named "node-exporter-rules":
- alert: NodeExporterDown
  annotations:
    message: NodeExporter has disappeared from Prometheus target discovery.
  expr: |
    absent(up{job="node-exporter"} == 1)
  for: 15m
  labels:
    severity: critical
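
For context, the rule might sit inside a complete PrometheusRule object roughly as sketched below. This is only an illustration, not the shipped manifest: the "openshift-monitoring" namespace and the group name "node-exporter-down" are assumptions, and only the object name "node-exporter-rules" and the rule itself come from this request.

```yaml
# Sketch only: the namespace and group name are assumptions, not taken from the request.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-exporter-rules          # object name proposed in this request
  namespace: openshift-monitoring    # assumed namespace
spec:
  groups:
  - name: node-exporter-down         # hypothetical group name
    rules:
    - alert: NodeExporterDown
      annotations:
        message: NodeExporter has disappeared from Prometheus target discovery.
      # absent() returns a result only when the inner selector matches no series,
      # so this fires when no node-exporter target at all reports up == 1.
      expr: |
        absent(up{job="node-exporter"} == 1)
      for: 15m
      labels:
        severity: critical
```

Note that, as written, the expression fires only when every node-exporter target is down or missing. It would not catch the case of a single newly added node failing to report metrics while other nodes are still being scraped.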