  1. OpenShift Request For Enhancement
  2. RFE-4609

[cee.Next] Add PrometheusRule for node-exporter reporting down


      1. Proposed title of this feature request

      Add a PrometheusRule that raises an alert when node-exporter is not reachable

      2. What is the nature and description of the request?

      RHOCP 3.11 included an alert named NodeExporterDown, which made sure that node metrics were being scraped. This request asks for an equivalent alert in RHOCP 4.

      3. Why does the customer need this? (List the business requirements here)

      The customer uses "oc adm top nodes" or the web console to view metrics. Sometimes these metrics are unavailable, and the customer has no indication of why they disappeared.

      Also, there have been instances where new nodes added to the cluster do not report node metrics. This alert will help make sure that metrics are being scraped.

      4. List any affected packages or components.

      RHOCP 4

      Additional Info:

      The PrometheusRule below should do the job. It can be added to the PrometheusRule named "node-exporter-rules":

       

          - alert: NodeExporterDown
            annotations:
              message: NodeExporter has disappeared from Prometheus target discovery.
            expr: |
              absent(up{job="node-exporter"} == 1)
            for: 15m
            labels:
              severity: critical
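
For context, here is a minimal sketch of how the rule above might look inside a complete PrometheusRule object. The object name, group name, and namespace are assumptions made for illustration and are not taken from this request. The second alert, NodeExporterTargetDown, is a hypothetical addition: absent(up{job="node-exporter"} == 1) only fires when no node-exporter target at all is up, so a per-target expression may be closer to the "new node does not report metrics" case described above.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-exporter-down-example      # hypothetical object name
  namespace: openshift-monitoring       # assumption: namespace of the monitoring stack
spec:
  groups:
    - name: node-exporter-down.rules    # hypothetical group name
      rules:
        # Rule as proposed in this request: fires only when no
        # node-exporter target reports up == 1.
        - alert: NodeExporterDown
          annotations:
            message: NodeExporter has disappeared from Prometheus target discovery.
          expr: |
            absent(up{job="node-exporter"} == 1)
          for: 15m
          labels:
            severity: critical
        # Hypothetical per-target variant (not part of the original request):
        # fires for each discovered node-exporter target whose scrape fails.
        - alert: NodeExporterTargetDown
          annotations:
            message: "node-exporter on {{ $labels.instance }} has not been scraped successfully for 15 minutes."
          expr: |
            up{job="node-exporter"} == 0
          for: 15m
          labels:
            severity: warning
```

Note that the per-target variant only covers targets Prometheus has already discovered. A node whose node-exporter never appears in target discovery has no "up" series, so it would not be caught by either expression.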
      

       

            rh-ee-rfloren Roger Florén
            rhn-support-dgautam Dhruv Gautam