OpenShift Virtualization / CNV-74336

Add an alerts_effective_active_at_timestamp_seconds metric


      Create a metric that reports all pending, firing, and silenced alerts after relabeling.
      Query Thanos for the alerts and, if it is available, also query Alertmanager for the silences.
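
      One way to derive the alert status from both sources is sketched below. It assumes the silences come from Alertmanager's GET /api/v2/silences endpoint, whose entries carry a status.state field and a list of matchers (name, value, isRegex, isEqual). The function names and the choice to report any matched alert as "silenced" regardless of its pending/firing state are assumptions for illustration, not part of this issue.

```python
import re


def matcher_matches(matcher: dict, labels: dict) -> bool:
    """Return True if a single silence matcher accepts the alert's labels."""
    # A label that is absent on the alert is compared as the empty string.
    actual = labels.get(matcher["name"], "")
    if matcher.get("isRegex", False):
        # Alertmanager anchors regex matchers, hence fullmatch.
        # Caveat: Alertmanager uses Go (RE2) syntax, which differs from
        # Python's re module in some corner cases.
        hit = re.fullmatch(matcher["value"], actual) is not None
    else:
        hit = actual == matcher["value"]
    # isEqual=False negates the matcher; assume True when the field is absent.
    return hit if matcher.get("isEqual", True) else not hit


def is_silenced(labels: dict, silences: list) -> bool:
    """Return True if any active silence matches on all of its matchers."""
    for silence in silences:
        if silence.get("status", {}).get("state") != "active":
            continue
        matchers = silence.get("matchers", [])
        if matchers and all(matcher_matches(m, labels) for m in matchers):
            return True
    return False


def alert_status(state: str, labels: dict, silences: list) -> str:
    """Map a rules-API alert state plus silences to an alertstatus value."""
    # Assumption: a silence takes precedence over "pending" and "firing".
    if is_silenced(labels, silences):
        return "silenced"
    return state
```

      When Alertmanager is not reachable, passing an empty silences list leaves every alert with the state reported by Thanos.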

      Report the alert start time as the metric value.
      The name can be alerts_effective_active_at_timestamp_seconds.
      Add a rule_id fingerprint label holding the alert hash, so that the metric can be correlated with the alert returned by the API call.

      Example:
      alerts_effective_active_at_timestamp_seconds{alertname=.., alertstatus=.., namespace=.., node=.., group=.., component=.., rule_id=..}
      Additional labels, such as region and custom user cluster labels, could also be included.
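
      As a rough illustration of the metric shape, the sketch below turns one alert into a sample line in the Prometheus text exposition format. It assumes the start time arrives as an RFC 3339 string (the activeAt field of the rules API); the helper names active_at_seconds and render_sample are hypothetical.

```python
import re
from datetime import datetime

METRIC_NAME = "alerts_effective_active_at_timestamp_seconds"

# Date-time, optional fractional seconds, then "Z" or a numeric UTC offset.
_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(\.\d+)?(Z|[+-]\d{2}:\d{2})$"
)


def active_at_seconds(active_at: str) -> float:
    """Convert an RFC 3339 timestamp to Unix seconds."""
    match = _RFC3339.match(active_at)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {active_at!r}")
    base, fraction, offset = match.groups()
    offset = "+00:00" if offset == "Z" else offset
    # The fraction is added separately because it may have nanosecond
    # precision, which older datetime.fromisoformat versions reject.
    whole = datetime.fromisoformat(base + offset).timestamp()
    return whole + float(fraction or 0)


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline for the text format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(labels: dict, active_at: str) -> str:
    """Render one sample line; labels are sorted for stable output."""
    pairs = ",".join(
        f'{name}="{escape_label_value(value)}"'
        for name, value in sorted(labels.items())
    )
    return f"{METRIC_NAME}{{{pairs}}} {active_at_seconds(active_at)}"
```

      A real exporter would more likely register a gauge through a Prometheus client library; the string rendering here is only meant to show the label set and the timestamp value together.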

      The Thanos/Prometheus rules API (GET <thanos-or-prometheus>/api/v1/rules) returns a response that nests alert instances under their parent alerting rule inside groups.
      Use this response to populate the rule_id label in the metric.
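
      The nesting described above can be walked roughly as follows. This is a minimal sketch: the response layout (data.groups[].rules[].alerts[]) follows the Prometheus rules API, but the rule_id derivation (a truncated SHA-256 over group name, rule name, query and rule labels) is only an assumption, since this issue does not define how the hash is computed.

```python
import hashlib
import json


def rule_id(group_name: str, rule: dict) -> str:
    """Hypothetical fingerprint of an alerting rule (not a defined format)."""
    payload = json.dumps(
        {
            "group": group_name,
            "name": rule.get("name", ""),
            "query": rule.get("query", ""),
            "labels": rule.get("labels", {}),
        },
        sort_keys=True,  # stable key order gives a stable hash
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def iter_alerts(rules_response: dict):
    """Yield (rule_id, alert) for every alert instance in the response."""
    for group in rules_response.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            # Recording rules carry no alert instances; skip them.
            if rule.get("type") != "alerting":
                continue
            rid = rule_id(group.get("name", ""), rule)
            for alert in rule.get("alerts", []):
                yield rid, alert
```

      Each yielded alert carries its own labels, state and activeAt fields, so the same rule_id is shared by all instances of one rule.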

              kmajcher@redhat.com Krzysztof Majcher
              sradco Shirly Radco