Story
Resolution: Unresolved
Major
Create a metric that reports all pending, firing, and silenced alerts after relabeling.
Query Thanos for the alerts and, if available, Alertmanager for the silences.
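A minimal sketch of how a silenced status could be derived once the silences are fetched. It assumes the silences come from the Alertmanager v2 API (GET /api/v2/silences) as a JSON list in which each entry has a status.state and a list of matchers with name, value, isRegex, and isEqual. The function names are hypothetical, and Python's re module only approximates the RE2 syntax Alertmanager uses.

```python
import re


def matcher_matches(matcher, labels):
    """Return True if a single silence matcher accepts the alert labels."""
    # A label that is absent is compared as the empty string.
    value = labels.get(matcher["name"], "")
    if matcher.get("isRegex", False):
        # Regex matchers are treated as fully anchored.
        hit = re.fullmatch(matcher["value"], value) is not None
    else:
        hit = value == matcher["value"]
    # isEqual=False negates the matcher; it is assumed to default to True.
    return hit if matcher.get("isEqual", True) else not hit


def is_silenced(labels, silences):
    """Return True if any active silence matches every one of its matchers."""
    for silence in silences:
        if silence.get("status", {}).get("state") != "active":
            continue  # skip expired and pending silences
        matchers = silence.get("matchers", [])
        if matchers and all(matcher_matches(m, labels) for m in matchers):
            return True
    return False
```

An alert for which is_silenced returns True would be reported with alertstatus set to silenced instead of its pending or firing state.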
Report the alert start time as the metric value, as a Unix timestamp in seconds.
Suggested name: alerts_effective_active_at_timestamp_seconds.
Add a rule_id label that carries the alert hash (fingerprint), so that the metric can be correlated with the alert returned by the API call.
Example:
alerts_effective_active_at_timestamp_seconds{alertname=.., alertstatus=.., namespace=.., node=.., group=.., component=.., rule_id=..}
Additional labels, such as region and custom user-defined cluster labels, could also be included.
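To make the example concrete, here is a small sketch that renders one sample of the proposed metric in the Prometheus text exposition format. The helper names are hypothetical and the label values in the usage note are made up for illustration. A real exporter would more likely use a Prometheus client library than build strings by hand.

```python
METRIC_NAME = "alerts_effective_active_at_timestamp_seconds"


def escape_label_value(value):
    """Escape backslash, newline, and double quote for the text format."""
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace('"', '\\"')


def render_sample(labels, active_at_seconds):
    """Render a single sample line; labels are sorted for stable output."""
    pairs = ",".join(
        f'{name}="{escape_label_value(value)}"'
        for name, value in sorted(labels.items())
    )
    return f"{METRIC_NAME}{{{pairs}}} {active_at_seconds}"
```

For instance, render_sample({"alertname": "X", "alertstatus": "firing"}, 1700000000) produces a line of the form alerts_effective_active_at_timestamp_seconds{alertname="X",alertstatus="firing"} 1700000000.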
When we query the Thanos/Prometheus rules API (GET <thanos-or-prometheus>/api/v1/rules), the response nests alert instances under their parent alerting rule, inside rule groups.
Use this response to populate rule_id in the metric.
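The nesting described above can be flattened roughly as follows. This is a sketch under stated assumptions: the response has the usual data.groups[].rules[].alerts[] shape with a state and an RFC 3339 activeAt per alert, and rule_id is computed by a hypothetical hash over the group file, group name, rule name, and query. The story does not specify the hashing scheme, so treat that part as a placeholder.

```python
import hashlib
import re
from datetime import datetime


def parse_rfc3339(ts):
    """Convert an RFC 3339 timestamp to Unix seconds, dropping fractions."""
    ts = re.sub(r"\.\d+", "", ts)  # strip fractional seconds (may be nanoseconds)
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"  # older fromisoformat versions reject "Z"
    return datetime.fromisoformat(ts).timestamp()


def rule_id(group, rule):
    """Hypothetical fingerprint of a rule; the real scheme is not specified."""
    key = "\x00".join([
        group.get("file", ""),
        group.get("name", ""),
        rule.get("name", ""),
        rule.get("query", ""),
    ])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def flatten_alerts(rules_response):
    """Yield (labels, active_at_seconds) for every alert under alerting rules."""
    samples = []
    for group in rules_response.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            if rule.get("type") != "alerting":
                continue  # recording rules carry no alert instances
            rid = rule_id(group, rule)
            for alert in rule.get("alerts", []):
                labels = dict(alert.get("labels", {}))
                labels["alertstatus"] = alert.get("state", "")
                labels["rule_id"] = rid
                samples.append((labels, parse_rfc3339(alert["activeAt"])))
    return samples
```

Because rule_id is derived once per rule, every alert instance of the same rule carries the same value. The state taken from this response only distinguishes pending from firing, so silenced would have to be overlaid from the Alertmanager data.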
Is cloned by: CNV-76347 Add design document for alerts_effective_active_at_timestamp_seconds metric (In Progress)