-
Task
-
Resolution: Obsolete
As discussed here, we silenced the ThanosRuleRuleEvaluationLatencyHigh alert.
As part of the work to let tenants define their own recording rules, we should revisit the alert, keeping the following in mind:
- Users can now configure their own recording rules, so rules of any length/duration may be defined.
- Such rules can take a long time to evaluate, so the alert is likely to fire more often once it is no longer silenced.
- How do we manage alerts in general in this context?
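To make the concern above concrete, here is a minimal sketch of the kind of check such an alert performs. It assumes the alert fires when a rule group's last evaluation takes longer than its configured interval; this ticket does not state the alert expression, so treat that condition as an assumption. The `RuleGroup` type, the group names, and the numbers are hypothetical and for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RuleGroup:
    """Hypothetical snapshot of one rule group's timing (illustration only)."""
    name: str
    interval_seconds: float        # how often the group is scheduled to run
    last_duration_seconds: float   # how long the last evaluation took


def groups_exceeding_interval(groups):
    """Return names of groups whose last evaluation outlasted their interval.

    Assumption: this mirrors the condition the alert is based on.
    """
    return [g.name for g in groups if g.last_duration_seconds > g.interval_seconds]


# Made-up example data: one cheap group and one expensive tenant-defined group.
groups = [
    RuleGroup("platform-rules", interval_seconds=60.0, last_duration_seconds=4.2),
    RuleGroup("tenant-a-heavy-rules", interval_seconds=30.0, last_duration_seconds=95.0),
]

slow = groups_exceeding_interval(groups)
print(slow)
```

Under this assumption, a single expensive tenant-defined group is enough to trip the condition, which is why the decision on keeping the alert may also need to consider who should be notified for tenant-owned rules.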
Link to the alert silence: https://alertmanager.telemeter-prod-01.devshift.net/#/silences/912d846d-4d04-4354-a59e-47287ecb564b
The alert silence was extended until 01-05-2022 because we have not gotten to this issue yet. New alert silence link: https://alertmanager.telemeter-prod-01.devshift.net/#/silences/c81dd6a6-5bd3-416f-855b-e8b74ff0757f
A/C
- A decision is made on whether to keep the alert. If we keep it, we stop silencing it and update any relevant links (e.g. runbooks/dashboards).