OpenShift Monitoring / MON-1922

Revisit ThanosRuleRuleEvaluationLatencyHigh alert

    • Type: Task
    • Resolution: Obsolete
    • Observatorium

      As discussed here we silenced the ThanosRuleRuleEvaluationLatencyHigh alert.

      As part of the work to allow tenants to define their own recording rules, we should revisit the alert, keeping the following in mind:

      • Users can now configure recording rules, which lets them define rules of any length or duration.
      • Rules can take a long time to evaluate, which would make this alert fire more often once it is no longer silenced.
      • How do we manage alerts in general, in this context?
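
      To illustrate the concern, here is a minimal sketch of how one slow tenant rule group could trip a latency alert of this kind. It assumes the alert fires when the summed last evaluation duration of the rule groups on a ruler instance exceeds their summed evaluation interval; that condition, the RuleGroup type, and all of the numbers are assumptions for illustration and are not taken from this issue.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RuleGroup:
    """Hypothetical stand-in for one rule group's evaluation metrics."""
    name: str
    interval_seconds: float       # how often the group is scheduled to run
    last_duration_seconds: float  # how long the most recent evaluation took


def latency_high(groups: list[RuleGroup]) -> bool:
    """Assumed alert condition: total evaluation time exceeds total interval."""
    total_duration = sum(g.last_duration_seconds for g in groups)
    total_interval = sum(g.interval_seconds for g in groups)
    return total_duration > total_interval


# Illustrative values only: one fast platform group, then one slow tenant group.
platform = [RuleGroup("platform", 60.0, 5.0)]
with_tenant = platform + [RuleGroup("tenant-heavy", 60.0, 130.0)]

print(latency_high(platform))     # False
print(latency_high(with_tenant))  # True
```

      Under that assumed condition, a single tenant-defined group that is slow enough pushes the whole instance over the threshold, so the alert's value would depend on whether tenant rules are evaluated and alerted on separately from platform rules.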

      Link to the alert silence: https://alertmanager.telemeter-prod-01.devshift.net/#/silences/912d846d-4d04-4354-a59e-47287ecb564b
      The alert silence was extended until 01-05-2022 because we have not gotten to this issue yet. New alert silence link: https://alertmanager.telemeter-prod-01.devshift.net/#/silences/c81dd6a6-5bd3-416f-855b-e8b74ff0757f

      Acceptance criteria

      • A decision is made on whether to keep the alert. If we keep it, we should stop silencing it and update any relevant links (e.g. runbooks, dashboards).

              jlins@redhat.com Jéssica Lins (Inactive)