OpenShift Monitoring / MON-1922

Revisit ThanosRuleRuleEvaluationLatencyHigh alert


    • Type: Task
    • Resolution: Obsolete
    • Priority: Undefined
    • Observatorium

      As discussed here, we silenced the ThanosRuleRuleEvaluationLatencyHigh alert.

      Now that we are allowing tenants to define their own recording rules, we should revisit the alert, keeping in mind that:

      • Users can now configure their own recording rules, of any length/duration.
      • Rules can take a long time to evaluate, so this alert would fire more often once it is no longer silenced.
      • How should we manage alerts in general in this context?

      Link to the alert silence: https://alertmanager.telemeter-prod-01.devshift.net/#/silences/912d846d-4d04-4354-a59e-47287ecb564b
      The alert silence was extended until 01-05-2022 because we have not gotten to this issue yet. New alert silence link: https://alertmanager.telemeter-prod-01.devshift.net/#/silences/c81dd6a6-5bd3-416f-855b-e8b74ff0757f

      A/C

      • A decision is made on whether to keep the alert. If we keep it, we should stop silencing it and update any relevant links (e.g. runbooks/dashboards).

            jlins@redhat.com Jéssica Lins (Inactive)
            jlins@redhat.com Jéssica Lins (Inactive)