  1. OpenShift Monitoring
  2. MON-2214

[upstream] Investigate and maybe implement an option to send resolved notifications on inhibited alerts


    • Enable resolved notifications on inhibited alerts
    • False
    • False
    • NEW
    • To Do
    • MON-3156 Upstream improvements
    • NEW
    • Sprint 215

      Epic Goal

      • Add an option to send resolved notifications for inhibited alerts to some notification receivers (e.g. PagerDuty)

      Why is this important?

      • Reduce toil for people who manage alerts in external tools that are notified by Alertmanager instances
      • Improve the signal-to-noise ratio and thus make alerts more effective

      Scenarios

      1. The high-level scenario: when an alert fires, a notification is sent to the appropriate notification channel. If the alert is then silenced, or inhibited by another alert, and resolves while in that state, the notification channel is never notified of the resolved state. As a consequence, the alert is still listed as firing in tools like PagerDuty.
      2. From the linked feature request:

      Prom = Prometheus
      AM = Alertmanager
      PD = PagerDuty

        1. Prom: alert1 fires
          1. AM gets alert1, routes to PD
          2. PD receives alert1
        2. Prom: alert2 fires
          1. AM gets alert2
          2. AM suppresses alert1
          3. AM routes alert2 to PD
          4. PD receives alert2
          5. PD now has two alerts
            1. Alert1
            2. Alert2
        3. Prom: alert1 resolves
          1. AM resolves alert1
          2. PD receives no notification as the alert is suppressed
          3. PD: alert1 becomes orphaned
        4. Prom: alert2 resolves
          1. AM resolves alert2
          2. PD receives resolved notification for alert2
          3. PD resolves alert2
          4. PD retains alert1 that is now orphaned
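
The sequence above can be reproduced with a small simulation. This is a minimal sketch, not Alertmanager's actual implementation: the `PagerDuty` and `Alertmanager` classes are toy stand-ins, the inhibition model is reduced to "alert2 inhibits alert1 while it fires", and `send_resolved_when_inhibited` is a hypothetical option name chosen for illustration only. No such option is confirmed upstream.

```python
from dataclasses import dataclass, field


@dataclass
class PagerDuty:
    """Toy stand-in for the external tool; tracks incidents it believes are open."""
    open_incidents: set = field(default_factory=set)

    def notify(self, alert: str, status: str) -> None:
        if status == "firing":
            self.open_incidents.add(alert)
        else:  # "resolved"
            self.open_incidents.discard(alert)


@dataclass
class Alertmanager:
    """Toy model of routing plus inhibition; not the real Alertmanager logic."""
    receiver: PagerDuty
    # Maps a source alert to the set of target alerts it inhibits while firing.
    inhibits: dict
    # Hypothetical option, named for this sketch only. Defaults to the
    # current behaviour (off) to stay backwards compatible.
    send_resolved_when_inhibited: bool = False
    firing: set = field(default_factory=set)

    def _inhibited(self, alert: str) -> bool:
        return any(src in self.firing and alert in targets
                   for src, targets in self.inhibits.items())

    def fire(self, alert: str) -> None:
        self.firing.add(alert)
        if not self._inhibited(alert):
            self.receiver.notify(alert, "firing")

    def resolve(self, alert: str) -> None:
        # Check inhibition before the alert leaves the firing set.
        inhibited = self._inhibited(alert)
        self.firing.discard(alert)
        if not inhibited or self.send_resolved_when_inhibited:
            self.receiver.notify(alert, "resolved")


def run_scenario(send_resolved_when_inhibited: bool) -> set:
    """Replay steps 1-4 and return the incidents PD still considers open."""
    pd = PagerDuty()
    am = Alertmanager(pd, inhibits={"alert2": {"alert1"}},
                      send_resolved_when_inhibited=send_resolved_when_inhibited)
    am.fire("alert1")     # step 1: routed to PD
    am.fire("alert2")     # step 2: routed to PD, alert1 now inhibited
    am.resolve("alert1")  # step 3: resolved while inhibited
    am.resolve("alert2")  # step 4: resolved notification sent
    return pd.open_incidents
```

With the option off, `run_scenario(False)` leaves `alert1` open in the toy PD, matching the orphaned alert in step 4. With the option on, `run_scenario(True)` leaves nothing open.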

      Acceptance Criteria

      • Upstream consensus
      • Backwards compatibility

       

      Previous Work (Optional):

      1. https://github.com/prometheus/alertmanager/issues/226 
      2. https://github.com/prometheus/alertmanager/issues/2754 

              spasquie@redhat.com Simon Pasquier
              jfajersk@redhat.com Jan Fajerski