  1. Red Hat OpenShift Data Science
  2. RHODS-3068

Ensure correct mapping of alert severity between SOPs, alerts, and pages


    • IDH Sprint 18, IDH Sprint 1.9

      We got this report from the MT-SRE team regarding the failing image build alerts:

      This alert's severity is marked as "critical" in its basic details, but under the "custom details" section its severity is given as "warning", and in the SOP it is marked as "Minor". We (MTSRE) would love some insight into this inconsistency: is this alert critical or just a warning? And if it is just a warning, should we (MTSRE) get paged for these kinds of "warning" alerts?

      We should audit the alerting rules, the corresponding PagerDuty severity for each alert, and our SOP documentation to ensure that alert severity is aligned across all of these locations.
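
      As a rough illustration of what such an audit could check, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the alert names, the `SeverityRecord` structure, and the table that maps SOP wording such as "Minor" onto the alerting-rule vocabulary are hypothetical, and in practice the three values would be collected from the rule files, PagerDuty, and the SOPs rather than typed in by hand.

```python
from dataclasses import dataclass

# Hypothetical normalization table (an assumption, not an agreed mapping):
# SOP wording such as "Minor" is folded onto the alerting-rule vocabulary.
NORMALIZE = {
    "critical": "critical",
    "warning": "warning",
    "minor": "warning",
    "info": "info",
}


@dataclass(frozen=True)
class SeverityRecord:
    alert: str      # alert name (hypothetical examples below)
    rule: str       # severity label set in the alerting rule
    pagerduty: str  # severity shown on the PagerDuty incident
    sop: str        # severity stated in the SOP


def normalize(value: str) -> str:
    """Map a severity string to a common vocabulary, or 'unknown'."""
    return NORMALIZE.get(value.strip().lower(), "unknown")


def find_mismatches(records):
    """Return the names of alerts whose three severities do not agree."""
    mismatches = []
    for r in records:
        levels = {normalize(r.rule), normalize(r.pagerduty), normalize(r.sop)}
        # More than one distinct level, or an unrecognized value, needs review.
        if len(levels) > 1 or "unknown" in levels:
            mismatches.append(r.alert)
    return mismatches


# Illustrative data only: the first record mirrors the reported situation
# (critical in PagerDuty, warning in the custom details, Minor in the SOP).
records = [
    SeverityRecord("ExampleImageBuildFailing", rule="warning",
                   pagerduty="critical", sop="Minor"),
    SeverityRecord("ExampleAlignedAlert", rule="critical",
                   pagerduty="critical", sop="Critical"),
]

print(find_mismatches(records))
```

      Treating unrecognized values as mismatches is a deliberate choice in this sketch, so that a new or misspelled severity is flagged for review instead of passing silently.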

       

      [UPDATE] 

       

      We’ve updated all the alerts that go to the MTSRE team to critical and left the severity of the remaining alerts unchanged.

       

      PRs

       

              lferrnan@redhat.com Lucas Fernandez Aragon
              acorvin@redhat.com Alex Corvin
              Jorge Garcia Oncins Jorge Garcia Oncins
