-
Story
-
Resolution: Duplicate
-
Normal
-
Quality / Stability / Reliability
-
5
The severity of a Prometheus alert not only describes how important the alert is but also determines whether the on-call engineers are paged when the alert fires.
The following severities trigger the on-call process: warning, high, and critical. They may only be used for alerts that are properly tested in app-interface and have an associated runbook.
The vast majority of our alerts have an info or medium severity, although some of them should trigger the on-call process. This ticket determines which alerts should have their severity increased.
Acceptance criteria:
- The severity of each existing alert is reviewed.
- All alerts that need a higher severity are listed in a spreadsheet, with details on whether each alert is tested and whether a runbook exists.
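The review rule described above could be sketched roughly as follows. This is only an illustration, not existing tooling: the `Alert` fields, the function names, and the example alert name are all hypothetical, and the only facts taken from this ticket are the three paging severities and the requirement that paging alerts be tested and have a runbook.

```python
from dataclasses import dataclass

# Severities that trigger the on-call process, as stated in this ticket.
PAGING_SEVERITIES = {"warning", "high", "critical"}


@dataclass
class Alert:
    """Hypothetical record for one alert under review."""
    name: str
    severity: str      # current severity label, e.g. "info" or "medium"
    tested: bool       # assumption: True if the alert has tests in app-interface
    has_runbook: bool  # assumption: True if a runbook is linked to the alert


def pages_on_call(alert: Alert) -> bool:
    """Return True if the alert's current severity pages the on-call engineers."""
    return alert.severity in PAGING_SEVERITIES


def review_row(alert: Alert, proposed_severity: str) -> dict:
    """Build one spreadsheet row for an alert whose severity should change.

    'ready' is False when the proposed severity would page on-call but the
    alert is missing tests or a runbook.
    """
    needs_prereqs = proposed_severity in PAGING_SEVERITIES
    ready = (not needs_prereqs) or (alert.tested and alert.has_runbook)
    return {
        "alert": alert.name,
        "current": alert.severity,
        "proposed": proposed_severity,
        "tested": alert.tested,
        "runbook": alert.has_runbook,
        "ready": ready,
    }
```

For example, `review_row(Alert("ExampleHighErrorRate", "medium", True, False), "high")` yields a row with `ready` set to `False`, because the runbook is missing.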