-
Bug
-
Resolution: Done
-
Critical
-
RHODS_1.10.0_GA
-
1
-
False
-
None
-
False
-
Yes
-
-
-
-
-
-
1.12.0-4
-
No
-
No
-
Yes
-
None
-
RHODS 1.12
-
Medium
Description of problem:
The RHODS Prometheus alerts that page MT-SRE through PagerDuty are calculated for an SLO of 98%, because that was the SLO defined for RHODS Field Trials:
- SLOs-haproxy_backend_http_responses_total
- SLOs-probe_success
The Service Level Objective for RHODS LA is 99.95%, so I think the thresholds in these alerts should be updated to match.
Note: I think these rules were generated with this online tool: https://promtools.dev/alerts/errors
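To illustrate how much the thresholds would change, here is a minimal sketch that compares the error budget and burn-rate thresholds for the two SLO values. It assumes the rules follow the common multiwindow, multi-burn-rate pattern (burn-rate factors 14.4, 6, 3 and 1); that pattern and the names `BURN_RATES`, `error_budget` and `burn_rate_thresholds` are assumptions for illustration, not taken from the deployed rules.

```python
# Assumed burn-rate factors per (long window / short window) pair.
# These come from the common multiwindow, multi-burn-rate pattern and
# are NOT confirmed against the deployed RHODS rules.
BURN_RATES = {"1h/5m": 14.4, "6h/30m": 6.0, "1d/2h": 3.0, "3d/6h": 1.0}


def error_budget(slo_percent: float) -> float:
    """Fraction of requests (or probes) allowed to fail under the SLO."""
    return 1.0 - slo_percent / 100.0


def burn_rate_thresholds(slo_percent: float) -> dict:
    """Error-ratio threshold per window pair: burn rate * error budget."""
    budget = error_budget(slo_percent)
    return {window: round(rate * budget, 6) for window, rate in BURN_RATES.items()}


if __name__ == "__main__":
    for slo in (98.0, 99.95):
        print(f"SLO {slo}%: {burn_rate_thresholds(slo)}")
```

Under these assumptions the fastest-burn threshold drops from 0.288 (98%) to 0.0072 (99.95%), so rules built for 98% would only fire at error ratios 40 times higher than the 99.95% SLO tolerates.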
Prerequisites (if any, like setup, operators/versions):
Steps to Reproduce
- Log in to RHODS Prometheus
- Go to Status > Rules
- Verify the expression for SLOs-haproxy_backend_http_responses_total and SLOs-probe_success
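The same check could be scripted instead of done in the UI. The sketch below filters a response from the Prometheus HTTP API endpoint `/api/v1/rules` and returns the expressions of the two SLO groups. It assumes the names above are rule group names, and the sample payload, including its expressions, is hypothetical and only mirrors the response shape; fetching the payload (URL, authentication) is left out because it depends on the cluster.

```python
SLO_GROUPS = (
    "SLOs-haproxy_backend_http_responses_total",
    "SLOs-probe_success",
)


def slo_rule_expressions(rules_payload: dict, group_names=SLO_GROUPS) -> dict:
    """Map each wanted rule group name to the list of its rule expressions.

    `rules_payload` is the decoded JSON body of GET /api/v1/rules.
    """
    wanted = set(group_names)
    found = {}
    for group in rules_payload.get("data", {}).get("groups", []):
        if group.get("name") in wanted:
            found[group["name"]] = [
                rule.get("query", "") for rule in group.get("rules", [])
            ]
    return found


if __name__ == "__main__":
    # Hypothetical payload: the expressions are placeholders, not the real rules.
    sample = {
        "status": "success",
        "data": {
            "groups": [
                {
                    "name": "SLOs-probe_success",
                    "rules": [
                        {"name": "example alert", "query": "placeholder_expr > (14.4 * (1 - 0.98))"}
                    ],
                },
                {"name": "unrelated-group", "rules": [{"name": "x", "query": "up == 0"}]},
            ]
        },
    }
    for group, expressions in slo_rule_expressions(sample).items():
        for expr in expressions:
            print(group, "->", expr)
```

A threshold built from `0.98` (or from `0.02`) in the printed expressions would indicate that the rule still targets the Field Trials SLO.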
Build Details:
RHODS 1.10.0
Live Build:
quay.io/lferrnan/rhods-operator-live-catalog:1.11.1-prometheus-slo
PR:
https://github.com/red-hat-data-services/odh-deployer/pull/232