Task
Resolution: Done
Major
OTA 224
No alerts fired when OSUS performance degraded on Aug 25, 2022. See APPSRE-6192 for details.
Let's create an alert that would have notified teams of this condition more quickly.
Slack channel: #incident_osus_high_latency_timeout, link: https://coreos.slack.com/archives/C03UQ5U2CP9
RCA document: link
Definition of done:
- Create an alert when policy engine latency is more than 1 second.
- Create an alert when Envoy has more than 100 pending requests.
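
To make the two conditions in the definition of done concrete, here is a minimal Python sketch of the threshold logic. It is not the actual alert rule: the `OsusSample` structure, the `firing_alerts` function, and the alert names are hypothetical and chosen only for illustration. A real alert would be defined in the monitoring system against whatever metrics the policy engine and Envoy actually export.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the definition of done in this ticket.
POLICY_ENGINE_LATENCY_LIMIT_SECONDS = 1.0
ENVOY_PENDING_REQUESTS_LIMIT = 100


@dataclass
class OsusSample:
    """One observation of the two signals (hypothetical structure)."""
    policy_engine_latency_seconds: float
    envoy_pending_requests: int


def firing_alerts(sample: OsusSample) -> List[str]:
    """Return the names of the alerts that would fire for this sample.

    Both comparisons are strict ("more than"), matching the ticket wording.
    The alert names are placeholders, not names used by any real system.
    """
    alerts = []
    if sample.policy_engine_latency_seconds > POLICY_ENGINE_LATENCY_LIMIT_SECONDS:
        alerts.append("PolicyEngineHighLatency")
    if sample.envoy_pending_requests > ENVOY_PENDING_REQUESTS_LIMIT:
        alerts.append("EnvoyPendingRequestsHigh")
    return alerts
```

Because the comparisons are strict, a sample sitting exactly at a threshold (1 second, or 100 pending requests) would not fire. A production rule would probably also require the condition to hold for some duration before firing, which this sketch leaves out.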
- relates to: OTA-1082 Create alert for conditions that caused 2023-11-24 CannotRetrieveUpdate WebRCA-#itn-2023-00159 (Closed)