Enhancement
Resolution: Done
Major
WHAT
Modify existing alerts to ignore Kafka instances where the bf2.org/suspended=true label has been set.
WHY
Suspended instances are expected to be offline/unavailable and should not trigger alerts or notifications.
HOW
kas-fleetshard will expose a new metric, kafka_instance_suspended (MGDSTRM-9789), that conveys whether an instance is suspended. Use this metric in the queries underlying the following alerts to suppress each alert while the instance is suspended:
- ZookeeperContainersDown
- KafkaBrokerContainersDown
- CruiseControlContainerDown
- CanaryNotActive
The recording rules underlying the SLOs must exclude instances that are suspended.
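One way the suppression could be expressed is sketched below. This is only an illustration under stated assumptions: the ticket gives the metric name kafka_instance_suspended, but the label used to match series (`namespace` here) and the convention that a value of 1 means "suspended" are guesses, and `suppress_when_suspended` and `filter_firing` are hypothetical helpers, not part of any existing tooling. The sketch relies on PromQL's `unless` operator, which keeps left-hand series that have no matching series on the right-hand side.

```python
# Hypothetical sketch. Assumptions (not confirmed by the ticket):
#   - kafka_instance_suspended reports 1 for a suspended instance
#   - alert series and the metric share a "namespace" label
SUSPENDED_METRIC = "kafka_instance_suspended"  # metric name from the ticket


def suppress_when_suspended(base_expr, match_label="namespace"):
    """Wrap a PromQL expression so series for suspended instances are dropped.

    `unless on(<label>)` removes every left-hand series whose label value
    also appears on the right-hand side (the suspended instances).
    """
    return (
        f"({base_expr}) "
        f"unless on({match_label}) "
        f"({SUSPENDED_METRIC} == 1)"
    )


def filter_firing(firing, suspended):
    """Plain-Python model of the same rule, handy for reasoning about tests.

    firing: set of instance identifiers whose alert condition is true.
    suspended: dict mapping instance identifier to the metric value.
    Instances with no metric sample are treated as not suspended.
    """
    return {inst for inst in firing if suspended.get(inst, 0) != 1}
```

The same wrapper could in principle be applied to the expressions behind the SLO recording rules, provided those series carry the matching label. Treating a missing sample as "not suspended" is a deliberate choice in this sketch so that an absent metric never silences an alert.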
DONE
Include the following where applicable:
- alerts updated
- supporting unit tests added
depends on
- MGDSTRM-9789 Suspend and resume a Kafka instance based on presence of label (Closed)