-
Enhancement
-
Resolution: Done
-
Major
-
MGDSRVS-48 - Be able to sustain an external paying customer in production
-
MK - Sprint 221
What
The BrokerState metric should be exposed to Prometheus and added to a dashboard so that the support team can understand the state of the broker.
Why
The broker state reveals the current internal state of the broker, which is important for understanding the state of the service. It is critical information for the SRE when diagnosing problems with the service. The possible states are:
- NOT_RUNNING((byte) 0): The state the broker is in when it first starts up.
- STARTING((byte) 1): The state the broker is in when it is catching up with cluster metadata.
- RECOVERY((byte) 2): The broker has caught up with cluster metadata, but has not yet been unfenced by the controller.
- RUNNING((byte) 3): The state the broker is in when it has registered at least once, and is accepting client requests.
- PENDING_CONTROLLED_SHUTDOWN((byte) 6): The state the broker is in when it is attempting to perform a controlled shutdown.
- SHUTTING_DOWN((byte) 7): The state the broker is in when it is shutting down.
- UNKNOWN((byte) 127): The broker is in an unknown state.
BrokerState is currently exposed via the labels on the metric, as these are far more readable. However, that means we can't use the metric to track the time spent in each state. Exposing the state as the metric value as well will let us see how long Kafka brokers spend in each state.
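As a rough illustration of why a numeric value helps, the sketch below totals the time spent in each state from a series of scraped samples. It assumes each sample's state holds until the next sample; the class name, method names, and sample data are hypothetical and not taken from the broker or from this issue. Only the state names and byte codes come from the list above.

```java
import java.util.EnumMap;
import java.util.Map;

public class BrokerStateDurations {

    /** Mirrors the states and byte codes listed in this issue. */
    enum BrokerState {
        NOT_RUNNING(0), STARTING(1), RECOVERY(2), RUNNING(3),
        PENDING_CONTROLLED_SHUTDOWN(6), SHUTTING_DOWN(7), UNKNOWN(127);

        final int code;

        BrokerState(int code) { this.code = code; }

        /** Maps a scraped metric value back to a state; unrecognised codes become UNKNOWN. */
        static BrokerState fromCode(int code) {
            for (BrokerState s : values()) {
                if (s.code == code) return s;
            }
            return UNKNOWN;
        }
    }

    /**
     * timestamps[i] is the sample time in seconds and codes[i] the metric value at that time.
     * Assumption: a sampled state holds until the next sample, so the last sample adds no time.
     */
    static Map<BrokerState, Long> secondsPerState(long[] timestamps, int[] codes) {
        Map<BrokerState, Long> totals = new EnumMap<>(BrokerState.class);
        for (int i = 0; i + 1 < timestamps.length; i++) {
            BrokerState state = BrokerState.fromCode(codes[i]);
            long elapsed = timestamps[i + 1] - timestamps[i];
            totals.merge(state, elapsed, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical samples taken every 15 seconds during a broker start-up.
        long[] timestamps = {0, 15, 30, 45, 60};
        int[] codes = {1, 2, 3, 3, 3};
        System.out.println(secondsPerState(timestamps, codes));
        // prints {STARTING=15, RECOVERY=15, RUNNING=30}
    }
}
```

With the state available only as a label next to a fixed value, this kind of aggregation needs label matching per state; with the state as the value, one series is enough.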
How
- Currently BrokerState is exposed via the labels with a fixed value. Expose it as the value of the metric as well.
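A minimal sketch of what the resulting sample could look like in the Prometheus text exposition format, with the state kept as a label and also carried as the value. The metric name `kafka_broker_state` and the label name `state` are assumptions for illustration; the issue does not name the actual metric or its labels.

```java
public class BrokerStateSample {

    /**
     * Renders one sample line in the Prometheus text exposition format:
     * metric_name{label="value"} numeric_value
     */
    static String render(String metricName, String stateName, int stateCode) {
        return metricName + "{state=\"" + stateName + "\"} " + stateCode;
    }

    public static void main(String[] args) {
        // Hypothetical metric name; RUNNING has byte code 3 per the list in this issue.
        System.out.println(render("kafka_broker_state", "RUNNING", 3));
        // prints kafka_broker_state{state="RUNNING"} 3
    }
}
```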
Done
- The metric is exposed and recorded in Prometheus with the broker state as its value as well as in its labels
- clones MGDSTRM-8172 Expose BrokerState from broker (Closed)