- Spike
- Resolution: Done
- Major
WHAT
We need to review the metrics that Cruise Control itself can produce and, for each one, work out how useful it would be to the managed service for diagnosing problems, understanding behaviour, and so on. We probably don't want to retain every metric, owing to the storage overhead, and some metrics might not be useful for our use case.
The task is to determine a list of Cruise Control metrics that should be integrated into Managed Kafka. If any metric signals an abnormal condition, consider whether a Prometheus alert and SOP are appropriate.
WHY
Support Cruise Control
HOW
Review list of metrics.
DONE
- List of metrics to be considered for inclusion (this feeds MGDSTRM-8048).
- List of metrics to be considered for service alert(s). Raise task(s) as JIRA.
- blocks MGDSTRM-8048 Implement prometheus scraping for Cruise Control (Closed)