Story
Resolution: Unresolved
User Story
As a user, I would like to be able to clearly tell from the mapi_instance_create_failed metric series when instances are failing to create. If the labels on this metric also included the machineset name (when available), diagnosing groupings of failures would be easier.
Background
When analyzing metric output, it can be very difficult to isolate the source of the failures reported by this metric. Adding the machineset name as a label, left empty for machines that do not belong to a machineset, will make the mapi_instance_create_failed time series easier to interpret.
Steps
- add "machineset" as a label to the mapi_instance_create_failed metric, populated with the machineset name when available and left empty otherwise.
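The step above can be illustrated with a small, language-neutral sketch. This is not the operator's code (the real change would go in the metric series code mentioned under Definition of Done); it only models the labeling behavior using the Python standard library. The existing label names (name, namespace, reason), the helper functions, and the sample machine names are assumptions made for illustration. Only the "machineset" label and its empty value for standalone machines come from this story.

```python
from collections import Counter

# Assumed label set for the sketch; "machineset" is the label this story adds.
LABEL_NAMES = ("name", "namespace", "reason", "machineset")

# Stand-in for the mapi_instance_create_failed counter series,
# keyed by a tuple of label values in LABEL_NAMES order.
create_failed = Counter()


def record_create_failure(name, namespace, reason, machineset=None):
    """Increment the series for one failed instance creation.

    Machines that do not belong to a machineset get an empty label value.
    """
    labels = (name, namespace, reason, machineset or "")
    create_failed[labels] += 1


def failures_by_machineset(series):
    """Sum failures per machineset, similar to grouping a query by that label."""
    totals = Counter()
    idx = LABEL_NAMES.index("machineset")
    for labels, count in series.items():
        totals[labels[idx]] += count
    return dict(totals)


# Hypothetical sample data: two failures in one machineset, one standalone machine.
record_create_failure("worker-a-1", "openshift-machine-api", "InvalidConfiguration", "worker-a")
record_create_failure("worker-a-2", "openshift-machine-api", "InvalidConfiguration", "worker-a")
record_create_failure("standalone-1", "openshift-machine-api", "InsufficientResources")

print(failures_by_machineset(create_failed))
```

Grouping by the new label collects the two worker-a failures under one key and leaves the standalone machine under the empty key, which is the kind of grouping the Background section asks for.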
Stakeholders
- openshift engineering
Definition of Done
- update metric series code
- Docs
- update metrics doc in mao repo
- Testing
- we do not currently have an e2e test for creation failures
Blocks
- OCPCLOUD-1614 Maintainability: Add an alert for when mapi_instance_create_failed is high for a long period of time (To Do)