Story
Resolution: Done
Major
MCO sends an alert when the Kubelet health failure count exceeds two (the rule's expression is mcd_kubelet_state > 2).
The alert is defined as follows:
- alert: KubeletHealthState
  expr: |
    mcd_kubelet_state > 2
  labels:
    namespace: openshift-machine-config-operator
    severity: warning
  annotations:
    summary: "This keeps track of Kubelet health failures, and tallies them. The warning is triggered if 2 or more failures occur."
    description: "Kubelet health failure threshold reached"
An admin may not be able to determine the exact action to take from the alert and the cluster state alone. Adding a runbook (https://github.com/openshift/runbooks) can help the admin troubleshoot and take appropriate action.
Acceptance Criteria:
- Runbook doc is created for KubeletHealthState alert
- The created runbook link is accessible to the cluster admin from the KubeletHealthState alert
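One way the second criterion might be met is by attaching the runbook link to the rule as an annotation. The sketch below assumes the `runbook_url` annotation key that other OpenShift alerts use; the URL path is a hypothetical placeholder, since this ticket does not state where the runbook will live.

```yaml
- alert: KubeletHealthState
  expr: |
    mcd_kubelet_state > 2
  labels:
    namespace: openshift-machine-config-operator
    severity: warning
  annotations:
    summary: "This keeps track of Kubelet health failures, and tallies them. The warning is triggered if 2 or more failures occur."
    description: "Kubelet health failure threshold reached"
    # Assumption: annotation key and path are illustrative, not taken from this ticket.
    runbook_url: https://github.com/openshift/runbooks/blob/master/alerts/machine-config-operator/KubeletHealthState.md
```

With an annotation like this in place, the link would travel with the alert, so an admin could reach the runbook from wherever the alert's annotations are displayed.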
- relates to: MCO-427 Add missing runbooks for Prometheus rules (Closed)
- links to