- Epic
- Resolution: Unresolved
A list of ideas that would be good to work on one day, but are not targeted at any specific release.
User stories:
- If a node becomes unhealthy, an administrator wants to completely isolate the node from the cluster to prevent it from accessing shared resources such as RWO volumes.
- If a node becomes unhealthy and the workload is not rescheduled elsewhere, an administrator can determine the root cause from the remediator logs.
Previous Work:
Fence Agents Remediation (FAR) was developed as an alternative remediator that can work with Node Healthcheck Operator (NHC). FAR uses the upstream fence-agents from ClusterLabs and a management interface, e.g. IPMI, to communicate with the machines/nodes. This differs from the previous remediator, Self Node Remediation (SNR), which is independent of any management interface and uses a watchdog device.
With SNR we can't keep the node powered off and we can't get the node's power status, since SNR has no control-plane connectivity to the unhealthy node's management interface.
Customers:
- NEC