- Feature Request
- Resolution: Unresolved
- Normal
- None
- openshift-4.11
- False
- None
- False
- Not Selected
1. Proposed title of this feature request
Allow retaining failed worker/infra nodes after MachineHealthCheck recreates a new node
2. What is the nature and description of the request?
When an OpenShift worker/infra node fails, MachineHealthCheck (MHC) creates a replacement node and removes the failed one. This RFE requests an option for MHC to retain the failed node so that it remains available for troubleshooting and recurrence prevention.
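The difference between the current and the requested behavior can be illustrated with a small toy model. This is only a sketch in Python, not OpenShift code: the `Cluster` class, the `remediate` function, and the `retain_failed` flag are hypothetical names invented for illustration, and `retain_failed` does not correspond to any existing MachineHealthCheck field.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a cluster: only the names of active and retained nodes."""
    active: set[str] = field(default_factory=set)
    retained: set[str] = field(default_factory=set)  # failed nodes kept for debugging


def remediate(cluster: Cluster, failed: str, replacement: str, retain_failed: bool) -> None:
    """Replace a failed node and optionally keep it instead of deleting it.

    `retain_failed` is a hypothetical option standing in for the requested
    feature; it is not an existing MachineHealthCheck setting.
    """
    if failed not in cluster.active:
        raise ValueError(f"{failed} is not an active node")
    cluster.active.remove(failed)
    cluster.active.add(replacement)  # the replacement restores capacity either way
    if retain_failed:
        cluster.retained.add(failed)  # requested behavior: keep the node for analysis
    # With retain_failed=False (behavior described above) the failed node is gone.


if __name__ == "__main__":
    cluster = Cluster(active={"worker-0", "worker-1"})
    remediate(cluster, "worker-0", "worker-2", retain_failed=True)
    print(sorted(cluster.active), sorted(cluster.retained))
```

In both modes the replacement node joins the cluster, so availability is unaffected; the only difference is whether the failed node is still around to log in to afterwards.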
3. Why does the customer need this? (List the business requirements here)
When an issue occurs, our customers often ask us to explain its root cause. Logging in to the failed node is one of the most useful methods for root cause analysis, but this is impossible once MHC has deleted the node. MHC is too valuable for maintaining high availability for disabling it to be an option. This feature aims to preserve the information needed for troubleshooting and recurrence prevention while still achieving high availability.
4. List any affected packages or components.
OpenShift 4.x Machine Health Check