-
Bug
-
Resolution: Duplicate
-
Critical
-
None
-
None
-
None
-
False
-
-
False
-
-
-
Important
Description of problem:
1. The Node Health Check and Self Node Remediation operators are installed.
2. One of the nodes was then shut down by directly removing the power cable.
3. After the power cable was reconnected, the node is stuck in the `Ready|SchedulingDisabled` status and the `out-of-service` taint is applied to the node:
~~~
"masternode3.rhocpclusterdc.powergrid.in"
{
  "taints": [
    {
      "effect": "NoExecute",
      "key": "node.kubernetes.io/out-of-service",
      "timeAdded": "2025-03-03T13:00:41Z",
      "value": "nodeshutdown"
    }
  ]
}
~~~
Because of this taint, pods cannot be scheduled on that node.
4. Uncordoning the node brings it to the Ready state, but the taint is still present, so no pods are scheduled on that node.
Version-Release number of selected component (if applicable):
How reproducible:
Yes
Steps to Reproduce:
1. Install the node-healthcheck-operator and the self-node-remediation operator.
2. Reboot a node. The operator applies the `out-of-service` taint to it.
3. Even after the node is healthy again, the operator still considers it unhealthy. The taint remains, and no pods can be scheduled on that node.
Actual results:
Even after the node returns to a healthy state, the operator still considers it unhealthy. The taint remains, and no pods can be scheduled on that node.
Expected results:
Once the node returns to a healthy (Ready) state, the `out-of-service` taint should be removed automatically so that pods can be scheduled on the node again.
Additional info:
The following taints are applied to the affected node by the operator:
~~~
Taints:  medik8s.io/remediation=self-node-remediation:NoExecute
         node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
         node.kubernetes.io/unschedulable:NoSchedule
~~~
Also, the `apiserver` pods are stuck in the Pending state because of this taint on the affected node.
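To make the reported mismatch easier to check, here is a minimal sketch in Python that flags a node whose `Ready` condition is `True` while the `node.kubernetes.io/out-of-service` taint is still present. It assumes the node is available as a standard Kubernetes Node object in JSON form (with `spec.taints` and `status.conditions`). The helper names and the `SAMPLE_NODE` data are hypothetical: the taints are copied from this report, and the `Ready` condition is filled in for illustration only. This is a diagnostic aid, not the operator's own logic.

```python
import json

OUT_OF_SERVICE_KEY = "node.kubernetes.io/out-of-service"


def is_ready(node: dict) -> bool:
    """Return True if the node's Ready condition has status "True"."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") == "True"
    return False


def stale_out_of_service_taints(node: dict) -> list:
    """Return out-of-service taints still present on a node that is Ready."""
    if not is_ready(node):
        return []
    taints = node.get("spec", {}).get("taints") or []
    return [t for t in taints if t.get("key") == OUT_OF_SERVICE_KEY]


# Hypothetical Node object: taints taken from this report,
# Ready condition assumed for illustration.
SAMPLE_NODE = json.loads("""
{
  "metadata": {"name": "masternode3.rhocpclusterdc.powergrid.in"},
  "spec": {
    "taints": [
      {"key": "medik8s.io/remediation", "value": "self-node-remediation",
       "effect": "NoExecute"},
      {"key": "node.kubernetes.io/out-of-service", "value": "nodeshutdown",
       "effect": "NoExecute", "timeAdded": "2025-03-03T13:00:41Z"},
      {"key": "node.kubernetes.io/unschedulable", "effect": "NoSchedule"}
    ]
  },
  "status": {"conditions": [{"type": "Ready", "status": "True"}]}
}
""")

if __name__ == "__main__":
    name = SAMPLE_NODE["metadata"]["name"]
    for taint in stale_out_of_service_taints(SAMPLE_NODE):
        print(f"{name}: Ready but still tainted with "
              f"{taint['key']}={taint.get('value', '')}:{taint['effect']}")
```

In practice the input could come from the JSON output of the node object rather than the inline sample. The function returns an empty list for nodes that are not Ready, since the taint is expected in that case.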