-
Bug
-
Resolution: Done
-
Critical
-
None
-
4.13.z, 4.14, 4.15
-
Critical
-
None
-
False
Description of problem:
Nodes were automatically removed from a vSphere IPI cluster during a network outage. Once the outage was resolved, the nodes appear to have rejoined the cluster after a reboot.
Version-Release number of selected component (if applicable):
4.15.24, 4.14.11 (vSphere IPI)
Steps to Reproduce:
KCM logs:
~~~
2024-10-01T10:50:16.721201726+00:00 stderr F I1001 10:50:16.721189 1 attach_detach_controller.go:585] "Error removing node from desired-state-of-world" node="ocp-prod-s2xk6-infra-sd78r" err="failed to delete node \"ocp-prod-s2xk6-infra-sd78r\" from list of nodes managed by attach/detach controller--the node still contains 4 volumes in its list of volumes to attach"
2024-10-01T10:50:16.833345406+00:00 stderr F I1001 10:50:16.832134 1 attach_detach_controller.go:585] "Error removing node from desired-state-of-world" node="ocp-prod-s2xk6-worker-hkxnh" err="failed to delete node \"ocp-prod-s2xk6-worker-hkxnh\" from list of nodes managed by attach/detach controller--the node still contains 1 volumes in its list of volumes to attach"
2024-10-01T10:50:16.968415270+00:00 stderr F I1001 10:50:16.968408 1 attach_detach_controller.go:585] "Error removing node from desired-state-of-world" node="ocp-prod-s2xk6-worker-k7tfw" err="failed to delete node \"ocp-prod-s2xk6-worker-k7tfw\" from list of nodes managed by attach/detach controller--the node still contains 2 volumes in its list of volumes to attach"
~~~
~~~
$ oc get nodes | grep 3d
ocp-prod-s2xk6-master-2       Ready                      control-plane,master   3d   v1.28.11+add48d0
ocp-prod-s2xk6-worker-hkxnh   Ready                      worker                 3d   v1.28.11+add48d0
ocp-prod-s2xk6-worker-k7tfw   Ready,SchedulingDisabled   worker                 3d   v1.28.11+add48d0
ocp-prod-s2xk6-worker-rmcsp   Ready,SchedulingDisabled   worker                 3d   v1.28.11+add48d0
ocp-prod-s2xk6-infra-sd78r    Ready                      infra,worker           3d   v1.28.11+add48d0
~~~
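For triage, the attach/detach controller errors in the KCM logs above can be condensed into one line per affected node. The following is a minimal sketch, not part of the original report: the function name `summarize_detach_errors` is hypothetical, and it assumes the log lines keep the exact message format shown in this issue.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original report): reads KCM log lines on
# stdin and prints "<node> <volume-count>" for every
# "Error removing node from desired-state-of-world" message.
# Assumes the message format shown in this issue; other lines are ignored.
summarize_detach_errors() {
  sed -n 's/.*"Error removing node from desired-state-of-world" node="\([^"]*\)".*still contains \([0-9][0-9]*\) volumes.*/\1 \2/p'
}
```

One possible use is to pipe a saved KCM log into it, for example `summarize_detach_errors < kcm.log`, where `kcm.log` is a placeholder for wherever the kube-controller-manager log was saved.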
Actual results:
Nodes are automatically removed from the cluster during the network outage.
Expected results:
Nodes should only transition to the NotReady state; they should not be removed from the cluster.
Additional info:
Must-gather of the affected cluster will be uploaded.