Bug
Resolution: Unresolved
Normal
None
4.18.0, 4.19.0, 4.20.0, 4.21.0, 4.22
None
With a 3-node control-plane, ABI waits for 2 of the nodes to come up and join the cluster before rebooting the bootstrap/rendezvous node.
With a 4- or 5-node control-plane, ABI still waits for only 2 of the nodes to come up and join the cluster before rebooting the bootstrap/rendezvous node.
Previously (in OCPBUGS-41811) we made ABI wait for all workers to join before rebooting the bootstrap, because of a race condition that could prevent those nodes from coming up if assisted-service was gone. (This code is specific to ABI, because in a regular assisted install, assisted-service is still available even after the bootstrap is rebooted to become a control-plane node.)
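We are likely vulnerable to the same race with 4- or 5-node control planes. In the ABI case, we should wait for n-1 control-plane nodes to be available before rebooting the bootstrap, where n is the control-plane size (that is, every control-plane node other than the bootstrap itself).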
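To make the proposed rule concrete, here is a minimal sketch of the threshold arithmetic. It is not taken from the installer code: the function names and the `CURRENT_FIXED_THRESHOLD` constant are hypothetical, and the fixed value of 2 simply restates the behaviour described above.

```python
# Hypothetical illustration only; these names do not come from the ABI code base.

# Behaviour described in this report: wait for 2 control-plane nodes
# regardless of how large the control plane is.
CURRENT_FIXED_THRESHOLD = 2


def control_plane_nodes_to_wait_for(control_plane_size: int) -> int:
    """Proposed rule: wait for n-1 control-plane nodes, i.e. all but the bootstrap."""
    if control_plane_size < 1:
        raise ValueError("control plane must have at least one node")
    return control_plane_size - 1


def ready_to_reboot_bootstrap(joined_nodes: int, control_plane_size: int) -> bool:
    """True once enough control-plane nodes have joined under the proposed rule."""
    return joined_nodes >= control_plane_nodes_to_wait_for(control_plane_size)


for size in (3, 4, 5):
    proposed = control_plane_nodes_to_wait_for(size)
    print(
        f"{size}-node control plane: currently waits for "
        f"{CURRENT_FIXED_THRESHOLD}, proposed {proposed}"
    )
```

For a 3-node control plane the two rules agree, which would explain why the gap only shows up with 4 or 5 nodes.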
- is related to: OCPBUGS-41811 4.17 Failed workers reboot in HA topology prevents cluster deployment completion (Closed)