Bug
Resolution: Unresolved
Normal
4.18.z
In the OCP 4.18 "Working with nodes" documentation, the guidance for rebooting a single-node OpenShift (SNO) cluster is contradictory.
1) In “Rebooting a node gracefully”, the procedure is:
- oc adm cordon
- oc adm drain
- reboot (systemctl reboot)
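
For reference, the three steps above can be sketched as a small script. This is an illustrative sketch only, not text copied from the documentation: the `NODE` and `DRY_RUN` variables and the `run` and `graceful_reboot` helpers are hypothetical names, the drain flags shown are common choices rather than ones confirmed for SNO, and running the reboot through `oc debug` is an assumption about how the reboot is triggered. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the cordon -> drain -> reboot sequence described in this report.
# NODE, DRY_RUN, run and graceful_reboot are hypothetical names for illustration.
set -euo pipefail

NODE="${NODE:-<node-name>}"   # placeholder; replace with the real node name
DRY_RUN="${DRY_RUN:-1}"       # 1 = print the commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

graceful_reboot() {
  local node="$1"
  # Mark the node unschedulable.
  run oc adm cordon "$node"
  # Evict pods; on SNO they cannot be rescheduled, but they get time to stop.
  run oc adm drain "$node" --ignore-daemonsets --delete-emptydir-data --force
  # Reboot the host from a debug pod (assumed way of reaching systemctl).
  run oc debug "node/$node" -- chroot /host systemctl reboot
}

graceful_reboot "$NODE"
```

With the default `DRY_RUN=1`, the script prints the three commands in order, which makes it easy to compare against whatever the corrected documentation ends up recommending for SNO.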
This section also mentions SNO:
“In a single-node OpenShift cluster, pods cannot be rescheduled when cordoning and draining. However, doing so gives the pods time to properly stop and release resources.”
This implies that the cordon + drain procedure is the supported and recommended approach, even for SNO.
2) In the section on rebooting SNO clusters without a drain, the doc explains that a reboot without drain can leave pods in the UnexpectedAdmissionError state (for example, pods requesting devices).
However, this section also contains the following note:
“Note: The option to drain the node is unavailable for single-node OpenShift clusters.”
These two sections are in direct conflict:
- Section 1 recommends draining the node and explicitly mentions SNO.
- Section 2 states that draining is unavailable for SNO.
My understanding is that the recommended and supported reboot procedure for SNO is the graceful reboot using cordon + drain. If so, the Note in Section 2 should be clarified or updated to explain what “the option to drain the node is unavailable” means. It might be trying to say that workloads cannot be migrated to another node.