-
Epic
-
Resolution: Unresolved
Reducing Node Reboots to Layered Pools
In Progress
-
67% To Do, 33% In Progress, 0% Done
Prerequisites
- OpenShift cluster with existing layered MachineConfigPool
- MachineOSConfig already configured and image built
- Layered image available in registry
- Understanding of Machine Config Daemon (MCD) behavior
Problem Statement
- In image mode, when you customize a node OS (MachineOSConfig (MOSC) -> MachineOSBuild (MOSB) -> new image), each node must reboot to pick up the new image.
- Today, this can lead to redundant reboots, including:
  - One reboot for base OS updates
  - One reboot for extensions/packages
  - One reboot when a node joins a layered pool
  - Possibly another reboot during cluster upgrades
- Reboots are disruptive, so the ask is for fewer, better-coordinated reboots.
Specifically, we want to target the reboots associated with adding a new node to a layered pool, which require 2 OpenShift-managed reboots:
- Initial Cluster Join: Node joins as worker, gets base configuration, reboots
- Pool Assignment: Node is assigned to layered pool, MCD applies layered image, reboots again
Goal: Achieve a single reboot by having the node join the cluster already assigned to the layered pool, eliminating the intermediate worker pool membership.
Note: In addition to the 2 OpenShift-managed reboots, the node also boots once when it first starts up from the disk image. The full lifecycle, from initial startup to membership in a layered pool, therefore involves 3 boots.
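
The boot counts above can be tallied with a small sketch. This is only an illustrative model, not MCO code: the `Boot` type, the flow lists, and `summarize` are hypothetical names introduced here. The target flow assumes that the initial boot from the disk image still happens and that only the intermediate worker-pool reboot is eliminated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boot:
    """One boot in a node's lifecycle (hypothetical model, not an MCO type)."""
    reason: str
    openshift_managed: bool  # True when OpenShift triggers the reboot


# Current flow, as described in the problem statement.
CURRENT_FLOW = [
    Boot("initial startup from disk image", openshift_managed=False),
    Boot("joins as worker, gets base configuration", openshift_managed=True),
    Boot("assigned to layered pool, layered image applied", openshift_managed=True),
]

# Target flow (assumption): the node joins already assigned to the layered pool.
TARGET_FLOW = [
    Boot("initial startup from disk image", openshift_managed=False),
    Boot("joins directly in layered pool, layered image applied", openshift_managed=True),
]


def summarize(flow):
    """Return total boots and OpenShift-managed reboots for a flow."""
    managed = sum(1 for boot in flow if boot.openshift_managed)
    return {"total_boots": len(flow), "managed_reboots": managed}


if __name__ == "__main__":
    for name, flow in (("current", CURRENT_FLOW), ("target", TARGET_FLOW)):
        print(name, summarize(flow))
```

Under these assumptions, the current flow has 3 total boots with 2 OpenShift-managed reboots, and the target flow has 2 total boots with 1 OpenShift-managed reboot.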