-
Epic
-
Resolution: Can't Do
-
Critical
-
None
-
openshift-4.11, openshift-4.12, openshift-4.13, openshift-4.14
-
Lockstep hibernation with MachineConfigPool upgrades
-
False
-
None
-
False
-
Not Selected
-
In Progress
-
OCPSTRAT-543 - Shutdown/Resume of managed OSD/ROSA clusters
One of the criteria for hibernation in OSD/ROSA is:
Cluster shutdown must be blocked if the MachineConfigPools are in updating state.
There is an inherent timing problem with simply doing
    if isUpgrading(MCO) {
        return errors.New("Can't hibernate during MCO upgrade")
    }
    hibernate()
because an upgrade could kick off between the check and the start of hibernation. It would therefore have to look more like:
    freezeMCOUpgrades()
    if isUpgrading(MCO) {
        unfreezeMCOUpgrades()
        return errors.New("Can't hibernate during MCO upgrade")
    }
    hibernate()

    // ...and then in the resume flow
    resume()
    unfreezeMCOUpgrades()
This assumes a freezeMCOUpgrades() is possible. I'm told it is – but if you freeze in the middle of an upgrade, MCO will finish whatever machine it's on and leave the rest. So some additional coordination will be necessary to figure out how to freeze either before or after that whole process.
I'm also told that in 4.13+, cert rotation is now done independently of the upgrade procedure. Assuming cert rotation is the motivation behind the original restriction ("no hibernation during MCO upgrades"), this may make the issue moot for 4.13+. In that case the logic would need an extra criterion so that the restriction is enforced only for versions before 4.13.
- is blocked by
-
MCO-638 MCO behavior to consider during cluster hibernation
- Closed