Feature Overview and Background

The MCO team has reported several classes of issues with the MCO that can cause support cases or block upgrades. Additionally, there are changes that the team believes will improve the maintainability of the code and make it easier to troubleshoot.

4.7 Phase - Wait for All Worker Pools on Upgrade

Today, when an upgrade is initiated, the CVO reports that the upgrade is complete once the master pool has been upgraded.
Other pools may not have completed due to an upgrade problem or a perfectly valid condition like pausing reconciliation on one or more pools.
Whether the cause is unintentional (an error prevents a worker from upgrading) or intentional (a pool is paused), the administrator can initiate another upgrade before the previous one has been rolled out to the full cluster.
Why this is important
- Cluster administrators can end up in a state where the cluster reports that it is upgraded when, in fact, the upgrade has not fully rolled out. The cluster is then somewhere between releases, especially on the compute side. We want to avoid minor version skew between control plane and compute nodes (z-stream skew, by contrast, is acceptable for k8s). This will lower the number of bug reports the team receives in cases where an admin started an upgrade that degraded the compute pool, did not notice, and moved on to another upgrade, leaving compute at 4.(y-2).
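
The gating idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Pool` class, its field names, and the helper functions are hypothetical and are not the MCO or CVO API. It shows an upgrade being treated as complete only when every pool has finished rolling out, plus a small helper that measures minor version skew between control plane and compute.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Hypothetical, simplified view of a pool (not the real MCO type)."""
    name: str
    machine_count: int
    updated_machine_count: int
    paused: bool = False
    degraded: bool = False


def blocking_pools(pools):
    """Return (name, reason) for each pool that has not finished the rollout."""
    blocked = []
    for pool in pools:
        if pool.degraded:
            blocked.append((pool.name, "degraded"))
        elif pool.updated_machine_count < pool.machine_count:
            reason = "paused" if pool.paused else "updating"
            blocked.append((pool.name, reason))
    return blocked


def upgrade_complete(pools):
    """Treat the upgrade as complete only when no pool is blocking."""
    return not blocking_pools(pools)


def minor_skew(control_plane, compute):
    """Minor-version distance between two 'x.y.z' version strings."""
    cp_minor = int(control_plane.split(".")[1])
    compute_minor = int(compute.split(".")[1])
    return cp_minor - compute_minor


pools = [
    Pool("master", machine_count=3, updated_machine_count=3),
    Pool("worker", machine_count=5, updated_machine_count=2, paused=True),
]
print(upgrade_complete(pools))        # False
print(blocking_pools(pools))          # [('worker', 'paused')]
print(minor_skew("4.7.0", "4.5.12"))  # 2
```

In this sketch a paused worker pool keeps the upgrade from being reported as complete, which is the behavior the 4.7 phase asks for. Whether a paused pool should block or only warn is a design decision the ticket leaves open.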

Future work

Fault Tolerant MCD - https://issues.redhat.com/browse/GRPA-2682

Best Effort Upgrade on Degraded MCO - https://issues.redhat.com/browse/GRPA-1641

Rework Kubeletconfig and Containerruntimeconfig Controllers - https://issues.redhat.com/browse/GRPA-2679

Validate pullsecret before writing it - https://issues.redhat.com/browse/GRPA-2699

Also related: Bootimage Updates - https://issues.redhat.com/browse/GRPA-2680

Assignee:: Mark Russell

Reporter:: Mark Russell

Votes:: 0

Watchers:: 7

Created:: 2020/08/27 7:45 PM

Updated:: 2025/12/20 2:13 AM

Resolved:: 2022/10/07 2:12 AM

[Sept 3. Note: this might need to be broken into 2 issues]