Type: Epic
Resolution: Done
Priority: Critical
Fix Version/s: None
Affects Version/s: None
Labels:
- mco_doc_required
- mco_qe_required

Epic Name:
Admin-defined reboot and drain
Story Points:
5
Blocked:
False
Blocked Reason:
None
Ready:
False
Color Status:
Not Selected
Epic Status:
Done
Feature Link:
OCPSTRAT-380 - Admin-defined node disruption: Tech Preview
Parent Link:
OCPSTRAT-380 - Admin-defined node disruption: Tech Preview
Hierarchy Progress Bar:

0% To Do, 4% In Progress, 96% Done
Size:
M
Target Version:

openshift-4.16

Cost of Delay:
0
WSJF:
0.000

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Intelligence Requested:
Market:

This epic is another epic under the "reduce workload disruptions" umbrella.

This is now updated to get us most of the way to ~~MCO-200~~ (Admin-Defined reboot & drain), but not necessarily with all the final features in place.

This epic aims to create a reboot/drain policy object and an MCO-management apparatus that provide initial functionality for MachineConfig-backed updates, with a restricted set of actions available to the user. We also need a reboot/drain policy object for ImageContentSourcePolicy, ImageTagMirrorSet and ImageDigestMirrorSet, to avoid drains/reboots when admins use these APIs and have other ways of ensuring image integrity.

This epic mostly focuses on the user interface for defining reboot/drain policies. We will also need this for the layering "live apply" cases and bifrost-backed updates, which are to be implemented in a future update.

The MCO's reboot and drain rules are currently hard-coded in the machine-config-daemon here.

According to RFE-3667, node drains still occur beyond OCP 4.9, not only when adding but also when removing ICSP, ITMS, or IDMS objects or individual mirroring rules in their configuration.

This causes at least three problems:

- A user does not know what the rules are unless they read the code (the rules aren't visible to the user).
- The controller can't see the rules to "pre-compute" the effect that a MachineConfig will have on a Node before that MachineConfig is delivered (which makes it hard for a user to know what will actually happen if they apply a config).
- The only way for a template owner to mark their config as "does not require reboot" is to edit the MCD code.

Done when:

- A CRD is defined for post-config action policies covering both MCO and ICSP/ITMS/IDMS APIs.
- The existing daemon rules are broken out into one of these resources.
- The reboot/drain policies are visible in the cluster (e.g. "oc get rebootpolicies").
- The drain controller handles processing and validation of the user's policies (and could put the computed post-config actions in the MachineConfig's and ICSP/ITMS/IDMS status, or in the custom image's metadata if layering).
- A template owner has a procedure to mark whether a change to their template config does or does not require a reboot.
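
To illustrate why moving the rules out of the daemon code helps, here is a minimal sketch of how a controller could "pre-compute" post-config actions from declarative rules. Everything in it is an assumption made for illustration: the `PolicyRule` type, the action names, the glob matching, and the drain-plus-reboot fallback are hypothetical and are not the schema or behavior defined by the actual enhancement or the MCO.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Hypothetical action names, ordered from least to most disruptive.
ACTION_RANK = {"None": 0, "Reload": 1, "Restart": 2, "Drain": 3, "Reboot": 4}

# Assumed fallback when no rule matches a changed file.
DEFAULT_ACTIONS = ("Drain", "Reboot")


@dataclass(frozen=True)
class PolicyRule:
    """One hypothetical policy entry: a file path glob and the actions it requires."""
    path_glob: str
    actions: tuple


def compute_actions(changed_paths, rules):
    """Return the combined post-config actions for a set of changed file paths."""
    result = set()
    for path in changed_paths:
        # First matching rule wins; unmatched paths fall back to the default.
        matched = next((r for r in rules if fnmatchcase(path, r.path_glob)), None)
        result.update(matched.actions if matched else DEFAULT_ACTIONS)
    if not result:
        return ["None"]
    if len(result) > 1:
        # "None" only makes sense when nothing else is required.
        result.discard("None")
    return sorted(result, key=ACTION_RANK.__getitem__)


rules = [
    PolicyRule("/etc/containers/registries.conf", ("None",)),
    PolicyRule("/etc/chrony.conf", ("Restart",)),
]
print(compute_actions(["/etc/containers/registries.conf"], rules))
print(compute_actions(["/etc/some/other/file"], rules))
```

Because the rules are plain data, the same function could run in a controller before a config is delivered, and the result could be surfaced to the user instead of being discoverable only by reading daemon code.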

is related to

MCO-517 Prevent node availabilty check when the kubelet is shutdown

Closed

OCPSTRAT-380 Admin-defined node disruption: Tech Preview

Closed

OCPSTRAT-1026 Admin-defined node disruption policies: Phase 2 (GA)

Closed

MCO-474 Investigate MCO reboot behavior when machine-os content hasn't changed during upgrade

Closed

relates to

OCPBUGS-32783 NodeDisruptionPolicy action reload cannot take effect

Verified

OCPBUGS-32511 NodeDisruptionPolicyStatus was not ready context deadline exceeded

Verified

OCPBUGS-32739 MachineConfigurations is only effective with name <cluster>

Closed

links to

openshift/enhancements#1525: MCO-507: admin defined node disruption policy enhancement

openshift/machine-config-operator#4267: MCO-1009: MCO-1008: MCO-905: Implement NodeDisruptionPolicy API

openshift/openshift-docs#77526: Document MCO drain behavior


Assignee: Yu Qi Zhang

Reporter: John Kyros

Contributors: Team MCO

QA Contact: Rio Liu

Votes: 0

Watchers: 8

Created: 2023/02/21 2:17 AM

Updated: 2024/06/28 1:57 PM

Resolved: 2024/06/28 1:56 PM
