Type: Feature
Resolution: Done
Priority: Critical
Fix Version/s: openshift-4.17
Affects Version/s: None
Component/s: OS
Labels:

Activity Type:
Product / Portfolio Work
Parent Link:
None
Hierarchy Progress Bar:

0% To Do, 0% In Progress, 100% Done
Blocked:
False
Blocked Reason:
None
Ready:
False
Size:
None

Target Version:
None
Release Blocker:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
PX Priority Data:
PX Impact Score:
PX Technical Impact:
None
PX Impact Range:
None
PX Scheduling Request:
None
PX Technical Impact Notes:

Release Note Text:
Undefined

Phase 2 Deliverable:

GA support for a generic interface for administrators to define custom reboot/drain suppression rules.

Epic Goal

Allow administrators to define which machineconfigs won't cause a drain and/or reboot.
Allow administrators to define which ImageContentSourcePolicy/ImageTagMirrorSet/ImageDigestMirrorSet changes won't cause a drain and/or reboot.
Allow administrators to define alternate actions (typically restarting a system daemon) to take instead.
Possibly (pending discussion) add a switch that lets the administrator choose a kexec "restart" instead of a full hardware reset via reboot.

Why is this important?

There is a demonstrated need from customer cluster administrators to push configuration settings and restart system services without restarting each node in the cluster.
Customers are modifying existing ICSP/ITMS/IDMS objects, or adding new ones, after day 1.
(kexec - we have not committed to this point yet) Server-class hardware with various add-in cards can take 10 minutes or longer in BIOS/POST. Skipping this step would dramatically speed up bare metal rollouts, to the point that upgrades would proceed about as fast as cloud deployments. The downside is potential problems with hardware and driver support, in-flight DMA operations, and other unexpected behavior. OEMs and ODMs may or may not support their customers with this path.

Scenarios

As a cluster admin, I want to reconfigure sudo without disrupting workloads.
As a cluster admin, I want to update or reconfigure sshd and reload the service without disrupting workloads.
As a cluster admin, I want to remove mirroring rules from an ICSP, ITMS, or IDMS object without disrupting workloads, because the scenario in which this might lead to non-pullable images at an undefined later point in time doesn't apply to me.
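
The scenarios above can be read as a lookup from changed files to admin-defined actions, with drain and reboot as the fallback when no rule applies. The following is a minimal Python sketch of that idea only; it is not the MCO implementation, and the `Rule` type, the `resolve_actions` helper, the action strings, and the file paths are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Assumption: without a matching rule, a change drains and reboots the node.
DEFAULT_ACTIONS: Tuple[str, ...] = ("Drain", "Reboot")


@dataclass(frozen=True)
class Rule:
    """Hypothetical admin-defined rule: a file path and the actions to take for it."""
    path: str
    actions: Tuple[str, ...]


def resolve_actions(changed_paths: Iterable[str], rules: Iterable[Rule]) -> List[str]:
    """Return the actions to run for a set of changed files.

    If any changed file has no rule, fall back to the default drain and
    reboot. Otherwise merge the per-file actions, dropping "None" entries
    unless nothing else is required.
    """
    table = {rule.path: rule.actions for rule in rules}
    merged: List[str] = []
    for path in changed_paths:
        if path not in table:
            return list(DEFAULT_ACTIONS)
        for action in table[path]:
            if action != "None" and action not in merged:
                merged.append(action)
    return merged or ["None"]


# Illustrative rules for the sudo and sshd scenarios (paths are assumptions).
rules = [
    Rule("/etc/sudoers.d/custom", ("None",)),
    Rule("/etc/ssh/sshd_config", ("Reload:sshd.service",)),
]
```

With these rules, a change that touches only the sudo file resolves to no disruption, a change to the sshd configuration resolves to a service reload, and any change that also touches a file without a rule falls back to drain and reboot.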

Acceptance Criteria

CI - MUST be running successfully with tests automated
Release Technical Enablement - Provide necessary release enablement details and documents.
...

Dependencies (internal and external)

Previous Work (Optional):

Open questions:

Done Checklist

CI - CI is running, tests are automated and merged.
Release Enablement <link to Feature Enablement Presentation>
DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
DEV - Downstream build attached to advisory: <link to errata>
QE - Test plans in Polarion: <link or reference to Polarion>
QE - Automated tests merged: <link or reference to automated tests>
DOC - Downstream documentation merged: <link to meaningful PR>

clones

OCPSTRAT-380 Admin-defined node disruption: Tech Preview

Closed

is cloned by

OCPSTRAT-1550 Enhanced admin-defined reboot & drain policies

Refinement

is related to

RFE-4079 Configurable rebootless MachineConfigs

Approved

MCO-474 Investigate MCO reboot behavior when machine-os content hasn't changed during upgrade

Closed

relates to

CNV-35883 Enable defining schedule/acks/tuning for workloadUpdateStrategy

MCO-507 Admin-defined node disruption - Tech Preview

Closed

RFE-3549 Method to make simple configuation changes without forcing reboots

Closed

RFE-4661 Allow user to opt-out of IDMS / ITMS node drain on entry removal

Closed

links to

MCO-1125: OCPBUGS-35277: Allow paths to be defined for non-disruptive updates

openshift/enhancements#1525: MCO-507: admin defined reboot policy enhancement

openshift/machine-config-operator#4496: MCO-1065: MCO-1171: API bump for ManagedBootImages and NodeDisruptionPolicy GA

openshift/openshift-docs#81321: OCPSTRAT 1026:Admin-defined reboot & drain policies: Phase 2 (GA)

Assignee: Mark Russell

Reporter: Mark Russell

Doc Contact: Matthew Werner

Product Operations Engineering Contact: Derrick Ornelas

Need Info From: None

Votes: 1

Watchers: 13

Created: 2023/12/06 8:17 PM

Updated: 2025/12/29 10:22 AM

Resolved: 2024/09/30 1:53 PM

Target end: 2024/08/13
