Type: Feature
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: ROSA Classic
Labels:

Ready: False
Color Status: Not Selected
Hierarchy Progress: 0%
Status Summary:

This feature continues to be in backlog due to other higher priority features. If your customer needs this feature, please link the case or opportunity to help reprioritization.


Risk Score: 0

SFDC Cases Links:
SFDC Cases Counter:

Intelligence Requested:
Market:

Feature Overview (aka. Goal Summary)

This feature adds a configurable cluster update/upgrade strategy: a new setting allows multiple cluster nodes to be upgraded in parallel, which helps reduce the overall upgrade time.

Goals (aka. expected user outcomes)

Customers can set a non-zero value for the Machine Config Pool parameters maxUnavailable and maxSurge at the cluster level. The cluster upgrade uses these values to upgrade as many nodes in parallel as they allow. This brings ROSA/OSD clusters to parity with self-managed OCP.
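To illustrate why parallel node upgrades shorten the window, here is a rough back-of-the-envelope sketch. It assumes every node takes about the same time to upgrade and that a fixed number of nodes is in flight at once; the function name and the nodes_in_parallel parameter are hypothetical and not part of any ROSA or OCP API. It ignores drain time, PDBs, and capacity limits.

```python
import math


def estimate_upgrade_rounds(node_count: int, nodes_in_parallel: int = 1) -> int:
    """Estimate how many sequential upgrade rounds a pool needs.

    Simplified model (an assumption, not the actual upgrade algorithm):
    each round upgrades up to `nodes_in_parallel` nodes, and the next
    round starts only when the previous one has finished.
    """
    if node_count < 0:
        raise ValueError("node_count must not be negative")
    if nodes_in_parallel < 1:
        raise ValueError("nodes_in_parallel must be at least 1")
    return math.ceil(node_count / nodes_in_parallel)


if __name__ == "__main__":
    # Compare one-at-a-time upgrades with 2 and 3 nodes in parallel.
    for parallel in (1, 2, 3):
        rounds = estimate_upgrade_rounds(12, parallel)
        print(f"nodes_in_parallel={parallel}: {rounds} rounds")
```

Under this model, a 12-node pool goes from 12 rounds to 4 when three nodes are upgraded at a time.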

Requirements (aka. Acceptance Criteria):

Both maxUnavailable and maxSurge are:

Configurable at the machine pool level
Applicable only to the machine pools or worker nodes that the customer creates and manages
Limited to a short range of values, (1,3), to begin with
Set to 1 by default (no change to defaults)
OCM UI, CAPA/CAPI, ROSA CLI, and Terraform support configuring this field.
1. With Terraform, this would be a parameter of the ROSA cluster resource.
Documentation will need an update in the upgrade section describing the parameter, what it does, and why it may be useful.
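The value constraints above could be checked along these lines. This is only a sketch: the class and field names are hypothetical, it reads the range (1,3) as 1 through 3 inclusive, and it does not reflect how OCM, the ROSA CLI, or Terraform actually validate input.

```python
from dataclasses import dataclass

# Assumed reading of the "(1,3)" range in the requirements: 1 to 3 inclusive.
MIN_VALUE = 1
MAX_VALUE = 3


@dataclass(frozen=True)
class MachinePoolUpgradeSettings:
    """Hypothetical per-machine-pool upgrade settings."""

    max_unavailable: int = 1  # default of 1 keeps today's behaviour
    max_surge: int = 1        # default of 1 keeps today's behaviour

    def __post_init__(self) -> None:
        # Reject anything outside the initial supported range.
        for name in ("max_unavailable", "max_surge"):
            value = getattr(self, name)
            if not isinstance(value, int) or not MIN_VALUE <= value <= MAX_VALUE:
                raise ValueError(
                    f"{name} must be an integer between {MIN_VALUE} "
                    f"and {MAX_VALUE}, got {value!r}"
                )


if __name__ == "__main__":
    print(MachinePoolUpgradeSettings())                   # defaults
    print(MachinePoolUpgradeSettings(max_unavailable=3))  # upper bound
```

Rejecting out-of-range values at construction time means an invalid setting fails before an upgrade is ever scheduled.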

Use Cases (Optional):

Cluster administrators agree on planned maintenance windows with the business and want to keep each window as short as possible. The safety of the cluster still sets the limit, but availability of the services during the window is not a constraint.
The workloads have restrictive PDBs (maxUnavailable=0%), so safely draining one node at a time delays the upgrade or causes it to fail. The maintenance window is scheduled for a time when these workloads don't run, and it must finish before the workloads start again.
Administrators already use this capability on self-managed OCP clusters and prefer to follow the same operations across environments when migrating workloads to managed cloud services.

Assignee: Balachandran Chandrasekaran

Reporter: Balachandran Chandrasekaran

QA Contact: Zhe Wang

Votes: 2

Watchers: 12

Created: 2023/06/01 1:05 AM

Updated: 2024/04/30 12:07 AM
