Type: Feature
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: Hosted Control Planes
Labels:

Work Type: BU Product Work
Blocked: False
Blocked Reason: None
Ready: False
Hierarchy Progress Bar: 100% To Do, 0% In Progress, 0% Done
Business Value: 8
Risk Score: 0


Feature Overview

Enable autoscaling from/to zero for NodePools of Hosted Clusters, allowing customers to manage resources efficiently and reduce costs when compute is not needed. This depends on CAPI support for autoscaling from/to zero, which should be incorporated into the HyperShift operator logic.

Goals

Allow HCP node pools to scale down to zero nodes when not in use
Enable autoscaling to automatically provision nodes when workloads require them
Provide a seamless experience for users managing compute resources in HCP clusters

Primary user type: Cluster Service Consumers

Expands on existing features: Node pool management and autoscaling capabilities

Requirements

Implement the ability to set min-replicas=0 and autoscale=y simultaneously for HCP node pools
Ensure proper draining of nodes when scaling down to zero
Handle degraded operators gracefully when no data plane compute is available
Implement changes in the cluster-autoscaler machinery to support scaling from/to zero
(optional/desired) Optimize performance to minimize the time required to scale up from zero
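
The first requirement can be illustrated with a minimal sketch of the bounds check and the autoscaler hints a node pool with min-replicas=0 would need. This is an illustration under stated assumptions, not the HyperShift implementation: `AutoscalingBounds`, `validate_bounds`, and `scale_from_zero_annotations` are hypothetical names. The annotation keys follow the upstream cluster-autoscaler Cluster API provider documentation; whether and where HyperShift would set them is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Annotation keys as documented for the upstream cluster-autoscaler
# Cluster API provider. How HyperShift would apply them is an assumption.
MIN_SIZE_KEY = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
MAX_SIZE_KEY = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"
CPU_KEY = "capacity.cluster-autoscaler.kubernetes.io/cpu"
MEMORY_KEY = "capacity.cluster-autoscaler.kubernetes.io/memory"


@dataclass(frozen=True)
class AutoscalingBounds:
    """Hypothetical stand-in for a node pool's autoscaling settings."""
    min_replicas: int
    max_replicas: int


def validate_bounds(bounds: AutoscalingBounds) -> Optional[str]:
    """Return an error message, or None when the bounds are acceptable.

    A minimum of zero is allowed; only negative minimums are rejected.
    """
    if bounds.min_replicas < 0:
        return "min_replicas must not be negative"
    if bounds.max_replicas < 1:
        return "max_replicas must be at least 1"
    if bounds.min_replicas > bounds.max_replicas:
        return "min_replicas must not exceed max_replicas"
    return None


def scale_from_zero_annotations(
    bounds: AutoscalingBounds, cpu: str, memory: str
) -> Dict[str, str]:
    """Build the size and capacity annotations for a scalable resource.

    With zero nodes there is no live node to inspect, so the capacity
    hints tell the autoscaler what a new node would provide.
    """
    error = validate_bounds(bounds)
    if error is not None:
        raise ValueError(error)
    return {
        MIN_SIZE_KEY: str(bounds.min_replicas),
        MAX_SIZE_KEY: str(bounds.max_replicas),
        CPU_KEY: cpu,
        MEMORY_KEY: memory,
    }
```

For example, `scale_from_zero_annotations(AutoscalingBounds(0, 3), cpu="4", memory="16G")` yields a minimum size of "0" together with the capacity hints, while `AutoscalingBounds(4, 3)` is rejected because the minimum exceeds the maximum.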

Deployment considerations

Self-managed, managed, or both: Both
Classic (standalone cluster): N/A
Hosted control planes: Applicable
Multi node, Compact (three node), or Single node (SNO), or all: N/A
Connected / Restricted Network: Both
CPU Architectures: All
Operator compatibility: Ensure compatibility with relevant operators
Backport needed: To be determined based on priority and release schedule
UI need: To be tracked in a separate OCM Jira

Use Cases

Cost optimization: Customers can scale down to zero nodes during off-hours or low-demand periods
On-demand scaling: Automatically provision nodes when workloads require them
Development and testing: Easily spin up and down compute resources for development and testing environments

Questions to Answer

What changes are required in the cluster-autoscaler machinery to support scaling from/to zero?
How will we handle the transition from zero nodes to active autoscaling?
What impact will this feature have on cluster startup time when scaling up from zero?
Are there any potential security implications of allowing clusters to scale to zero nodes?

Out of Scope

Implementing this feature for non-HCP clusters
Full hibernation functionality, including the control plane (as mentioned in the discussion)

Background

This feature is being requested to provide more flexibility and cost-efficiency for HCP cluster management. It builds upon the existing autoscaling capabilities and addresses limitations in current node pool management when combined with autoscaling.

Customer Considerations

Provide clear documentation on the implications of scaling to zero (e.g., degraded operators)
Ensure a smooth user experience when transitioning between zero and active nodes
Consider potential impact on SLAs and cluster responsiveness

Documentation Considerations

Create new documentation explaining how to enable and use autoscaling from/to zero
Update existing node pool and autoscaling documentation to include this new functionality
Provide best practices and considerations for using this feature

Interoperability Considerations

Ensure compatibility with ROSA HCP
Verify interoperability with other OpenShift components and operators
Consider the impact on monitoring and logging systems when scaling to/from zero

Assignee: Unassigned
Reporter: Alberto Garcia Lamela
Doc Contact: Matthew Werner
Votes: 1
Watchers: 9
Created: 2023/08/01 2:55 PM
Updated: 2024/12/11 10:41 PM
