Feature
Resolution: Unresolved
Major
Product / Portfolio Work
- Color Status: Green
- Status summary:
- Pending the HyperShift Operator (HO) annotator needed to support 4.18, 4.19 and 4.20 Hosted Clusters (HC)
- Risks:
Feature Overview
Enable autoscaling from/to zero, introduced in OCPSTRAT-1525, for the NodePools of Hosted Clusters on releases 4.18, 4.19 and 4.20.
Goals
- Allow HCP node pools to scale down to zero nodes when not in use
- Enable autoscaling to automatically provision nodes when workloads require them
- Provide a seamless experience for users managing compute resources in HCP clusters
Primary user type: Cluster Service Consumers
Expands on existing features: Node pool management and autoscaling capabilities
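The first two goals can be read as a simple replica-count decision. The following is a minimal sketch under stated assumptions, not the cluster-autoscaler's actual algorithm: `desired_replicas`, `pods_per_node`, and the other parameter names are hypothetical and exist only for illustration.

```python
import math


def desired_replicas(pending_pods: int, running_pods: int,
                     pods_per_node: int, min_replicas: int,
                     max_replicas: int) -> int:
    """Return an illustrative target node count for a node pool.

    Hypothetical model: every pod needs one slot and each node offers
    `pods_per_node` slots. A real autoscaler also accounts for CPU,
    memory, taints, and scale-down delays.
    """
    if pods_per_node <= 0:
        raise ValueError("pods_per_node must be positive")
    if not 0 <= min_replicas <= max_replicas:
        raise ValueError("require 0 <= min_replicas <= max_replicas")

    # Nodes needed to host all pods, rounded up to whole nodes.
    total_pods = pending_pods + running_pods
    needed = math.ceil(total_pods / pods_per_node)

    # Clamp to the configured bounds; min_replicas may be zero.
    return max(min_replicas, min(needed, max_replicas))
```

In this model, a pool with `min_replicas=0` and no pending or running pods resolves to zero nodes, and a single pending pod brings it back to one node.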
Requirements
- Implement the ability to set min-replicas=0 and autoscale=y simultaneously for HCP node pools
- Ensure proper draining of nodes when scaling down to zero
- Handle degraded operators gracefully when no data plane compute is available
- Implement changes in the cluster-autoscaler machinery to support scaling from/to zero
- (optional/desired) Optimize performance to minimize the time required to scale up from zero
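As an illustration of the first requirement, the sketch below validates a node pool's scaling settings so that a minimum of zero is accepted together with autoscaling. The `NodePoolScaling` type, its field names, and the rule that a zero minimum requires autoscaling are assumptions made for this example; they are not the HyperShift or OCM API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodePoolScaling:
    """Hypothetical, simplified view of a node pool's scaling settings."""
    autoscale: bool
    min_replicas: int
    max_replicas: int


def validate_scaling(cfg: NodePoolScaling) -> List[str]:
    """Return a list of validation errors (empty if the settings are valid)."""
    errors = []
    if cfg.min_replicas < 0:
        errors.append("min_replicas must not be negative")
    if cfg.max_replicas < 1:
        errors.append("max_replicas must be at least 1")
    if cfg.min_replicas > cfg.max_replicas:
        errors.append("min_replicas must not exceed max_replicas")
    if cfg.min_replicas == 0 and not cfg.autoscale:
        # Assumption for this sketch: a zero minimum only makes sense
        # when the autoscaler can bring nodes back on demand.
        errors.append("min_replicas=0 requires autoscaling to be enabled")
    return errors
```

For example, `validate_scaling(NodePoolScaling(autoscale=True, min_replicas=0, max_replicas=3))` returns an empty list, while the same settings with `autoscale=False` report one error.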
Deployment considerations
- Self-managed, managed, or both: Both
- Classic (standalone cluster): N/A
- Hosted control planes: Applicable
- Multi node, Compact (three node), or Single node (SNO), or all: N/A
- Connected / Restricted Network: Both
- CPU Architectures: all
- Operator compatibility: Ensure compatibility with relevant operators
- Backport needed: To be determined based on priority and release schedule
- UI need: should be tracked in a separate OCM Jira
Use Cases
- Cost optimization: Customers can scale down to zero nodes during off-hours or low-demand periods
- On-demand scaling: Automatically provision nodes when workloads require them
- Development and testing: Easily spin up and down compute resources for development and testing environments
Out of Scope
- Implementing this feature for non-HCP clusters
- Full Hibernation functionality, including the control-plane (as mentioned in the discussion)
Background
This feature is being requested to provide more flexibility and cost-efficiency for HCP cluster management. It builds upon the existing autoscaling capabilities and addresses limitations in current node pool management when combined with autoscaling.
Customer Considerations
- Provide clear documentation on the implications of scaling to zero (e.g., degraded operators)
- Ensure a smooth user experience when transitioning between zero and active nodes
- Consider potential impact on SLAs and cluster responsiveness
Documentation Considerations
- Create new documentation explaining how to enable and use autoscaling from/to zero
- Update existing node pool and autoscaling documentation to include this new functionality
- Provide best practices and considerations for using this feature
Interoperability Considerations
- Ensure compatibility with ROSA HCP
- Verify interoperability with other OpenShift components and operators
- Consider the impact on monitoring and logging systems when scaling to/from zero
- clones: OCPSTRAT-1525 Enable autoscaler from/to zero on Hypershift (In Progress)