Type: Bug
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.22.0, 4.21.z
Component/s: Node / Kubelet
Labels: None

Activity Type: None
Blocked: False
Blocked Reason: None
Story Points: None
Severity: None
Regression: None

Target Backport Versions: 4.21.z
Target Version: 4.22.0
Release Blocker: None
Sprint: None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status: None
Release Note Type: None
Release Note Text: None

Escape Reason: None
Escape Impact: None
Corrective Measures: None
SDLC stage when should've been found: None

Description of problem:

On OpenShift clusters created by Hive with autoscaling enabled, the Cluster Autoscaler does not scale down after a workload that triggered a scale-up is removed. The worker count stays at the scaled-up value instead of returning to the MachinePool minimum.

Version-Release number of selected component (if applicable):

quay.io/openshift-release-dev/ocp-release:4.21.0-x86_64 - Failed
registry.ci.openshift.org/ocp/release:4.22.0-0.nightly-2026-01-31-082403 - Failed
quay.io/openshift-release-dev/ocp-release:4.20.13-x86_64 - Passed

How reproducible:

    Always

Steps to Reproduce:

1. Create a Hive ClusterDeployment with a MachinePool that has autoscaling enabled (minReplicas=10, maxReplicas=12) on a multi-AZ platform (e.g. AWS with 3 AZs).
2. Wait for the cluster to install and scale to the minimum of 10 workers; the MachineSets stabilize at [4, 3, 3].
3. Deploy a workload on the spoke cluster (e.g. a busybox deployment) that consumes enough capacity for the autoscaler to scale up to 12 workers; the MachineSets become [4, 4, 4].
4. Delete the busybox deployment.
5. Check whether the autoscaler scales the pool back down to the minimum of 10 workers.
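
The per-MachineSet replica counts in the steps above can be reproduced with a small sketch. This is only an illustration of the arithmetic, not Hive's actual code; the `distribute` helper and its rule of handing any remainder to the first zones are assumptions chosen to match the layouts observed in this report.

```python
def distribute(total: int, zones: int) -> list[int]:
    """Split `total` replicas across `zones`.

    Assumption: any remainder goes to the first zones, which matches the
    [4, 3, 3] and [4, 4, 4] layouts reported in this issue.
    """
    base, extra = divmod(total, zones)
    return [base + 1 if i < extra else base for i in range(zones)]


min_layout = distribute(10, 3)  # pool minReplicas=10 over 3 AZs
max_layout = distribute(12, 3)  # pool maxReplicas=12 over 3 AZs
print(min_layout, max_layout)
```

With minReplicas=10 and maxReplicas=12 over three zones, this yields the same per-MachineSet minimum and maximum counts as seen in steps 2 and 3.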

cluster-autoscaler logs:
I0203 23:01:16.484087       1 eligibility.go:163] Node ip-10-0-46-119.us-east-2.compute.internal unremovable: memory requested (52.8758% of allocatable) is above the scale-down utilization threshold
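
When triaging many such log lines, it can help to extract the node name and utilization figure programmatically. The following is a minimal sketch: the regular expression is written against the single log line quoted above, and `ASSUMED_THRESHOLD = 0.5` is an assumption (the upstream cluster-autoscaler default for the scale-down utilization threshold); this report does not state the threshold configured on the cluster.

```python
import re

# Log line copied from this report.
LOG_LINE = (
    "I0203 23:01:16.484087       1 eligibility.go:163] Node "
    "ip-10-0-46-119.us-east-2.compute.internal unremovable: memory requested "
    "(52.8758% of allocatable) is above the scale-down utilization threshold"
)

# Pattern derived from the one line above; other message variants may differ.
UNREMOVABLE_RE = re.compile(
    r"Node (?P<node>\S+) unremovable: (?P<resource>\w+) requested "
    r"\((?P<percent>[\d.]+)% of allocatable\) "
    r"is above the scale-down utilization threshold"
)

# Assumption: upstream default threshold; the cluster's real value is unknown.
ASSUMED_THRESHOLD = 0.5


def parse_unremovable(line: str):
    """Return (node, resource, percent) or None if the line does not match."""
    match = UNREMOVABLE_RE.search(line)
    if match is None:
        return None
    return match["node"], match["resource"], float(match["percent"])


node, resource, percent = parse_unremovable(LOG_LINE)
above = percent / 100 > ASSUMED_THRESHOLD
print(node, resource, percent, above)
```

Under the assumed threshold, the node's memory requests (about 52.9% of allocatable) sit just above the cutoff, which is why the autoscaler marks it unremovable.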

Actual results:

    Scale-down to 10 never occurs; worker count remains 12.

Expected results:

Cluster Autoscaler should consider all worker nodes in all three MachineSets for scale-down.
Two of the three MachineSets have min=3 (the pool minimum of 10 is distributed as 4+3+3).
The autoscaler should be able to remove one node from each of those two MachineSets, reducing the total worker count from 12 to 10.
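
The expected outcome follows from the per-MachineSet headroom. A short sketch of that arithmetic, using the replica counts from this report (the variable names are illustrative only):

```python
current = [4, 4, 4]  # MachineSet replicas after scale-up (12 workers)
minimum = [4, 3, 3]  # per-MachineSet minimums (pool minimum of 10)

# Nodes each MachineSet could give up without going below its minimum.
removable = [max(c - m, 0) for c, m in zip(current, minimum)]
expected_workers = sum(current) - sum(removable)
print(removable, expected_workers)
```

Only the two MachineSets with min=3 have headroom, one node each, so the pool should settle at 10 workers.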

Additional info:

slack: https://redhat-internal.slack.com/archives/CE3ETN3J8/p1770133981878329

Please refer to HIVE-3068 for more log file information.

is depended on by: HIVE-3068 Cluster Autoscaler scale-down stuck (To Do)

relates to: HIVE-3068 Cluster Autoscaler scale-down stuck (To Do)

links to: openshift/machine-config-operator#5716: OCPBUGS-75869: kubelet: Less aggressive low memory reservation

Assignee: Node Team Bot Account

Reporter: Mingxia Huang

Need Info From: None

Votes: 0

Watchers: 7

Created: 2026/02/04 12:02 AM

Updated: 2026/02/27 3:50 PM
