Details

Type: Bug
Resolution: Obsolete
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.16.0
Component/s: HyperShift
Labels:
- triaged

Severity:
Critical
Regression:
No
Sprint:
Hypershift Sprint 252, Hypershift Sprint 253
sprint_count:
2
Release Blocker:
Proposed
Blocked:
False
Blocked Reason:

None


Description

Description of problem:

   When using cluster size tagging with a number of placeholder pods configured to keep nodes warm, pods of 2 deployments can in some cases be scheduled in different zones but on the same subnet pair (1 pod of each deployment). The second pod of each deployment then has nowhere to go, so the deployments never finish rolling out.

Version-Release number of selected component (if applicable):

    4.16.0 - main

How reproducible:

Sometimes

Steps to Reproduce:

    1. Set up a management cluster with request-serving machinesets.
    2. Configure clustersizingconfiguration to have at least 2 pairs of warm nodes for the smallest size.

Actual results:

    For a given pair of deployments, 1 pod of each deployment is scheduled and the other pod of each deployment is left in Pending.

Expected results:

    For a given pair of deployments, all pods are scheduled.

Additional info:

      Example state:

$ oc get pods -n hypershift-request-serving-node-placeholders
NAME                                   READY   STATUS    RESTARTS   AGE
placeholder-small-0-7b696687d8-cnf6r   0/1     Pending   0          4h59m
placeholder-small-0-7b696687d8-qb2bz   1/1     Running   0          4h59m
placeholder-small-1-575fc487df-8rgxd   0/1     Pending   0          4h59m
placeholder-small-1-575fc487df-9bt7n   1/1     Running   0          4h59m

The running pods from placeholder-small-0 and placeholder-small-1 are scheduled on nodes with the same osd-fleet-manager.openshift.io/paired-nodes label value, so the remaining pending pods will never schedule.
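
The stuck state described above can be checked mechanically. The following is a minimal sketch, not part of HyperShift: it assumes the pod-to-node placement and each node's osd-fleet-manager.openshift.io/paired-nodes label value have already been collected by some other means. The PlacedPod type, the find_shared_pairs helper, and the label value "pair-a" are hypothetical names chosen for illustration; the pod names are the two Running pods from the example state.

```python
from collections import defaultdict
from typing import NamedTuple

# Node label named in this report; its values below are made up.
PAIR_LABEL = "osd-fleet-manager.openshift.io/paired-nodes"


class PlacedPod(NamedTuple):
    """A scheduled placeholder pod (hypothetical record type)."""
    name: str
    deployment: str
    pair_label: str  # value of PAIR_LABEL on the node running the pod


def find_shared_pairs(pods):
    """Map each pair-label value to the deployments sharing it.

    Only pair-label values that host pods from more than one
    deployment are returned; those are the conflicting pairs.
    """
    by_pair = defaultdict(set)
    for pod in pods:
        by_pair[pod.pair_label].add(pod.deployment)
    return {
        pair: sorted(deployments)
        for pair, deployments in by_pair.items()
        if len(deployments) > 1
    }


# The two Running pods from the example state, with an assumed label value.
running = [
    PlacedPod("placeholder-small-0-7b696687d8-qb2bz", "placeholder-small-0", "pair-a"),
    PlacedPod("placeholder-small-1-575fc487df-9bt7n", "placeholder-small-1", "pair-a"),
]

conflicts = find_shared_pairs(running)
for pair, deployments in conflicts.items():
    print(f"{PAIR_LABEL}={pair} is shared by: {', '.join(deployments)}")
```

An empty result means every node pair is held by a single deployment, which is the expected state.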

People

Assignee: Cesar Wong

Reporter: Cesar Wong

QA Contact: Jie Zhao

Votes: 0

Watchers: 2

Dates

Created: 2024/04/17 7:32 PM

Updated: 2024/04/30 3:49 PM

Resolved: 2024/04/30 3:49 PM