Type: Bug
Resolution: Done-Errata
Priority: Critical
Fix Version/s: 4.16.0
Affects Version/s: 4.14.z, 4.15.0
Component/s: HyperShift
Labels:
- self-managed
- triaged

Regression:
No
Sprint:
Hypershift Sprint 246, Hypershift Sprint 247
sprint_count:
2
Blocked:
False
Blocked Reason:

None
Release Note Type:
Release Note Not Required
Release Note Status:
In Progress
Target Version:

4.16.0

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:


In the self-managed HCP use case, if the on-premise baremetal management cluster does not have nodes labeled with the "topology.kubernetes.io/zone" key, then all HCP pods for a highly available (HA) cluster are scheduled to a single mgmt cluster node.

This is a result of the way the affinity rules are constructed.

Take the pod affinity/antiAffinity example below, which is generated for an HA HCP cluster. If the "topology.kubernetes.io/zone" label does not exist on the mgmt cluster nodes, the pod still gets scheduled, but the antiAffinity rule is effectively ignored. That seems odd given the use of "requiredDuringSchedulingIgnoredDuringExecution", but I have tested this, and the rule truly is ignored if the topologyKey label is not present on the nodes.

        podAffinity: 
          preferredDuringSchedulingIgnoredDuringExecution: 
          - podAffinityTerm: 
              labelSelector: 
                matchLabels: 
                  hypershift.openshift.io/hosted-control-plane: clusters-vossel1
              topologyKey: kubernetes.io/hostname
            weight: 100
        podAntiAffinity: 
          requiredDuringSchedulingIgnoredDuringExecution: 
          - labelSelector: 
              matchLabels: 
                app: kube-apiserver
                hypershift.openshift.io/control-plane-component: kube-apiserver
            topologyKey: topology.kubernetes.io/zone

If no "zones" are configured for the baremetal mgmt cluster, the only remaining pod affinity rule is the one that colocates the pods. As a result, an HA HCP has all of its etcd, kube-apiserver, and other control plane pods scheduled to a single node.

Version-Release number of selected component (if applicable):

4.14

How reproducible:

100%

Steps to Reproduce:

1. Create a self-managed HA HCP cluster on a mgmt cluster whose nodes lack the "topology.kubernetes.io/zone" label.

Actual results:

All HCP pods are scheduled to a single node.

Expected results:

HCP pods should always be spread across multiple nodes.

Additional info:


One way to address this is to add another anti-affinity rule that prevents each component from being scheduled on the same node as its other replicas.
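
As an illustration of the suggested rule, the fragment below is a minimal sketch. It assumes the same kube-apiserver labels shown in the description and is not taken from the actual fix. The second term, keyed on "kubernetes.io/hostname", is the hypothetical addition. Since that label is normally present on every node, the rule would still apply when no zones are configured.

```yaml
# Hypothetical sketch only; not copied from the actual fix.
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  # Existing rule: spread replicas across zones.
  # Has no effect when nodes lack the zone label.
  - labelSelector:
      matchLabels:
        app: kube-apiserver
        hypershift.openshift.io/control-plane-component: kube-apiserver
    topologyKey: topology.kubernetes.io/zone
  # Assumed additional rule: never place two replicas of the
  # same component on the same node.
  - labelSelector:
      matchLabels:
        app: kube-apiserver
        hypershift.openshift.io/control-plane-component: kube-apiserver
    topologyKey: kubernetes.io/hostname
```

With a required rule like this, the mgmt cluster needs at least as many schedulable nodes as the component has replicas. Otherwise the extra replicas would stay Pending.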

blocks

OCPBUGS-28764 Self-managed HCP pods are scheduled on single mgmt cluster node when no zones are in use

Closed

is cloned by

OCPBUGS-28764 Self-managed HCP pods are scheduled on single mgmt cluster node when no zones are in use

Closed

ACM-11454 Self-managed HCP pods are scheduled on single mgmt cluster node when no zones are in use

Closed

links to

openshift/hypershift#3286: OCPBUGS-22899: node spread anti-affinity for HA HCP

RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update

Assignee: Seth Jennings

Reporter: David Vossel

QA Contact: Liangquan Li

Votes: 0

Watchers: 13

Created: 2023/11/02 9:21 PM

Updated: 2024/06/27 11:35 AM

Resolved: 2024/06/27 11:35 AM
