Bug · Resolution: Not a Bug · Normal · 4.17, 4.18 · Workloads Sprint 269
Description of problem:
Pods configured with preferredDuringSchedulingIgnoredDuringExecution node affinity are not evicted when the cluster state changes and a node matching their preferred node affinity becomes available again. Even with the OpenShift KubeDescheduler configured with the AffinityAndTaints profile, the pods stay on their current node instead of moving back to the preferred node once it becomes healthy. Inspecting the cluster ConfigMap in the openshift-kube-descheduler-operator namespace shows that the generated DeschedulerPolicy lists only requiredDuringSchedulingIgnoredDuringExecution under nodeAffinityType; preferredDuringSchedulingIgnoredDuringExecution is missing.
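For reference, the upstream descheduler's RemovePodsViolatingNodeAffinity plugin takes a nodeAffinityType list in its args, so a policy that also covered the preferred case would look roughly like the sketch below (field names follow the upstream descheduler/v1alpha2 policy format; the profile name is illustrative, and whether the OpenShift operator can be made to generate such a policy is exactly what this report is about):

```yaml
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
- name: affinity-profile          # illustrative name
  pluginConfig:
  - name: "RemovePodsViolatingNodeAffinity"
    args:
      nodeAffinityType:
      - requiredDuringSchedulingIgnoredDuringExecution
      - preferredDuringSchedulingIgnoredDuringExecution   # the missing entry
  plugins:
    deschedule:
      enabled:
      - "RemovePodsViolatingNodeAffinity"
```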
Version-Release number of selected component (if applicable):
4.17
How reproducible:
100%
Steps to Reproduce:
1. Deploy a pod with the following nodeAffinity:

   spec:
     affinity:
       nodeAffinity:
         preferredDuringSchedulingIgnoredDuringExecution:
         - preference:
             matchExpressions:
             - key: kubernetes.io/hostname
               operator: In
               values:
               - worker-4
           weight: 100

2. Make the preferred node (worker-4) unavailable, forcing the pod to be scheduled on an alternate node.
3. Restore worker-4 to a healthy state.
4. Observe that the pod is not evicted and rescheduled back to worker-4.
Actual results:
Pods remain on the alternate node indefinitely, as the descheduler does not consider preferredDuringSchedulingIgnoredDuringExecution node affinity for eviction.
Expected results:
The descheduler should consider preferredDuringSchedulingIgnoredDuringExecution node affinity and evict the pod when its preferred node becomes available.
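To make the expected behavior concrete, here is a minimal, self-contained sketch (plain Python, not actual descheduler code; all names are illustrative) of the weight comparison such an eviction check would perform: evict a pod when some healthy node satisfies its preferred-affinity terms with a strictly higher total weight than the node it currently runs on.

```python
# Simplified illustration of a "preferred" node-affinity eviction check.
# This is NOT the descheduler's implementation; it only shows the idea.

def preferred_affinity_weight(node_labels, preferences):
    """Sum the weights of all preference terms this node's labels satisfy.
    Only the "In" operator is handled, matching the example in this report."""
    total = 0
    for pref in preferences:
        if all(node_labels.get(m["key"]) in m["values"]
               for m in pref["matchExpressions"] if m["operator"] == "In"):
            total += pref["weight"]
    return total

def should_evict(pod_node, nodes, preferences):
    """True if some available node fits the pod's preferred affinity
    strictly better than the node it is currently on."""
    current = preferred_affinity_weight(nodes[pod_node], preferences)
    best = max(preferred_affinity_weight(labels, preferences)
               for labels in nodes.values())
    return best > current

nodes = {
    "worker-3": {"kubernetes.io/hostname": "worker-3"},
    "worker-4": {"kubernetes.io/hostname": "worker-4"},
}
prefs = [{"weight": 100,
          "matchExpressions": [{"key": "kubernetes.io/hostname",
                                "operator": "In",
                                "values": ["worker-4"]}]}]

# Pod landed on worker-3 while worker-4 was down; once worker-4 is healthy
# again, its preferred weight (100) beats the current node's (0).
print(should_evict("worker-3", nodes, prefs))  # True
```

A pod already running on worker-4 would score 100 on its own node, so the same check correctly declines to evict it.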
Additional info:
KubeDescheduler configuration:

apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  name: cluster
  namespace: openshift-kube-descheduler-operator
spec:
  profiles:
  - AffinityAndTaints
  - EvictPodsWithLocalStorage
  - EvictPodsWithPVC

oc get cm cluster -n openshift-kube-descheduler-operator shows:

nodeAffinityType:
- requiredDuringSchedulingIgnoredDuringExecution

The descheduler policy generated by OpenShift includes only requiredDuringSchedulingIgnoredDuringExecution and does not handle preferredDuringSchedulingIgnoredDuringExecution, so pods remain on their current nodes even when their preferred nodes become available again.

Proposed fix: add support for preferredDuringSchedulingIgnoredDuringExecution under nodeAffinityType, and ensure the AffinityAndTaints profile in OpenShift's descheduler includes this behavior.