OpenShift Bugs / OCPBUGS-19452

DaemonSet fails to scale down during the rolling update when maxUnavailable=0


Details

    • Bug
    • Resolution: Done-Errata
    • Critical
    • 4.15.0
    • 4.13
    • None
    • Important
    • No
    • Approved
    • False
      ---------- edited for release notes ----------
      * Previously, when the `maxSurge` field was set for a daemon set and a toleration was updated, pods failed to scale down, which could result in a failed rollout due to a different set of nodes being used for scheduling. With this release, nodes are properly excluded if scheduling constraints are not met, and rollouts can complete successfully. (link:https://issues.redhat.com/browse/OCPBUGS-19452[*OCPBUGS-19452*])
      ---------- original text ----------
      Cause: A DaemonSet with maxSurge fails to scale down pods when a toleration is updated, resulting in a different set of nodes being used for scheduling.
      Consequence: Incorrect scheduling of DaemonSet pods, which can result in a failed rollout.
      Fix: Revised the logic for the DaemonSet rolling update to exclude nodes if scheduling constraints are not met.
      Result: This eliminates the problem of rolling updates to a DaemonSet getting stuck when tolerations change.
    • Bug Fix
    • Done

    Description

      Description of problem:

      The OpenShift DNS daemonset uses the rolling update strategy. The "maxSurge" parameter is set to a non-zero value, which means that the "maxUnavailable" parameter is set to zero. When the user replaces the toleration in the daemonset's template spec (via the OpenShift DNS config API), swapping the toleration that allows scheduling on master nodes for any other toleration, the new pods still try to be scheduled on the master nodes. The old pods on the nodes that are still tolerated may happen to be recreated, but only if they are processed before any pod on a node that is no longer tolerated.
      
      The new pods are not expected to be scheduled on nodes that are not tolerated by the new daemonset's template spec. The daemonset controller should simply delete the old pods from the nodes that can no longer be tolerated. The old pods on the nodes that are still tolerated should be recreated according to the rolling update parameters.
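      One way to replace the toleration through the OpenShift DNS config API is a merge patch against the cluster DNS operator resource. The command below is only a sketch, assuming the "dns.operator/default" resource and its "spec.nodePlacement.tolerations" field; the "test-taint" key is just a placeholder:

      $ oc patch dns.operator/default --type=merge \
          -p '{"spec":{"nodePlacement":{"tolerations":[{"key":"test-taint","operator":"Exists"}]}}}'

      The DNS operator then propagates the new toleration into the dns-default daemonset's pod template, which triggers the rolling update described above.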
      

      Version-Release number of selected component (if applicable):

       

      How reproducible:

      Always
      

      Steps to Reproduce:
      1. Create a daemonset which tolerates the "node-role.kubernetes.io/master" taint and has the following rolling update parameters (a minimal standalone reproducer is sketched after step 3):

      $ oc -n openshift-dns get ds dns-default -o yaml | yq .spec.updateStrategy
      rollingUpdate:
        maxSurge: 10%
        maxUnavailable: 0
      type: RollingUpdate
      
      $ oc  -n openshift-dns get ds dns-default -o yaml | yq .spec.template.spec.tolerations
      - key: node-role.kubernetes.io/master
        operator: Exists
      

      2. Let the daemonset be scheduled on all the target nodes (e.g. all masters and all workers):

      $ oc -n openshift-dns get pods  -o wide | grep dns-default
      dns-default-6bfmf     2/2     Running   0          119m    10.129.0.40   ci-ln-sb5ply2-72292-qlhc8-master-2         <none>           <none>
      dns-default-9cjdf     2/2     Running   0          2m35s   10.129.2.15   ci-ln-sb5ply2-72292-qlhc8-worker-c-m5wzq   <none>           <none>
      dns-default-c6j9x     2/2     Running   0          119m    10.128.0.13   ci-ln-sb5ply2-72292-qlhc8-master-0         <none>           <none>
      dns-default-fhqrs     2/2     Running   0          2m12s   10.131.0.29   ci-ln-sb5ply2-72292-qlhc8-worker-a-6q7hs   <none>           <none>
      dns-default-lx2nf     2/2     Running   0          119m    10.130.0.15   ci-ln-sb5ply2-72292-qlhc8-master-1         <none>           <none>
      dns-default-mmc78     2/2     Running   0          112m    10.128.2.7    ci-ln-sb5ply2-72292-qlhc8-worker-b-bpjdk   <none>           <none>
      

      3. Update the daemonset's tolerations by removing "node-role.kubernetes.io/master" and adding any other toleration (a toleration for a nonexistent taint works too); see the sketch after the output below:

      $ oc -n openshift-dns get ds dns-default -o yaml | yq .spec.template.spec.tolerations
      - key: test-taint
        operator: Exists
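
      The same behavior can be reproduced without the DNS operator by using a minimal standalone daemonset for step 1 and patching its tolerations directly for step 3. The manifest and patch below are only a sketch; the "test-ds" name, the "default" namespace, and the pause image are placeholders that do not come from this report:

      $ cat test-ds.yaml
      apiVersion: apps/v1
      kind: DaemonSet
      metadata:
        name: test-ds
        namespace: default
      spec:
        selector:
          matchLabels:
            app: test-ds
        updateStrategy:
          type: RollingUpdate
          rollingUpdate:
            maxSurge: 10%
            maxUnavailable: 0
        template:
          metadata:
            labels:
              app: test-ds
          spec:
            tolerations:
            - key: node-role.kubernetes.io/master
              operator: Exists
            containers:
            - name: pause
              image: registry.k8s.io/pause:3.9

      $ oc apply -f test-ds.yaml

      # step 3 equivalent: replace the master toleration with one for a taint that does not exist
      $ oc -n default patch ds test-ds --type=json \
          -p '[{"op":"replace","path":"/spec/template/spec/tolerations","value":[{"key":"test-taint","operator":"Exists"}]}]'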
      

      Actual results:

      $ oc -n openshift-dns get pods  -o wide | grep dns-default
      dns-default-6bfmf     2/2     Running   0          124m    10.129.0.40   ci-ln-sb5ply2-72292-qlhc8-master-2         <none>           <none>
      dns-default-76vjz     0/2     Pending   0          3m2s    <none>        <none>                                     <none>           <none>
      dns-default-9cjdf     2/2     Running   0          7m24s   10.129.2.15   ci-ln-sb5ply2-72292-qlhc8-worker-c-m5wzq   <none>           <none>
      dns-default-c6j9x     2/2     Running   0          124m    10.128.0.13   ci-ln-sb5ply2-72292-qlhc8-master-0         <none>           <none>
      dns-default-fhqrs     2/2     Running   0          7m1s    10.131.0.29   ci-ln-sb5ply2-72292-qlhc8-worker-a-6q7hs   <none>           <none>
      dns-default-lx2nf     2/2     Running   0          124m    10.130.0.15   ci-ln-sb5ply2-72292-qlhc8-master-1         <none>           <none>
      dns-default-mmc78     2/2     Running   0          117m    10.128.2.7    ci-ln-sb5ply2-72292-qlhc8-worker-b-bpjdk   <none>           <none>
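
      With "maxUnavailable" set to 0 and the new pod created for a master node stuck in Pending (the new template no longer tolerates the master taint), the rollout makes no progress. One way to observe this is sketched below; the timeout value is arbitrary:

      $ oc -n openshift-dns rollout status ds/dns-default --timeout=60s

      The command keeps waiting for the daemonset rollout to finish and eventually exits with a timeout error instead of completing.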
      

      Expected results:

      $ oc -n openshift-dns get pods  -o wide | grep dns-default
      dns-default-9cjdf     2/2     Running   0          7m24s   10.129.2.15   ci-ln-sb5ply2-72292-qlhc8-worker-c-m5wzq   <none>           <none>
      dns-default-fhqrs     2/2     Running   0          7m1s    10.131.0.29   ci-ln-sb5ply2-72292-qlhc8-worker-a-6q7hs   <none>           <none>
      dns-default-mmc78     2/2     Running   0          7m54s   10.128.2.7    ci-ln-sb5ply2-72292-qlhc8-worker-b-bpjdk   <none>           <none>
      

      Additional info:
      Upstream issue: https://github.com/kubernetes/kubernetes/issues/118823
      Slack discussion: https://redhat-internal.slack.com/archives/CKJR6200N/p1687455135950439

            People

              fkrepins@redhat.com Filip Krepinsky
              alebedev@redhat.com Andrey Lebedev
              ying zhou ying zhou