Bug
Resolution: Done
Major
4.19, 4.20
Quality / Stability / Reliability
False
Proposed
CNF Network Sprint 276, CNF Network Sprint 277
2
In Progress
Bug Fix
Previously, the time it took to apply updates to the multus-networkpolicy daemonset scaled linearly with node count. This daemonset has been updated to allow for 10% maxUnavailable so that it updates in near constant time in clusters larger than 10 nodes.
Description of problem:
When upgrading a large (121-node) cluster, the rollout of the networking cluster operator takes a significant amount of time. During the update, the cluster operator status reports "...56 out of 121 updated". The count increments one node at a time, roughly every 10 seconds, so rolling out to all nodes adds ~20 minutes to the upgrade.
Inspecting the daemonset with "oc get daemonset -n openshift-multus multus-networkpolicy -o yaml" shows that the rollingUpdate strategy has a maxUnavailable of 1, so pods are updated one node at a time.
The maxUnavailable should be set to 10% or 30%, similar to other OpenShift component daemonsets.
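The effect of the setting can be estimated with a rough back-of-the-envelope model. The sketch below is not taken from the operator code: the function names are hypothetical, the 10 s per step comes from the interval observed above, and it assumes the rollout proceeds in full batches of maxUnavailable pods, which real rollouts only approximate. It relies on the documented Kubernetes behavior that a percentage maxUnavailable for a DaemonSet is converted to an absolute number by rounding up.

```python
def max_unavailable_pods(node_count, max_unavailable):
    """Resolve maxUnavailable (int or percentage string) to a pod count.

    For DaemonSets, Kubernetes rounds a percentage up to an absolute number.
    """
    if isinstance(max_unavailable, str) and max_unavailable.endswith("%"):
        pct = int(max_unavailable[:-1])
        # Integer ceiling division avoids floating-point rounding surprises.
        return max(1, -(-node_count * pct // 100))
    return int(max_unavailable)


def estimated_rollout_seconds(node_count, max_unavailable, seconds_per_batch=10):
    """Estimate rollout time, assuming fixed-size batches (a simplification)."""
    batch = max_unavailable_pods(node_count, max_unavailable)
    batches = -(-node_count // batch)  # ceiling division
    return batches * seconds_per_batch


if __name__ == "__main__":
    for setting in (1, "10%", "30%"):
        secs = estimated_rollout_seconds(121, setting)
        print(f"maxUnavailable={setting}: ~{secs} s")
```

Under these assumptions, the 121-node cluster needs 121 steps (about 20 minutes) with maxUnavailable set to 1, but only around 10 steps with 10%, and the step count stays roughly constant as the cluster grows.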
Version-Release number of selected component (if applicable):
4.19.3, 4.20.0-ec.5
How reproducible:
100%
Steps to Reproduce:
1. Upgrade OpenShift.
2. Monitor the network cluster operator.
Actual results:
The multus-networkpolicy daemonset updates serially, one node at a time.
Expected results:
The daemonset rolls out with a higher level of concurrency.
Additional info:
- blocks: OCPBUGS-61370 multus-networkpolicy daemonset rolls out one node at a time (Verified)
- is cloned by: OCPBUGS-61370 multus-networkpolicy daemonset rolls out one node at a time (Verified)
-
- links to