-
Bug
-
Resolution: Unresolved
-
Normal
-
None
-
4.12
-
Quality / Stability / Reliability
-
False
-
-
3
-
Moderate
-
No
-
CLOUD Sprint 253, CLOUD Sprint 254, CLOUD Sprint 255, CLOUD Sprint 256, CLOUD Sprint 257, CLOUD Sprint 258, CLOUD Sprint 259, CLOUD Sprint 260, CLOUD Sprint 261, CLOUD Sprint 263, CLOUD Sprint 264, CLOUD Sprint 262, CLOUD Sprint 265, CLOUD Sprint 266
-
14
Description of problem:
- After upgrading the cluster to version 4.12, nodes are being scaled up automatically.
- The customer demonstrated this by scaling the machine count down from 50 to 45. Jobs run very frequently in one particular namespace, and these jobs cause the machine count to scale back up. The same can be seen in the events section.
- Although the machine scale-up happened, the pod that caused it was still scheduled on a different node and ran to completion there.
- There is no automatic scale-down policy.
Workaround:
We created a temporary priority class with its value set to -11 and had the application team update their job spec to use the new priority class. So far this appears to have stopped the scale-up events, as expected.
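The workaround can be sketched as follows. This is a minimal illustration, not the manifests actually applied on the customer cluster: the class name `no-scale-up-jobs` and the helper functions are hypothetical, and the cutoff of -10 is an assumption based on the cluster autoscaler's documented default, below which pending pods are treated as expendable and do not trigger a scale-up. The script only builds the objects as JSON so they can be reviewed before being applied.

```python
import json

# Assumption: the cluster autoscaler's priority cutoff is at its default of -10.
# Pods with a priority below the cutoff do not trigger a scale-up.
AUTOSCALER_PRIORITY_CUTOFF = -10


def build_priority_class(name: str, value: int) -> dict:
    """Return a PriorityClass manifest as a plain dict."""
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": name},
        "value": value,
        "globalDefault": False,
        "description": "Temporary class for jobs that must not trigger scale-up.",
    }


def build_job_fragment(priority_class_name: str) -> dict:
    """Return the part of a Job spec that selects the priority class."""
    return {
        "spec": {
            "template": {
                "spec": {"priorityClassName": priority_class_name}
            }
        }
    }


def triggers_scale_up(priority: int, cutoff: int = AUTOSCALER_PRIORITY_CUTOFF) -> bool:
    """Model of the cutoff rule: only pods at or above the cutoff count."""
    return priority >= cutoff


if __name__ == "__main__":
    # "no-scale-up-jobs" is a hypothetical name chosen for this example.
    pc = build_priority_class("no-scale-up-jobs", -11)
    print(json.dumps(pc, indent=2))
    print(json.dumps(build_job_fragment(pc["metadata"]["name"]), indent=2))
    print("triggers scale-up:", triggers_scale_up(pc["value"]))
```

The value -11 sits just below the assumed cutoff, which would explain why the jobs stopped triggering scale-ups. If the cutoff has been changed on the cluster, the priority class value would need to be adjusted accordingly.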