1. Proposed title of this feature request
Multiple machineconfigs (matchExpression) state: Ready
2. What is the nature and description of the request?
The customer noticed that during node scale-up, the new node's state transitions in the following sequence:
NotReady >> Ready >> Ready,SchedulingDisabled >> NotReady >> Ready
The customer wants this changed because they believe the first Ready transition should not happen. When the cluster is under high pressure, the scheduler places pods on the new node during that first Ready window, even though the second rendered config has not yet been applied.
The expected sequence is: NotReady >> Ready,SchedulingDisabled >> NotReady >> Ready
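The difference between the observed and expected sequences can be checked with a small script. This is only a sketch: the helper name `premature_ready_indices` is hypothetical, the state strings are typed in by hand from the sequences quoted in this ticket, and nothing here reflects how the MCO or the scheduler actually evaluates node state.

```python
from typing import List

# State labels exactly as quoted in this ticket (entered by hand, not read from a cluster).
READY = "Ready"
CORDONED = "Ready,SchedulingDisabled"


def premature_ready_indices(states: List[str]) -> List[int]:
    """Return positions where the node was plain Ready before a later cordon.

    Hypothetical helper for illustration: a plain Ready that precedes
    "Ready,SchedulingDisabled" is a window in which pods could be scheduled
    before the remaining config is applied.
    """
    cordons = [i for i, s in enumerate(states) if s == CORDONED]
    if not cordons:
        return []
    last_cordon = cordons[-1]
    return [i for i, s in enumerate(states[:last_cordon]) if s == READY]


observed = ["NotReady", "Ready", "Ready,SchedulingDisabled", "NotReady", "Ready"]
expected = ["NotReady", "Ready,SchedulingDisabled", "NotReady", "Ready"]

print(premature_ready_indices(observed))  # the early Ready at index 1
print(premature_ready_indices(expected))  # no early Ready
```

In the observed sequence the check flags the first Ready state; in the expected sequence it flags nothing, which is the behavior the customer is asking for.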
What is the business impact? Please also provide timeframe information.
Financial, because multiple pods restart multiple times.
When does the behavior occur? Frequency? Repeatedly? At certain times?
Each time the autoscaler runs.
Platform (AWS, VSphere, Metal, etc.):
Are you certain that the root cause of the issue being reported is the MCO (Machine Config Operator)?
(Y/N/Not sure):
Not sure
More information is attached in a private comment.
3. Why does the customer need this? (List the business requirements here)
Financial, because multiple pods restart multiple times.
4. List any affected packages or components.
MCP, Nodes
- is related to: MCO-205 [Spike] Preventing custom pool race conditions (To Do)