-
Epic
-
Resolution: Unresolved
-
Normal
-
None
-
rhos-18.0.14 FR 4
-
None
-
[Scalability] Enable running multiple instances of applier in a watcher deployment
-
To Do
-
rhos-workloads-evolution
-
0% To Do, 100% In Progress, 0% Done
-
Goal:
Current behavior when running multiple applier services concurrently:
- ActionPlans are NOT assigned to a specific applier when created.
- ActionPlans are picked up by an applier when started ("self-assigned"), not when created.
- If an applier dies while executing an action plan, the action plan remains in ONGOING and is automatically cancelled only when the same applier is started again.
In a scale-down situation the removed applier is never started again, so the ActionPlan stays in ONGOING, which is unacceptable behavior. In the long term, watcher may move to event-based, centralized action plan processing with per-action dispatching. In the short term we will implement:
- A service monitor in the appliers that, when an applier is detected as failed:
  - reschedules an AP if it is in PENDING
  - cancels an AP if it is in ONGOING (status_message)
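The intended monitor behavior can be sketched as follows. This is a minimal, self-contained illustration, not watcher code: the `ActionPlan` dataclass, the `State` enum, the `hostname` field, and the `handle_failed_applier` function are all hypothetical names, and "reschedule" is assumed here to mean handing the plan to another live applier (or leaving it unassigned if none is alive).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class State(Enum):
    PENDING = "PENDING"
    ONGOING = "ONGOING"
    CANCELLED = "CANCELLED"


@dataclass
class ActionPlan:
    # Hypothetical model; field names are assumptions for this sketch.
    uuid: str
    state: State
    hostname: Optional[str] = None        # applier that picked the plan
    status_message: Optional[str] = None  # human-readable reason


def handle_failed_applier(
    failed_host: str,
    alive_hosts: List[str],
    plans: List[ActionPlan],
) -> List[ActionPlan]:
    """Apply the short-term policy to plans owned by a failed applier.

    PENDING plans are rescheduled; ONGOING plans are cancelled with a
    status_message. Returns the plans that were modified.
    """
    changed = []
    for plan in plans:
        if plan.hostname != failed_host:
            continue  # plan belongs to another applier, leave it alone
        if plan.state is State.PENDING:
            # Assumption: rescheduling means reassigning to a live applier,
            # or clearing the owner when no applier is alive.
            plan.hostname = alive_hosts[0] if alive_hosts else None
            changed.append(plan)
        elif plan.state is State.ONGOING:
            plan.state = State.CANCELLED
            plan.status_message = (
                f"Applier {failed_host} failed while executing the action plan"
            )
            changed.append(plan)
    return changed
```

A real implementation would also need to decide how an applier is detected as failed and how to avoid two monitors handling the same plan at once; both are outside this sketch.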
Acceptance Criteria:
In a scale-down situation, APs assigned to the removed applier should behave as defined in the previous section. This should be covered by unit tests at least.
Open questions:
- Should we implement the new service_monitor for appliers as part of a single monitor that manages both decision-engines and appliers, or create a new one only for the appliers?