-
Epic
-
Resolution: Duplicate
-
Normal
-
None
-
None
-
None
-
operator updates: consider updating operators in sequence to minimize downtime
-
False
-
-
False
-
Not Selected
-
Proposed
-
Proposed
-
Done
-
RHOSSTRAT-800 - Provide service Operator reconciliation control
-
Proposed
-
rhos-conplat-core-operators
-
Proposed
-
-
-
Moderate
There is concern that OSP services get restarted when operators are updated. Now that we have an initialization resource, we could provide more control over how operator updates proceed. The initialization resource currently updates all operators at once, but we could implement a mechanism to sequence operator updates. This would minimize upgrade impact by ensuring that only one OSP service is reconciled at a time.
We already know, for example, that https://github.com/rabbitmq/cluster-operator/releases/tag/v2.11.0 will cause the RabbitMQ cluster to restart immediately (we aren’t using 2.11 yet but soon will be). Furthermore, many lib-common changes to our core labels/annotations or primitive k8s structures would likewise cause OSP services to restart immediately upon operator updates. While these updates are normal and expected, it is the fact that we are doing them all simultaneously that is the concern here.
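A minimal sketch of the sequencing idea, for discussion only. It is written in Python rather than as operator code, and every name in it (`update_in_sequence`, `apply_update`, `is_ready`, `max_polls`, `poll_interval`) is hypothetical and not part of any existing operator API. The caller supplies the two callbacks that would actually trigger an operator update and check that the affected service has finished reconciling.

```python
import time
from typing import Callable, Iterable, List


def update_in_sequence(
    operators: Iterable[str],
    apply_update: Callable[[str], None],
    is_ready: Callable[[str], bool],
    max_polls: int = 10,
    poll_interval: float = 0.0,
) -> List[str]:
    """Update operators one at a time (hypothetical helper).

    The next operator is only updated after the previous one reports
    ready, so at most one service is reconciling at any moment.
    """
    completed: List[str] = []
    for name in operators:
        apply_update(name)  # assumed to trigger the update for one operator
        for _ in range(max_polls):
            if is_ready(name):  # assumed to report that reconciliation finished
                break
            time.sleep(poll_interval)
        else:
            # Stop the rollout instead of moving on to the next operator.
            raise TimeoutError(
                f"{name} did not become ready; completed so far: {completed}"
            )
        completed.append(name)
    return completed
```

Stopping at the first operator that fails to settle keeps the remaining services on their current version, which limits the impact to a single service. Whether a real implementation should halt, retry, or continue is an open design question.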
- account is impacted by
-
OSPRH-10790 Cut in service availability during update and unable to create vm after update
-
- Closed
-
- is duplicated by
-
OSPRH-17396 As a Cluster Administrator, I want to use a single field to pause and resume reconciliation for the entire OpenStack environment and have the Operator's status reflect this state, so that I can control when updates are applied and have clear visibility.
-
- Refinement
-