Type: Epic
Resolution: Duplicate
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: openstack-operator
Labels:
None

Epic Name:
operator updates: consider updating operators in sequence to minimize downtime
Blocked:
False
Blocked Reason:
None
Ready:
False
Color Status:
Not Selected
Dev Approval:
Proposed
Docs Approval:
Proposed
Epic Status:
Done
Feature Link:
RHOSSTRAT-800 - Provide service Operator reconciliation control
PM Approval:
Proposed
AssignedTeam:
rhos-conplat-core-operators
QE Approval:
Proposed
Intelligence Requested:
Market:

Severity:
Moderate

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

There is concern that OSP services get restarted when operators are updated. Now that we have an initialization resource, we could provide more control over how operator updates proceed. The initialization resource currently updates all operators at once, but we could implement a mechanism to sequence operator updates so that only one OSP service is reconciled at a time, minimizing the impact of an upgrade.

We already know, for example, that https://github.com/rabbitmq/cluster-operator/releases/tag/v2.11.0 will cause the RabbitMQ cluster to restart immediately (we aren't using 2.11 yet, but soon will be). Furthermore, lib-common changes to our core labels/annotations or primitive k8s structures would likewise cause OSP services to restart immediately upon operator updates. While these updates are normal and expected, it is the fact that we are doing them all simultaneously that is the concern here.
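
To make the sequencing idea above concrete, here is a minimal Go sketch of updating operators one at a time and waiting for each to settle before moving on. It is only an illustration: `Operator`, `Updater`, `UpdateInSequence`, and `fakeUpdater` are hypothetical names invented for this example and are not types or functions from openstack-operator or lib-common. The ordering of operators is assumed to be supplied by the caller.

```go
package main

import (
	"errors"
	"fmt"
)

// Operator is a hypothetical stand-in for one service operator managed by
// the initialization resource. It is not a type from openstack-operator.
type Operator struct {
	Name string
}

// Updater abstracts the two steps this sketch needs. Both methods are
// assumptions made for illustration only.
type Updater interface {
	// Update rolls the operator to its new version.
	Update(op Operator) error
	// WaitReady blocks until the services owned by the operator have
	// finished reconciling, or returns an error.
	WaitReady(op Operator) error
}

// UpdateInSequence updates operators one at a time, in the given order.
// It stops at the first failure so that later services are left untouched,
// and returns the names of the operators that were fully updated.
func UpdateInSequence(ops []Operator, u Updater) ([]string, error) {
	done := make([]string, 0, len(ops))
	for _, op := range ops {
		if err := u.Update(op); err != nil {
			return done, fmt.Errorf("update %s: %w", op.Name, err)
		}
		if err := u.WaitReady(op); err != nil {
			return done, fmt.Errorf("wait for %s: %w", op.Name, err)
		}
		done = append(done, op.Name)
	}
	return done, nil
}

// fakeUpdater is a test double: it records calls and can be told to fail
// the readiness check for one named operator.
type fakeUpdater struct {
	log    []string
	failOn string
}

func (f *fakeUpdater) Update(op Operator) error {
	f.log = append(f.log, "update:"+op.Name)
	return nil
}

func (f *fakeUpdater) WaitReady(op Operator) error {
	if op.Name == f.failOn {
		return errors.New("not ready")
	}
	f.log = append(f.log, "ready:"+op.Name)
	return nil
}

func main() {
	ops := []Operator{{Name: "rabbitmq"}, {Name: "keystone"}, {Name: "nova"}}
	u := &fakeUpdater{}
	done, err := UpdateInSequence(ops, u)
	fmt.Println(done, err)
	fmt.Println(u.log)
}
```

Stopping at the first failed readiness check is one possible design choice: it keeps the remaining services on their old operator version, so at most one OSP service is mid-reconcile at any time.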

account is impacted by

OSPRH-10790 Cut in service availability during update and unable to create vm after update

Closed

is duplicated by

OSPRH-17396 As a Cluster Administrator, I want to use a single field to pause and resume reconciliation for the entire OpenStack environment and have the Operator's status reflect this state, so that I can control when updates are applied and have clear visibility.

Refinement

Assignee: Dan Prince
Reporter: Dan Prince
Team: rhos-conplat-core-operators
Votes: 0
Watchers: 2
Created: 2025/04/24 6:19 PM
Updated: 2025/10/01 2:32 PM
Resolved: 2025/06/16 6:46 PM
