Type: Epic
Resolution: Done
Priority: Critical
Labels: None
Epic Goal
- Provide lifecycling functionality to the ManagedClusterAddOn API
Why is this important?
- Add-ons should be lifecycled independently of the hub
- Add-ons should support independent versioning, both between hub and agent and between agents deployed on separate managed clusters
- Today, whenever ACM (the MultiClusterHub operator) is upgraded, not only is the hub operator upgraded, but a fleet-wide upgrade of all Klusterlets and add-ons is initiated and performed simultaneously. Customers, especially in services and edge scenarios, need finer-grained control over the upgrade process to minimize disruption and prevent problems from being introduced into their production environments that would impact their end users.
Guiding Use Case Exploration Document
Scenarios
- As a user, I can configure a desired version of my add-on to install for a given managed cluster (see the sketch after this scenario)
- If an invalid version is provided, either a webhook should block the configuration or a status should be surfaced indicating that an invalid version was provided
- Discussion:
- On a fresh add-on install in a managed cluster, should a user only be permitted to install the version that matches the hub add-on controller?
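A minimal sketch of what this scenario could look like on the API, assuming a version field is added to the spec. The apiVersion, kind, and installNamespace fields exist in the ManagedClusterAddOn API today; desiredVersion is hypothetical, one possible shape for the proposal:

```yaml
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: application-manager
  namespace: cluster1            # the managed cluster's namespace on the hub
spec:
  installNamespace: open-cluster-management-agent-addon
  # Hypothetical field: pins the add-on agent on this cluster to a specific
  # version. If omitted, the agent could default to the hub add-on
  # controller's version (see the discussion point above).
  desiredVersion: "2.6.0"
```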
- As a user, I can configure a desired version of my add-on to upgrade to for a given managed cluster (see the status sketch after this scenario)
- If an invalid version is provided, either a webhook should block the configuration or a status should be surfaced indicating that an invalid version was provided
- The CR status needs to include:
- Current version
- Desired version
- Upgrade status: In progress, failed, succeeded.
- Time upgrade was initiated
- Time upgrade was completed
- Discussion:
- Should the only available upgrade target be the hub add-on controller version? i.e., the user can only ever upgrade to the same version as the hub.
- Should upgrades be limited by z-stream version, y-stream version?
- Will there be cases where a customer's team is managing clusters for multiple tenants, and some tenants may not be ready to upgrade their agent? e.g., today in multi-tenant clusters, operator versions cause tension across tenants when upgrades need to be coordinated.
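One possible shape for the status fields listed above, sketched with hypothetical names (currentVersion, desiredVersion, the timestamp fields, and an Upgrading condition); none of these exist in the API yet:

```yaml
status:
  currentVersion: "2.5.0"                      # hypothetical: version currently running
  desiredVersion: "2.6.0"                      # hypothetical: version being applied
  upgradeStartedTime: "2023-01-17T10:00:00Z"   # hypothetical: time upgrade was initiated
  upgradeCompletedTime: null                   # hypothetical: set once the upgrade finishes
  conditions:
    - type: Upgrading                          # hypothetical condition type
      status: "True"                           # "True" while in progress; terminal reasons
      reason: UpgradeInProgress                # like UpgradeSucceeded/UpgradeFailed follow
      message: Upgrading application-manager from 2.5.0 to 2.6.0
      lastTransitionTime: "2023-01-17T10:00:00Z"
```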
- As a user, I can roll back my add-on to a previous version (see the sketch after this scenario)
- If an invalid version is provided, either a webhook should block the configuration or a status should be surfaced indicating that an invalid version was provided
- The CR status needs to include:
- Current version
- Desired version
- Roll-back status: In progress, failed, succeeded.
- Time roll-back was initiated
- Time roll-back was completed
- Discussion:
- Should the user only be permitted to roll back in the event of an add-on upgrade failure?
- Should the user only be permitted to roll back to the original version that they were attempting to upgrade from?
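Under the same assumptions, a roll-back could be expressed as setting the hypothetical desiredVersion back to the version the upgrade started from, with status reporting progress through a condition; an abbreviated CR fragment:

```yaml
spec:
  # Hypothetical: setting desiredVersion back to the pre-upgrade version
  # requests a roll-back.
  desiredVersion: "2.5.0"
status:
  currentVersion: "2.6.0"                      # hypothetical: version currently running
  desiredVersion: "2.5.0"
  conditions:
    - type: RollingBack                        # hypothetical condition type
      status: "True"
      reason: RollbackInProgress
      message: Rolling back application-manager from 2.6.0 to 2.5.0
```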
Acceptance Criteria
- CI - MUST be running successfully with tests automated
- Release Technical Enablement - Provide necessary release enablement details and documents.
- All scenarios complete
- Discussion points are settled.
- Open Cluster Management community proposal is submitted and approved.
Dependencies (internal and external)
- ...
Previous Work (Optional):
- …
Open questions:
- Questions are listed in the scenarios above
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Downstream documentation merged: <link to meaningful PR>
Blocks
- ACM-2478 Initial rolling upgrade prototype for Klusterlet and Add-ons (Closed)