Feature
Resolution: Unresolved
Feature Overview (aka. Goal Summary)
This feature is intended to reduce the time and effort for cluster administrators who manage many clusters, enabling them to pre-approve certain risks that they deem acceptable across some or all of their environments.
A cluster admin can express accepted risks for a cluster so that, when a conditional update of the cluster is requested, the update is accepted only if all of the risks associated with it are accepted.
Goals (aka. expected user outcomes)
The feature involves changes to the ClusterVersion API, where accepted risks can be listed in the spec.desiredUpdate field. It also includes updates to command-line tools like the oc CLI, introducing commands such as oc adm upgrade accept-risks or flags like --accepted for oc adm upgrade recommend (to be decided later).
The goal is to reduce the time a cluster admin spends evaluating the risks of conditional updates when there are many clusters to manage.
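As a rough illustration of what listing accepted risks under spec.desiredUpdate could look like, the sketch below builds a JSON merge patch body in Python. The field name `acceptedRisks` and the version string `4.y.z` are placeholders chosen for this example; the actual API shape is not defined in this card.

```python
import json


def build_accept_risks_patch(version, accepted_risks):
    """Build a JSON merge patch body for a ClusterVersion object.

    'acceptedRisks' is a hypothetical field name used only for illustration;
    the card says only that accepted risks live under spec.desiredUpdate.
    """
    return {
        "spec": {
            "desiredUpdate": {
                "version": version,
                # De-duplicate and sort so the patch is stable across runs.
                "acceptedRisks": sorted(set(accepted_risks)),
            }
        }
    }


patch = build_accept_risks_patch("4.y.z", ["RiskB", "RiskA", "RiskA"])
print(json.dumps(patch, indent=2))
```

Sorting and de-duplicating the list is a design choice of this sketch, made so that repeated submissions of the same accepted risks produce an identical patch.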
Consider the following scenario:
- A cluster admin retrieves the conditional updates for a cluster and evaluates the risks attached to them.
- After evaluation, the admin decides that some risks can be accepted on all or some of the managed clusters.
- The admin configures the relevant clusters to allow updates, as long as all identified risks have been accepted.
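The scenario above can be sketched as a small fleet model in which a central decision about acceptable risks is recorded on all clusters or on a chosen subset. The `Cluster` class, the `accept_risks` helper, and the cluster names are hypothetical and exist only for this illustration; they do not correspond to any existing tool or API.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for a managed cluster and its accepted risks."""
    name: str
    accepted_risks: set = field(default_factory=set)


def accept_risks(fleet, risks, cluster_names=None):
    """Record risks as accepted on the named clusters, or on all if None."""
    for cluster in fleet:
        if cluster_names is None or cluster.name in cluster_names:
            cluster.accepted_risks |= set(risks)


fleet = [Cluster("prod-a"), Cluster("prod-b"), Cluster("dev-a")]

# RiskA is judged acceptable everywhere.
accept_risks(fleet, {"RiskA"})
# RiskB is judged acceptable only on the development cluster.
accept_risks(fleet, {"RiskB"}, cluster_names={"dev-a"})

for cluster in fleet:
    print(cluster.name, sorted(cluster.accepted_risks))
```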
Requirements (aka. Acceptance Criteria):
A cluster admin must be able to express accepted risks so that a conditional update to a cluster is accepted only if all of its associated risks are accepted.
For example, if a cluster admin accepts RiskA and RiskB and requests an update to a new version:
- The update is accepted if it is recommended.
- The update is accepted if the cluster considers itself exposed only to RiskA.
- The update is accepted if the cluster considers itself exposed only to RiskA and RiskB.
- The update is rejected if the cluster considers itself exposed only to RiskC.
- The update is rejected if the cluster considers itself exposed only to RiskA and RiskC.
Note that other criteria might also cause an update to be rejected, but they are unrelated to update risks and are therefore out of scope for this card.
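The acceptance rule in the example above amounts to a subset check: the update is accepted when every risk the cluster believes it is exposed to is in the accepted set. The following is a minimal sketch of that rule only; the function name `update_accepted` is hypothetical, and the sketch ignores the unrelated rejection criteria mentioned in the note.

```python
def update_accepted(exposed_risks, accepted_risks):
    """Return True if every exposed risk has been accepted.

    A recommended update has no exposed risks, so the empty set
    passes the check regardless of what has been accepted.
    """
    return set(exposed_risks) <= set(accepted_risks)


accepted = {"RiskA", "RiskB"}

cases = {
    "recommended": set(),
    "RiskA": {"RiskA"},
    "RiskA+RiskB": {"RiskA", "RiskB"},
    "RiskC": {"RiskC"},
    "RiskA+RiskC": {"RiskA", "RiskC"},
}

for label, exposed in cases.items():
    verdict = "accepted" if update_accepted(exposed, accepted) else "rejected"
    print(f"{label}: {verdict}")
```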
Anyone reviewing this Feature needs to know which deployment configurations the Feature will apply to (or not) once it has been completed. Describe specific needs (or indicate N/A) for each of the following deployment scenarios. For specific configurations that are out of scope for a given release, also provide the OCPSTRAT that tracks future support for that configuration.
Deployment considerations | List applicable specific needs (N/A = not applicable) |
Self-managed, managed, or both | Self-managed. Managed offerings may opt to consume this feature if/when it goes GA. |
Classic (standalone cluster) | Yes |
Hosted control planes | We'll need to lift this to HostedCluster status eventually (like hypershift#1954 did for other properties), but that can be a follow-up Feature |
Multi node, Compact (three node), or Single node (SNO), or all | All, but not MicroShift, which doesn't have ClusterVersion or a CVO (it updates via RPMs). |
Connected / Restricted Network | All |
Architectures, e.g. x86_64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | All |
Operator compatibility | Not applicable |
Backport needed (list applicable versions) | Not applicable |
UI need (e.g. OpenShift Console, dynamic plugin, OCM) | Eventually, but not for this TechPreview Feature. It can be follow-up work. |
Other (please specify) |
Use Cases (Optional):
Include use case diagrams, main success scenarios, alternative flow scenarios. Initial completion during Refinement status.
As described in the Goals section, the intention is to make it easier to manage updates across a fleet of many clusters, by centralizing update-risk review, and allowing central decisions about acceptable risks to be pushed out to the managed cluster fleet.
Questions to Answer (Optional):
Include a list of refinement / architectural questions that may need to be answered before coding can begin. Initial completion during Refinement status.
<your text here>
Out of Scope
- HyperShift (future work)
- Promoting from TechPreview to GA (future work)
- UI/console exposure (future work)
- Slugs for risks beyond those declared in the OpenShift Update Service (possible future work)
- Any changes to how clusters act once an update has been accepted.
- ACM/OCM/managed integration. They're unlikely to use this feature until it's GA, but would likely benefit by consuming this feature once that happens, so they may be interested in providing earlier review/feedback.
Background
Provide any additional context needed to frame the feature. Initial completion during Refinement status.
<your text here>
Customer Considerations
<your text here>
Documentation Considerations
Add docs for accepted risks
Interoperability Considerations
Which other projects, including ROSA/OSD/ARO, and versions in our portfolio does this feature impact? What interoperability test scenarios should be factored by the layered products? Initial completion during Refinement status.
<your text here>
- blocks: OTA-1452 ClusterVersion treating Upgradeable as a conditional risk (New)
- clones: OCPSTRAT-1834 [Tech Preview] OCP Update Precheck command to improve update experience (Closed)
- is related to: OTA-1540 Tech Preview: 'oc adm upgrade recommend' accepted-risk handling (Closed)
- is related to: OTA-1532 Tech Preview: New '--accepted-risk' argument for 'oc adm upgrade recommend' (Closed)
- relates to: RFE-7037 Enhance the --force argument during OCP upgrades to prevent an inconsistent state (Closed)
- links to