Feature
Resolution: Unresolved
Feature Overview (aka. Goal Summary)
This feature is intended to reduce the time and effort for cluster administrators who manage many clusters, enabling them to pre-approve certain risks that they deem acceptable across some or all of their environments. It builds on the tech-preview OCPSTRAT-2118 by moving that functionality to GA.
A cluster admin can declare accepted risks for a cluster so that, when a conditional update of the cluster is requested, the update proceeds only if every risk associated with it has been accepted.
Goals (aka. expected user outcomes)
The feature involves changes to the ClusterVersion API, where accepted risks can be listed in the spec.desiredUpdate field. It also includes updates to command-line tools such as the oc CLI, for example a new oc adm upgrade accept-risks command or an --accepted flag for oc adm upgrade recommend (to be decided later), as well as web-console integration.
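As a rough illustration of what listing accepted risks under spec.desiredUpdate could look like, the sketch below builds a merge-patch body for ClusterVersion. The `acceptRisks` field name, its `{"name": ...}` item shape, and the `build_accept_risks_patch` helper are assumptions made for this example and are not confirmed by this card; the version string is a placeholder.

```python
import json


def build_accept_risks_patch(version, risk_names):
    """Build a merge-patch body that requests an update and lists accepted risks.

    ASSUMPTION: the "acceptRisks" field and its {"name": ...} entries are
    hypothetical; the card only says accepted risks live under
    spec.desiredUpdate.
    """
    unique_names = sorted(set(risk_names))  # de-duplicate and keep a stable order
    return {
        "spec": {
            "desiredUpdate": {
                "version": version,
                "acceptRisks": [{"name": name} for name in unique_names],
            }
        }
    }


# Placeholder target version and risk names, for illustration only.
patch = build_accept_risks_patch("4.21.0", ["RiskB", "RiskA", "RiskA"])
print(json.dumps(patch, indent=2))
```

A body like this could be applied by whatever tooling already patches ClusterVersion; the final CLI and API shape remain open, as noted above.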
The goal is to reduce the time for a cluster admin to evaluate the risks for conditional updates when there are a great number of clusters to manage.
Consider the following scenario:
- A cluster admin gets conditional updates of a cluster and evaluates the risks in the conditional updates.
- After evaluation, the admin decides that some risks can be accepted on all or some of the managed clusters.
- The administrator configures relevant clusters to allow for updates, as long as any identified risks have all been accepted.
Requirements (aka. Acceptance Criteria)
There is a way for a cluster admin to express accepted risks so that a conditional update to a cluster is accepted only if all of the risks associated with that update are accepted.
For example, if a cluster admin accepts RiskA and RiskB and requests an update to a new version:
- The update is accepted if the update is recommended.
- The update is accepted if the cluster thought it was only exposed to RiskA.
- The update is accepted if the cluster thought it was only exposed to RiskA and RiskB.
- The update is rejected if the cluster thought it was only exposed to RiskC.
- The update is rejected if the cluster thought it was only exposed to RiskA and RiskC.
Note that other criteria might also reject the update, but they are unrelated to update risks and are therefore out of scope for this card.
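The acceptance rule in the example above amounts to a subset check: the update goes ahead when the set of risks the cluster believes it is exposed to is contained in the set of accepted risks. A minimal sketch follows; the function name `update_accepted` is hypothetical and this is not the cluster-version operator's actual implementation.

```python
from typing import Iterable


def update_accepted(exposed_risks: Iterable[str], accepted_risks: Iterable[str]) -> bool:
    """Return True when every exposed risk is also an accepted risk.

    An empty set of exposed risks models a recommended update, which is
    always accepted by this rule.
    """
    return set(exposed_risks) <= set(accepted_risks)


accepted = {"RiskA", "RiskB"}
scenarios = ([], ["RiskA"], ["RiskA", "RiskB"], ["RiskC"], ["RiskA", "RiskC"])
for exposed in scenarios:
    verdict = "accepted" if update_accepted(exposed, accepted) else "rejected"
    print(f"exposed={exposed}: {verdict}")
```

The five scenarios mirror the bullets above: the first three are accepted and the last two are rejected because RiskC was never accepted.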
| Deployment considerations | List applicable specific needs (N/A = not applicable) |
| --- | --- |
| Self-managed, managed, or both | Self-managed. Managed may opt to consume if they want |
| Classic (standalone cluster) | Yes |
| Hosted control planes | We'll need to lift this to HostedCluster status eventually (like hypershift#1954 did for other properties), but that can be a follow-up Feature |
| Multi node, Compact (three node), or Single node (SNO), or all | All, but not MicroShift, which doesn't have ClusterVersion or a CVO (it updates via RPMs). |
| Connected / Restricted Network | All |
| Architectures, e.g. x86_64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | All |
| Operator compatibility | Not applicable |
| Backport needed (list applicable versions) | Not applicable |
| UI need (e.g. OpenShift Console, dynamic plugin, OCM) | Console integration for this generally-available feature |
| Other (please specify) | |
Use Cases
As described in the Goals section, the intention is to make it easier to manage updates across a fleet of many clusters, by centralizing update-risk review, and allowing central decisions about acceptable risks to be pushed out to the managed cluster fleet.
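One way to picture the centralized review is to aggregate, across the fleet, which risks still need a decision and which clusters each one affects. The sketch below assumes the per-cluster exposure data has already been collected into a plain dictionary; the cluster names, risk names, and the `risks_needing_review` helper are hypothetical.

```python
from collections import defaultdict


def risks_needing_review(fleet_exposure, already_accepted):
    """Map each not-yet-accepted risk to the sorted list of clusters exposed to it."""
    pending = defaultdict(list)
    for cluster, risks in fleet_exposure.items():
        for risk in risks:
            if risk not in already_accepted:
                pending[risk].append(cluster)
    # Sort risks and clusters so the review report is stable between runs.
    return {risk: sorted(clusters) for risk, clusters in sorted(pending.items())}


# Hypothetical fleet data: cluster name -> risks reported for the target update.
fleet = {
    "cluster-east": ["RiskA", "RiskC"],
    "cluster-west": ["RiskA"],
    "cluster-lab": ["RiskB", "RiskC"],
}
print(risks_needing_review(fleet, already_accepted={"RiskA"}))
```

A central team could review the resulting list once and then push its accept-or-not decisions back out to the affected clusters.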
Questions to Answer
Include a list of refinement / architectural questions that may need to be answered before coding can begin. Initial completion during Refinement status.
Out of Scope
- HyperShift (future work)
- Any changes to how clusters act once an update has been accepted.
- ACM/OCM/managed integration. They're unlikely to use this feature until it's GA, but would likely benefit by consuming this feature once that happens, so they may be interested in providing earlier review/feedback.
Background
Currently update advice is jumbled together in human-facing Upgradeable condition messages or per-target-version conditional update messages. This feature provides a more structured representation of that data, so it's easier to interact with it in more automated/distributed ways, beyond "the human requesting the update has all the knowledge they need to understand complicated messaging and make risk decisions". For example, a new-hire cluster admin with little context could take relevant risk slugs to a central, experienced team for an accept-or-not decision.
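To illustrate why structured risk data is easier to automate than free-form messages, the sketch below pulls risk names for one target version out of a ClusterVersion object that has been loaded as a dictionary (for example from JSON output). It assumes the `status.conditionalUpdates[].release.version` and `status.conditionalUpdates[].risks[].name` layout; the sample data and the `risk_names_for_target` helper are hypothetical, and the names returned are the risks declared for the target, not a statement of whether the cluster is actually exposed to them.

```python
def risk_names_for_target(cluster_version, target_version):
    """Return the sorted risk names declared for one conditional update target."""
    updates = cluster_version.get("status", {}).get("conditionalUpdates") or []
    for update in updates:
        if update.get("release", {}).get("version") == target_version:
            return sorted({risk["name"] for risk in update.get("risks", [])})
    return []  # target is not a conditional update, or is unknown


# Hypothetical, trimmed-down ClusterVersion content for illustration.
sample = {
    "status": {
        "conditionalUpdates": [
            {
                "release": {"version": "4.21.1"},
                "risks": [{"name": "RiskB"}, {"name": "RiskA"}],
            }
        ]
    }
}
print(risk_names_for_target(sample, "4.21.1"))
```

Short risk names like these are the "slugs" a less experienced admin could hand to a central team for a decision.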
Customer Considerations
I don't understand what this section is about.
Documentation Considerations
This is a GA feature and will need docs: definitely in https://docs.redhat.com/en/documentation/openshift_container_platform/4.20/html/updating_clusters/performing-a-cluster-update and possibly in other places.
Interoperability Considerations
ARO will get this for free, although they might need to update their documentation to allow their customers to access the functionality.
ROSA/OSD would need OCM changes, and nobody's reached out to the OCM folks yet to ask if they're interested.
- blocks: OCPSTRAT-2815 Version-specific Cluster Upgrade Preflight Checks (New)
- clones: OCPSTRAT-2118 TechPreview: Accepted Risks for OCP Cluster Updates (In Progress)
- is blocked by: OCPSTRAT-2118 TechPreview: Accepted Risks for OCP Cluster Updates (In Progress)