-
Spike
-
Resolution: Done
-
Major
-
None
-
None
-
None
-
BU Product Work
-
5
-
False
-
None
-
False
-
OCPSTRAT-417 - Hypershift update phase 2 - Conditional updates should work for self-managed hosted control planes (HCP)
-
-
-
OTA 233, OTA 234
Like IBM's ROKS, HyperShift's hosted-control-plane controller manages the cluster-version operator's (CVO's) Deployment directly, based on a release config propagated from HostedCluster. As a result, the outgoing CVO is not aware that it is about to be updated, so it doesn't perform the usual outgoing-to-incoming update handoff tasks, such as:
- Validating a signature for the target release image.
- Verifying Upgradeable and other preconditions to see if the update seems safe.
- Checking to see if any ClusterVersion capabilities should be implicitly enabled, by comparing the outgoing and incoming manifest sets.
- Storing any concerns (e.g. matching update risks) in acceptedRisks.
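To make the implicit-capability check concrete, here is a rough sketch of comparing the outgoing and incoming manifest sets. It is not the CVO's actual implementation. It assumes manifests are already parsed into dicts, that a manifest's capabilities come from the `capability.openshift.io/name` annotation with `+` separating multiple names, and the function names are hypothetical.

```python
# Sketch only: manifests are plain dicts, not real CVO types.
CAPABILITY_ANNOTATION = "capability.openshift.io/name"  # assumed annotation key


def manifest_key(manifest):
    """Identify a resource by apiVersion, kind, namespace, and name."""
    meta = manifest.get("metadata", {})
    return (
        manifest.get("apiVersion", ""),
        manifest.get("kind", ""),
        meta.get("namespace", ""),
        meta.get("name", ""),
    )


def manifest_capabilities(manifest):
    """Return the set of capabilities a manifest is tagged with (may be empty)."""
    annotations = manifest.get("metadata", {}).get("annotations", {})
    raw = annotations.get(CAPABILITY_ANNOTATION, "")
    return {name for name in raw.split("+") if name}


def implicitly_enabled_capabilities(outgoing, incoming, enabled):
    """Capabilities that must be enabled so already-managed resources stay managed.

    A resource counts as currently managed if all of its capabilities in the
    outgoing release are enabled (or it has none). If the incoming release tags
    that same resource with a capability that is not enabled, the capability
    is returned as implicitly enabled.
    """
    managed = {
        manifest_key(m) for m in outgoing if manifest_capabilities(m) <= enabled
    }
    implicit = set()
    for m in incoming:
        if manifest_key(m) in managed:
            implicit |= manifest_capabilities(m) - enabled
    return implicit
```

A resource that appears only in the incoming release does not trigger enablement in this sketch, since nothing in the cluster depends on it yet.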
The main motivation for the current ROKS/HyperShift approach is that the CVO runs on the central management cluster, with its Deployment in the management cluster's etcd. But the ClusterVersion is inside the hosted etcd, which a customer with hosted-cluster access can reach and possibly interfere with. It's not clear to me why ClusterVersion is in the hosted cluster, or what hosted-cluster components consume it there.
This ticket is about planning short- and possibly also long-term approaches to address these issues.
Possible options:
- Teach the CVO to be flexible enough for HyperShift, with an option to use two kubeconfigs: one to manage its own Deployment, the ClusterVersion, and similar resources living in the management cluster (this would involve moving the ClusterVersion resource from the hosted cluster to the management cluster), and another to manage the hosted-cluster components living in the hosted cluster.
- Leave things pretty much as they are today, but figure out some other way to handle implicit capability enablement. For example, instead of having the outgoing CVO compare outgoing and incoming manifests, have the outgoing CVO just shut down, and have the incoming CVO hunt for previously managed resources via ownerReferences. We'd probably want to be more assertive about setting ownerReferences than we are today.
- Presumably lots of other options. Also somewhat dependent on how OTA-797 works out.
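For the ownerReferences option above, a minimal sketch of how an incoming CVO might pick out previously managed resources. It assumes resources have already been listed and parsed into dicts, and that managed resources carry an ownerReference pointing at the ClusterVersion named `version`, which, as noted above, is not reliably set today. The function name is hypothetical.

```python
# Sketch only: resources are plain dicts as returned by parsing API objects.
def owned_by_cluster_version(resources, owner_name="version"):
    """Return the resources whose ownerReferences include the ClusterVersion."""
    owned = []
    for resource in resources:
        refs = resource.get("metadata", {}).get("ownerReferences", [])
        for ref in refs:
            if ref.get("kind") == "ClusterVersion" and ref.get("name") == owner_name:
                owned.append(resource)
                break  # one matching reference is enough
    return owned
```

The resulting list could then be matched against the incoming manifest set to decide which capabilities need implicit enablement, without any help from the outgoing CVO.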
So far, HyperShift does not allow ClusterVersion capability configuration. And all components being moved into capabilities have been added to vCurrent. But we'll want to have at least a short-term plan in place here before HyperShift adds capability support, and also before we move an existing OCP-core component into a capability that we hold out of vCurrent.
Definition of Done:
- create follow-up cards
- is related to
-
OTA-821 How comfortable are we with channel-clearing?
- To Do
- relates to
-
OTA-951 CVO to drive the hypershift hosted control plane
- New
-
HOSTEDCP-663 Close function gap between cluster-etcd-operator and HyperShift control plane operator for etcd
- New