Story
Resolution: Done
Priority: Major
BU Product Work
OCPSTRAT-1823 - [GA] 'oc adm upgrade status' command and status API
Sprints: OTA 259, OTA 260, OTA 261, OTA 262, OTA 263
On the call to discuss the oc adm upgrade status roadmap to a server-side implementation (notes), we agreed on a basic architectural direction, and we can start moving in that direction:
- the status API will be backed by a new controller
- the new controller will be a separate binary, but delivered in the CVO image (= release payload) to avoid needing a new ClusterOperator
- the new controller will maintain a singleton resource of a new UpgradeStatus CRD - this is the interface for consumers (sketched just below)
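To make that direction concrete, a minimal Go sketch of the singleton resource could look like the following; all type and field names here are assumptions based on the Update Health API draft, not a merged o/api definition:

    package v1alpha1

    import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

    // UpgradeStatus is a hypothetical singleton resource that exposes update
    // health to consumers such as `oc adm upgrade status`. Names are
    // placeholders and may change before anything merges into o/api.
    type UpgradeStatus struct {
    	metav1.TypeMeta   `json:",inline"`
    	metav1.ObjectMeta `json:"metadata,omitempty"`

    	Status UpgradeStatusStatus `json:"status,omitempty"`
    }

    // UpgradeStatusStatus would carry the individual status insights; for this
    // card, a single ClusterVersion insight summarized through conditions.
    type UpgradeStatusStatus struct {
    	Conditions []metav1.Condition `json:"conditions,omitempty"`
    }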
Let's start building this controller; we can implement it to perform the functionality currently present in the client, and just expose it through an API. I am not sure how to deal with the fact that we won't have the API available until it merges into o/api, which will not be soon. Maybe we can implement the controller over a temporary fork of o/api and rely on manually inserting the CRD into the cluster when we test the functionality? Not sure.
We need to avoid committing to implementation details and investing effort into things that may change, though.
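If we do go the temporary-fork route, the wiring could be as small as a replace directive in o/cluster-version-operator's go.mod; the fork path and pseudo-version below are purely hypothetical:

    // go.mod (hypothetical): point the vendored o/api at a fork carrying the
    // unmerged UpgradeStatus types until the real PR lands.
    replace github.com/openshift/api => github.com/my-fork/api v0.0.0-20241111000000-0123456789ab

Manually applying the CRD manifest from that fork before deploying the controller would then cover the testing side.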
Definition of Done
- CVO repository has a new controller (a new cluster-version-operator cobra subcommand sounds like a good option; an alternative would be a completely new binary included in the CVO image); a rough wiring sketch follows this list
- The payload contains manifests (SA, RBAC, Deployment) to deploy the new controller when the DevPreviewNoUpgrade feature set is enabled (but not TechPreview); a manifest sketch also follows the list
- The controller uses the minimal necessary, properly scoped RBAC through a dedicated SA
- The controller reacts to ClusterVersion changes in the cluster through an informer
- The controller maintains a single ClusterVersion status insight as specified by the Update Health API Draft
- The controller does not need to maintain all fields precisely: it can use placeholders or even ignore fields that need more complicated logic over more resources (estimated finish, completion, assessment)
- The controller publishes the serialized CV status insight (in YAML or JSON) through a ConfigMap (this is a provisional measure until we can get the necessary API and client-go changes merged) under a key that identifies the kube resource ("cv-version")
- The controller only includes the necessary types code from the o/api PR, together with the necessary generated code (like deepcopy). These local types will need to be replaced with the types eventually merged into o/api and vendored into o/cluster-version-operator
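For a sense of the shape this could take, here is a minimal sketch of the subcommand plus an informer-driven loop that publishes the insight into the prototype ConfigMap. It assumes client-go and openshift/client-go; the function names and the placeholder insight shape are invented for illustration, not the actual implementation:

    package main

    import (
    	"context"

    	"github.com/spf13/cobra"
    	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/client-go/kubernetes"
    	"k8s.io/client-go/rest"
    	"k8s.io/client-go/tools/cache"
    	"sigs.k8s.io/yaml"

    	configv1 "github.com/openshift/api/config/v1"
    	configclient "github.com/openshift/client-go/config/clientset/versioned"
    	configinformers "github.com/openshift/client-go/config/informers/externalversions"
    )

    // newUpdateStatusCommand is a hypothetical cobra subcommand hanging off the
    // existing cluster-version-operator binary.
    func newUpdateStatusCommand() *cobra.Command {
    	return &cobra.Command{
    		Use:   "update-status-controller",
    		Short: "Run the update status controller (DevPreviewNoUpgrade only)",
    		RunE: func(cmd *cobra.Command, _ []string) error {
    			cfg, err := rest.InClusterConfig()
    			if err != nil {
    				return err
    			}
    			return run(cmd.Context(), cfg)
    		},
    	}
    }

    // run watches ClusterVersion through an informer and republishes the
    // status insight on every change.
    func run(ctx context.Context, cfg *rest.Config) error {
    	kube := kubernetes.NewForConfigOrDie(cfg)
    	factory := configinformers.NewSharedInformerFactory(configclient.NewForConfigOrDie(cfg), 0)
    	factory.Config().V1().ClusterVersions().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
    		AddFunc:    func(obj interface{}) { publish(ctx, kube, obj.(*configv1.ClusterVersion)) },
    		UpdateFunc: func(_, obj interface{}) { publish(ctx, kube, obj.(*configv1.ClusterVersion)) },
    	})
    	factory.Start(ctx.Done())
    	<-ctx.Done()
    	return nil
    }

    // publish serializes a placeholder ClusterVersion status insight and stores
    // it in the prototype ConfigMap, standing in for the future status API.
    func publish(ctx context.Context, kube kubernetes.Interface, cv *configv1.ClusterVersion) {
    	cm, err := kube.CoreV1().ConfigMaps("openshift-update-status-controller").
    		Get(ctx, "status-api-cm-prototype", metav1.GetOptions{})
    	if err != nil {
    		return // the ConfigMap does not exist so doing nothing
    	}
    	insight := map[string]interface{}{ // placeholder shape, see the API draft
    		"resource": map[string]string{"kind": "ClusterVersion", "name": cv.Name},
    		"versions": map[string]string{"target": cv.Status.Desired.Version},
    	}
    	raw, _ := yaml.Marshal(insight)
    	if cm.Data == nil {
    		cm.Data = map[string]string{}
    	}
    	cm.Data["usc-cv-version"] = string(raw)
    	_, _ = kube.CoreV1().ConfigMaps(cm.Namespace).Update(ctx, cm, metav1.UpdateOptions{})
    }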
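And the feature-set gating on the Deployment manifest could look roughly like this; release.openshift.io/feature-set is the annotation CVO already uses to gate payload manifests, but the names and layout here are assumptions:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: update-status-controller
      namespace: openshift-update-status-controller
      annotations:
        # CVO only applies this manifest when the cluster feature set matches.
        release.openshift.io/feature-set: DevPreviewNoUpgrade
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: update-status-controller
      template:
        metadata:
          labels:
            app: update-status-controller
        spec:
          serviceAccountName: update-status-controller
          containers:
          - name: update-status-controller
            # the payload build would pin this to the CVO image
            image: quay.io/openshift/origin-cluster-version-operator:latest
            command: ["cluster-version-operator", "update-status-controller"]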
Testing notes
This card only brings a skeleton of the desired functionality to the DevPreviewNoUpgrade feature set. Its purpose is mainly to enable further development by putting the necessary bits in place so that we can start developing more functionality. There's not much point in automating tests of any of the functionality in this card, but it should be useful to start getting familiar with how the new controller is deployed and what its concepts are.
To see the new controller in action:
1. Launch a cluster that includes both the code and manifests. As of Nov 11, #1107 is not yet merged so you need to use launch 4.18,openshift/cluster-version-operator#1107 aws,no-spot
2. Enable the DevPreviewNoUpgrade feature set. CVO will restart and will deploy all functionality gated by this feature set, including the USC. It can take a bit of time, ~10-15m should be enough though.
3. Eventually, you should be able to see the new openshift-update-status-controller Namespace created in the cluster
4. You should be able to see an update-status-controller Deployment in that namespace
5. That Deployment should have one replica running and ready. It should not crashloop or anything like that. You can inspect its logs for obvious failures and such. At this point, its log should, near its end, say something like "the ConfigMap does not exist so doing nothing"
6. Create the ConfigMap that mimics the future API (make sure to create it in the openshift-update-status-controller namespace): oc create configmap -n openshift-update-status-controller status-api-cm-prototype
7. The controller should immediately-ish insert a usc-cv-version key into the ConfigMap. Its content is a YAML-serialized ClusterVersion status insight (see design doc; an illustrative example follows this list). As of OTA-1269 the content is not that important, but the (1) reference to the CV and (2) versions field should be correct.
8. The status insight should have a condition of Updating type. It should be False at this time (the cluster is not updating).
9. Start upgrading the cluster (it's a cluster bot cluster with an ephemeral 4.18 version, so you'll need to use --to-image=pullspec and probably force it)
10. While updating, you should be able to observe the controller activity in the log (it logs some diffs), but also the content of the status insight in the ConfigMap changing. The versions field should change appropriately (and startedAt too), and the Updating condition should become True.
11. Eventually the update should finish and the Updating condition should flip to False again.
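For orientation, the value under the usc-cv-version key might look something like the following; the field names are drawn from the Update Health API draft and are illustrative only, not the exact serialization:

    resource:
      group: config.openshift.io
      kind: ClusterVersion
      name: version
    startedAt: "2024-11-11T12:00:00Z"   # example timestamp
    versions:
      target: <target version>
      previous: <previous version>
    conditions:
    - type: Updating
      status: "True"    # flips False -> True -> False over the course of the update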
Some of these will turn into automated testcases, but it does not make sense to implement that automation while we're using the ConfigMap instead of the API.
blocks
- OTA-1411 USC: Maintain status insights for ClusterOperator resources (To Do)
is blocked by
- OCPBUGS-44438 HCP applies featureset-guarded manifests when bootstrapping CVO (ON_QA)
is triggering
- OTA-1397 CVO should own its manifest bootstrap/selection logic in HCP (To Do)
links to