Description of problem:
Normally, when a cluster is pointed at an update that cannot be accepted, such as an unsigned payload without force, the CVO reports ReleaseAccepted=False and Progressing=False. However, after scaling the CVO deployment down and up again, Progressing=True is observed. This causes oc adm upgrade and oc adm upgrade status to display incorrect information, and the ClusterVersion object to show empty capabilities and a history item with version "".
Version-Release number of selected component (if applicable):
4.16.0-rc.4, but also observed as early as 4.10.67
How reproducible:
100%
Steps to Reproduce:
1. Target the cluster at an unsigned build without using force:
❯ oc adm upgrade --allow-explicit-upgrade --to-image registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a
2. Scale the CVO down and up again:
❯ oc scale --replicas 0 -n openshift-cluster-version deployments/cluster-version-operator
deployment.apps/cluster-version-operator scaled
❯ oc scale --replicas 1 -n openshift-cluster-version deployments/cluster-version-operator
deployment.apps/cluster-version-operator scaled
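The reproduction steps can be scripted. The following is a minimal sketch, assuming `oc` is on PATH and logged in to a disposable cluster. The helper names `repro_commands` and `run` are hypothetical; only the `oc` invocations already shown in the steps are used, and nothing is executed unless `dry_run` is set to False.

```python
import subprocess

NAMESPACE = "openshift-cluster-version"
DEPLOYMENT = "deployments/cluster-version-operator"


def repro_commands(image):
    """Return the oc invocations from the reproduction steps, in order."""
    def scale(replicas):
        return ["oc", "scale", "--replicas", str(replicas), "-n", NAMESPACE, DEPLOYMENT]

    return [
        # Step 1: target an unsigned release without force.
        ["oc", "adm", "upgrade", "--allow-explicit-upgrade", "--to-image", image],
        # Step 2: scale the CVO down, then up again.
        scale(0),
        scale(1),
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder pullspec; substitute the unsigned release image under test.
    run(repro_commands("registry.example.invalid/ocp/release@sha256:<digest>"))
```

Keeping the commands as argument lists avoids shell quoting issues with the image pullspec and makes the dry run output match what would actually be executed.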
Actual results:
oc adm upgrade displays "info: An upgrade is in progress. Working towards...", along with a warning that "Architecture has not been configured":
❯ oc adm upgrade
info: An upgrade is in progress. Working towards registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a
ReleaseAccepted=False
  Reason: RetrievePayload
  Message: Retrieving payload failed version="" image="registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a" failure=The update cannot be verified: unable to verify sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a against keyrings: verifier-public-key-redhat
Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.16
warning: Cannot display available updates:
  Reason: NoArchitecture
  Message: Architecture has not been configured.
The ClusterVersion object has Progressing=True, "capabilities: {}", and a partial history item with version "":
❯ oc get clusterversion version -oyaml
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  creationTimestamp: "2024-06-10T11:36:51Z"
  generation: 3
  name: version
  resourceVersion: "70199"
  uid: 9c80848b-9f3a-4f0d-8472-a2ccce1c4023
spec:
  channel: stable-4.16
  clusterID: e74054ac-e0fe-4cf7-a457-4887ba96cff9
  desiredUpdate:
    architecture: ""
    force: false
    image: registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a
    version: ""
status:
  availableUpdates: null
  capabilities: {}
  conditions:
  - lastTransitionTime: "2024-06-10T11:37:17Z"
    message: Architecture has not been configured.
    reason: NoArchitecture
    status: "False"
    type: RetrievedUpdates
  - lastTransitionTime: "2024-06-10T11:37:17Z"
    message: Capabilities match configured spec
    reason: AsExpected
    status: "False"
    type: ImplicitlyEnabledCapabilities
  - lastTransitionTime: "2024-06-10T14:06:42Z"
    message: 'Retrieving payload failed version="" image="registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a" failure=The update cannot be verified: unable to verify sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a against keyrings: verifier-public-key-redhat'
    reason: RetrievePayload
    status: "False"
    type: ReleaseAccepted
  - lastTransitionTime: "2024-06-10T12:06:31Z"
    message: Done applying 4.16.0-rc.4
    status: "True"
    type: Available
  - lastTransitionTime: "2024-06-10T12:06:31Z"
    status: "False"
    type: Failing
  - lastTransitionTime: "2024-06-10T14:07:30Z"
    message: Working towards registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a
    status: "True"
    type: Progressing
  desired:
    image: registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a
    version: ""
  history:
  - completionTime: null
    image: registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a
    startedTime: "2024-06-10T14:07:30Z"
    state: Partial
    verified: false
    version: ""
  - completionTime: "2024-06-10T12:06:31Z"
    image: quay.io/openshift-release-dev/ocp-release@sha256:6c236c400d3bad9b2b54d8a3b247c508f6f13511d37666de1eecca8e43bce0f6
    startedTime: "2024-06-10T11:37:17Z"
    state: Completed
    verified: false
    version: 4.16.0-rc.4
  observedGeneration: 3
  versionHash: AjnKTa_3kbg=
In oc adm upgrade status, the cluster is shown progressing to an empty target with Completion 0%:
= Control Plane =
Assessment:      Progressing
Target Version:  (from 4.16.0-rc.4)
Completion:      0%
Duration:        2m26.971091165s
Operator Status: 33 Healthy
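The symptoms listed under "Actual results" can be checked mechanically. The following is a minimal sketch, assuming the ClusterVersion object has been loaded into a dict (for example by parsing the output of `oc get clusterversion version -o json` with `json.loads`). The function names are hypothetical and the sample data is trimmed from the object shown in this report.

```python
def conditions_by_type(cluster_version):
    """Index status.conditions of a ClusterVersion dict by condition type."""
    conditions = cluster_version.get("status", {}).get("conditions") or []
    return {c["type"]: c for c in conditions}


def find_anomalies(cluster_version):
    """Return findings that match the symptoms described in this report."""
    status = cluster_version.get("status", {})
    conds = conditions_by_type(cluster_version)
    findings = []

    accepted = conds.get("ReleaseAccepted", {}).get("status")
    progressing = conds.get("Progressing", {}).get("status")
    if accepted == "False" and progressing == "True":
        findings.append("Progressing=True while ReleaseAccepted=False")

    if not status.get("capabilities"):
        findings.append("status.capabilities is empty")

    if not status.get("desired", {}).get("version"):
        findings.append("status.desired.version is empty")

    for entry in status.get("history") or []:
        if entry.get("state") == "Partial" and not entry.get("version"):
            findings.append("partial history entry with empty version")

    return findings


# Trimmed copy of the object observed after the scale toggle.
observed = {
    "status": {
        "capabilities": {},
        "conditions": [
            {"type": "ReleaseAccepted", "status": "False", "reason": "RetrievePayload"},
            {"type": "Progressing", "status": "True"},
        ],
        "desired": {"version": ""},
        "history": [
            {"state": "Partial", "version": ""},
            {"state": "Completed", "version": "4.16.0-rc.4"},
        ],
    }
}

for finding in find_anomalies(observed):
    print(finding)
```

On the object from before the scale toggle (populated capabilities, Progressing=False, desired version 4.16.0-rc.4, a single Completed history entry), the same function returns an empty list.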
Expected results:
The ClusterVersion object stays the same as it was before the scale toggle:
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  creationTimestamp: "2024-06-10T11:36:51Z"
  generation: 3
  name: version
  resourceVersion: "69881"
  uid: 9c80848b-9f3a-4f0d-8472-a2ccce1c4023
spec:
  channel: stable-4.16
  clusterID: e74054ac-e0fe-4cf7-a457-4887ba96cff9
  desiredUpdate:
    architecture: ""
    force: false
    image: registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a
    version: ""
status:
  availableUpdates: null
  capabilities:
    enabledCapabilities:
    - Build
    - CSISnapshot
    - CloudControllerManager
    - CloudCredential
    - Console
    - DeploymentConfig
    - ImageRegistry
    - Ingress
    - Insights
    - MachineAPI
    - NodeTuning
    - OperatorLifecycleManager
    - Storage
    - baremetal
    - marketplace
    - openshift-samples
    knownCapabilities:
    - Build
    - CSISnapshot
    - CloudControllerManager
    - CloudCredential
    - Console
    - DeploymentConfig
    - ImageRegistry
    - Ingress
    - Insights
    - MachineAPI
    - NodeTuning
    - OperatorLifecycleManager
    - Storage
    - baremetal
    - marketplace
    - openshift-samples
  conditions:
  - lastTransitionTime: "2024-06-10T11:37:17Z"
    message: 'Unable to retrieve available updates: currently reconciling cluster version 4.16.0-rc.4 not found in the "stable-4.16" channel'
    reason: VersionNotFound
    status: "False"
    type: RetrievedUpdates
  - lastTransitionTime: "2024-06-10T11:37:17Z"
    message: Capabilities match configured spec
    reason: AsExpected
    status: "False"
    type: ImplicitlyEnabledCapabilities
  - lastTransitionTime: "2024-06-10T14:06:42Z"
    message: 'Retrieving payload failed version="" image="registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a" failure=The update cannot be verified: unable to verify sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a against keyrings: verifier-public-key-redhat'
    reason: RetrievePayload
    status: "False"
    type: ReleaseAccepted
  - lastTransitionTime: "2024-06-10T12:06:31Z"
    message: Done applying 4.16.0-rc.4
    status: "True"
    type: Available
  - lastTransitionTime: "2024-06-10T12:06:31Z"
    status: "False"
    type: Failing
  - lastTransitionTime: "2024-06-10T12:06:31Z"
    message: Cluster version is 4.16.0-rc.4
    status: "False"
    type: Progressing
  desired:
    image: quay.io/openshift-release-dev/ocp-release@sha256:6c236c400d3bad9b2b54d8a3b247c508f6f13511d37666de1eecca8e43bce0f6
    url: https://access.redhat.com/errata/RHEA-2024:0041
    version: 4.16.0-rc.4
  history:
  - completionTime: "2024-06-10T12:06:31Z"
    image: quay.io/openshift-release-dev/ocp-release@sha256:6c236c400d3bad9b2b54d8a3b247c508f6f13511d37666de1eecca8e43bce0f6
    startedTime: "2024-06-10T11:37:17Z"
    state: Completed
    verified: false
    version: 4.16.0-rc.4
  observedGeneration: 2
  versionHash: AjnKTa_3kbg=
No "upgrade is in progress" message is shown for a release that is not accepted:
❯ oc adm upgrade
Cluster version is 4.16.0-rc.4
ReleaseAccepted=False
  Reason: RetrievePayload
  Message: Retrieving payload failed version="" image="registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a" failure=The update cannot be verified: unable to verify sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a against keyrings: verifier-public-key-redhat
Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.16
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.16.0-rc.4 not found in the "stable-4.16" channel
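The expectation that the object stays the same across the scale toggle can be expressed as a comparison of two snapshots of `.status`. The following is a minimal sketch, assuming both snapshots are plain dicts taken before and after the toggle. The choice of fields in `STABLE_FIELDS` and the function names are assumptions made for illustration, and the sample data is trimmed from this report.

```python
# Fields of .status that this report expects to survive a CVO restart.
STABLE_FIELDS = ("capabilities", "desired", "history")


def progressing(status):
    """Return the status string of the Progressing condition, or None."""
    for cond in status.get("conditions") or []:
        if cond.get("type") == "Progressing":
            return cond.get("status")
    return None


def changed_fields(before, after):
    """List the parts of .status that differ between two snapshots."""
    changed = [f for f in STABLE_FIELDS if before.get(f) != after.get(f)]
    if progressing(before) != progressing(after):
        changed.append("conditions[Progressing]")
    return changed


# Trimmed snapshots: before and after scaling the CVO down and up.
before = {
    "capabilities": {"enabledCapabilities": ["Build", "Console"]},
    "conditions": [{"type": "Progressing", "status": "False"}],
    "desired": {"version": "4.16.0-rc.4"},
    "history": [{"state": "Completed", "version": "4.16.0-rc.4"}],
}
after = {
    "capabilities": {},
    "conditions": [{"type": "Progressing", "status": "True"}],
    "desired": {"version": ""},
    "history": [
        {"state": "Partial", "version": ""},
        {"state": "Completed", "version": "4.16.0-rc.4"},
    ],
}

print(changed_fields(before, after))
```

An empty list from `changed_fields` corresponds to the expected result. Fields such as `metadata.resourceVersion` are deliberately left out, since they change on any write to the object.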
Additional info:
It is possible to kick the cluster out of this state by applying --clear, which causes the cluster to briefly progress towards its original version, after which 3 items appear in history:
❯ oc adm upgrade --clear
Cleared the update field, still at registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a
❯ oc adm upgrade
info: An upgrade is in progress. Working towards 4.16.0-rc.4: 116 of 894 done (12% complete)
Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.16
warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently reconciling cluster version 4.16.0-rc.4 not found in the "stable-4.16" channel
❯ oc get clusterversion version -oyaml
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  creationTimestamp: "2024-06-10T11:36:51Z"
  generation: 4
  name: version
  resourceVersion: "72594"
  uid: 9c80848b-9f3a-4f0d-8472-a2ccce1c4023
spec:
  channel: stable-4.16
  clusterID: e74054ac-e0fe-4cf7-a457-4887ba96cff9
status:
  availableUpdates: null
  capabilities:
    enabledCapabilities:
    - Build
    - CSISnapshot
    - CloudControllerManager
    - CloudCredential
    - Console
    - DeploymentConfig
    - ImageRegistry
    - Ingress
    - Insights
    - MachineAPI
    - NodeTuning
    - OperatorLifecycleManager
    - Storage
    - baremetal
    - marketplace
    - openshift-samples
    knownCapabilities:
    - Build
    - CSISnapshot
    - CloudControllerManager
    - CloudCredential
    - Console
    - DeploymentConfig
    - ImageRegistry
    - Ingress
    - Insights
    - MachineAPI
    - NodeTuning
    - OperatorLifecycleManager
    - Storage
    - baremetal
    - marketplace
    - openshift-samples
  conditions:
  - lastTransitionTime: "2024-06-10T11:37:17Z"
    message: 'Unable to retrieve available updates: currently reconciling cluster version 4.16.0-rc.4 not found in the "stable-4.16" channel'
    reason: VersionNotFound
    status: "False"
    type: RetrievedUpdates
  - lastTransitionTime: "2024-06-10T11:37:17Z"
    message: Capabilities match configured spec
    reason: AsExpected
    status: "False"
    type: ImplicitlyEnabledCapabilities
  - lastTransitionTime: "2024-06-10T14:13:07Z"
    message: Payload loaded version="4.16.0-rc.4" image="quay.io/openshift-release-dev/ocp-release@sha256:6c236c400d3bad9b2b54d8a3b247c508f6f13511d37666de1eecca8e43bce0f6" architecture="amd64"
    reason: PayloadLoaded
    status: "True"
    type: ReleaseAccepted
  - lastTransitionTime: "2024-06-10T12:06:31Z"
    message: Done applying 4.16.0-rc.4
    status: "True"
    type: Available
  - lastTransitionTime: "2024-06-10T12:06:31Z"
    status: "False"
    type: Failing
  - lastTransitionTime: "2024-06-10T14:14:00Z"
    message: Cluster version is 4.16.0-rc.4
    status: "False"
    type: Progressing
  desired:
    image: quay.io/openshift-release-dev/ocp-release@sha256:6c236c400d3bad9b2b54d8a3b247c508f6f13511d37666de1eecca8e43bce0f6
    url: https://access.redhat.com/errata/RHEA-2024:0041
    version: 4.16.0-rc.4
  history:
  - completionTime: "2024-06-10T14:14:00Z"
    image: quay.io/openshift-release-dev/ocp-release@sha256:6c236c400d3bad9b2b54d8a3b247c508f6f13511d37666de1eecca8e43bce0f6
    startedTime: "2024-06-10T14:13:07Z"
    state: Completed
    verified: false
    version: 4.16.0-rc.4
  - completionTime: "2024-06-10T14:13:07Z"
    image: registry.ci.openshift.org/ocp/release@sha256:36cfa8cebb86ded6e1d51c308d31eb7b2c2e7705a0df6f698c690b6fba8b7e7a
    startedTime: "2024-06-10T14:07:30Z"
    state: Partial
    verified: false
    version: ""
  - completionTime: "2024-06-10T12:06:31Z"
    image: quay.io/openshift-release-dev/ocp-release@sha256:6c236c400d3bad9b2b54d8a3b247c508f6f13511d37666de1eecca8e43bce0f6
    startedTime: "2024-06-10T11:37:17Z"
    state: Completed
    verified: false
    version: 4.16.0-rc.4
  observedGeneration: 4
  versionHash: AjnKTa_3kbg=
Also, trying to apply a rollback in this state results in an invalid SemVer error:
❯ OC_ENABLE_CMD_UPGRADE_ROLLBACK=true oc adm upgrade rollback
error: previous version "" invalid SemVer: Version string empty
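The rollback error is consistent with the previous version being read from the history entry that carries version "". The following is a minimal sketch of that failure mode, not the actual `oc` implementation. It assumes that "previous" means the second-newest history entry, which is a guess about the rollback logic, and it uses the regular expression suggested by the SemVer 2.0.0 specification. The function names are hypothetical.

```python
import re

# Regular expression suggested by the SemVer 2.0.0 specification.
SEMVER = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def previous_version(history):
    """Return the version of the second-newest entry (assumed 'previous')."""
    if len(history) < 2:
        raise ValueError("no previous version in history")
    return history[1].get("version", "")


def check_rollback_target(history):
    """Return the previous version, or raise if it is not valid SemVer."""
    version = previous_version(history)
    if not version:
        raise ValueError(f'previous version "{version}" invalid SemVer: version string empty')
    if not SEMVER.match(version):
        raise ValueError(f'previous version "{version}" invalid SemVer')
    return version


# History as reported after --clear, newest first, trimmed to state and version.
history_after_clear = [
    {"state": "Completed", "version": "4.16.0-rc.4"},
    {"state": "Partial", "version": ""},
    {"state": "Completed", "version": "4.16.0-rc.4"},
]

try:
    check_rollback_target(history_after_clear)
except ValueError as err:
    print(f"error: {err}")
```

Under this assumption, the partial entry with version "" left behind by the unaccepted release is what the rollback trips over, even though the cluster itself never left 4.16.0-rc.4.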
- blocks: OCPBUGS-42386 cvo trying to progress unaccepted release following scale toggle (Closed)
- is cloned by: OCPBUGS-42386 cvo trying to progress unaccepted release following scale toggle (Closed)
- is related to: OCPBUGS-43043 After Upgradng to 4.15 ServiceAccount secrets for image registry were deleted even though ImageRegistry was set to `Managed` (New)
- is related to: OCPBUGS-22266 OpenShift 4.14 Upgrade with baselineCapabilties: None leaves cluster operators behind on lower versions (ASSIGNED)
- links to: RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update