  1. OpenShift Bugs
  2. OCPBUGS-25673

CNV upgrades from v4.14.1 to v4.15.0 (unreleased) do not start because of an out-of-sync operatorCondition


    • Release Note Not Required
    • In Progress

      Description of problem:

      CNV upgrades from v4.14.1 to v4.15.0 (unreleased) do not start because of an out-of-sync operatorCondition.

      We see:

      $ oc get csv
      NAME                                       DISPLAY                    VERSION               REPLACES                                   PHASE
      kubevirt-hyperconverged-operator.v4.14.1   OpenShift Virtualization   4.14.1                kubevirt-hyperconverged-operator.v4.14.0   Replacing
      kubevirt-hyperconverged-operator.v4.15.0   OpenShift Virtualization   4.15.0                kubevirt-hyperconverged-operator.v4.14.1   Pending
      

      And on the v4.15.0 CSV:

      $ oc get csv kubevirt-hyperconverged-operator.v4.15.0 -o yaml
      ....
      status:
        cleanup: {}
        conditions:
        - lastTransitionTime: "2023-12-19T01:50:48Z"
          lastUpdateTime: "2023-12-19T01:50:48Z"
          message: requirements not yet checked
          phase: Pending
          reason: RequirementsUnknown
        - lastTransitionTime: "2023-12-19T01:50:48Z"
          lastUpdateTime: "2023-12-19T01:50:48Z"
          message: 'operator is not upgradeable: the operatorcondition status "Upgradeable"="True"
            is outdated'
          phase: Pending
          reason: OperatorConditionNotUpgradeable
        lastTransitionTime: "2023-12-19T01:50:48Z"
        lastUpdateTime: "2023-12-19T01:50:48Z"
        message: 'operator is not upgradeable: the operatorcondition status "Upgradeable"="True"
          is outdated'
        phase: Pending
        reason: OperatorConditionNotUpgradeable
      

      and if we check the pending operator condition (v4.14.1) we see:

      $ oc get operatorcondition kubevirt-hyperconverged-operator.v4.14.1 -o yaml
      apiVersion: operators.coreos.com/v2
      kind: OperatorCondition
      metadata:
        creationTimestamp: "2023-12-16T17:10:17Z"
        generation: 18
        labels:
          operators.coreos.com/kubevirt-hyperconverged.openshift-cnv: ""
        name: kubevirt-hyperconverged-operator.v4.14.1
        namespace: openshift-cnv
        ownerReferences:
        - apiVersion: operators.coreos.com/v1alpha1
          blockOwnerDeletion: false
          controller: true
          kind: ClusterServiceVersion
          name: kubevirt-hyperconverged-operator.v4.14.1
          uid: 7db79d4b-e69e-4af8-9335-6269cf004440
        resourceVersion: "4116127"
        uid: 347306c9-865a-42b8-b2c9-69192b0e350a
      spec:
        conditions:
        - lastTransitionTime: "2023-12-18T18:47:23Z"
          message: ""
          reason: Upgradeable
          status: "True"
          type: Upgradeable
        deployments:
        - hco-operator
        - hco-webhook
        - hyperconverged-cluster-cli-download
        - cluster-network-addons-operator
        - virt-operator
        - ssp-operator
        - cdi-operator
        - hostpath-provisioner-operator
        - mtq-operator
        serviceAccounts:
        - hyperconverged-cluster-operator
        - cluster-network-addons-operator
        - kubevirt-operator
        - ssp-operator
        - cdi-operator
        - hostpath-provisioner-operator
        - mtq-operator
        - cluster-network-addons-operator
        - kubevirt-operator
        - ssp-operator
        - cdi-operator
        - hostpath-provisioner-operator
        - mtq-operator
      status:
        conditions:
        - lastTransitionTime: "2023-12-18T09:41:06Z"
          message: ""
          observedGeneration: 11
          reason: Upgradeable
          status: "True"
          type: Upgradeable
      

      where metadata.generation (18) is not in sync with status.conditions[*].observedGeneration (11).
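The mismatch can be checked mechanically. Below is a minimal sketch, assuming the object was saved with `oc get operatorcondition <name> -o json`. The helper name `generation_lag` is hypothetical, and the sample only reproduces the fields shown above.

```python
import json

def generation_lag(operator_condition_json: str) -> dict:
    """Map each status condition type to metadata.generation - observedGeneration."""
    obj = json.loads(operator_condition_json)
    generation = obj["metadata"]["generation"]
    lag = {}
    for cond in obj.get("status", {}).get("conditions", []):
        # A missing observedGeneration is treated as 0 (assumption of this sketch).
        lag[cond["type"]] = generation - cond.get("observedGeneration", 0)
    return lag

# Trimmed-down copy of the object from this report.
sample = json.dumps({
    "metadata": {"name": "kubevirt-hyperconverged-operator.v4.14.1", "generation": 18},
    "status": {"conditions": [
        {"type": "Upgradeable", "status": "True", "observedGeneration": 11},
    ]},
})

print(generation_lag(sample))
```

Any non-zero value means the status condition was last observed at an older generation than the current spec.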

      Even manually editing spec.conditions[*].lastTransitionTime bumps metadata.generation (as expected), but it does not trigger any reconciliation in OLM, so status.conditions[*].observedGeneration remains at 11.

      $ oc get operatorcondition kubevirt-hyperconverged-operator.v4.14.1 -o yaml
      apiVersion: operators.coreos.com/v2
      kind: OperatorCondition
      metadata:
        creationTimestamp: "2023-12-16T17:10:17Z"
        generation: 19
        labels:
          operators.coreos.com/kubevirt-hyperconverged.openshift-cnv: ""
        name: kubevirt-hyperconverged-operator.v4.14.1
        namespace: openshift-cnv
        ownerReferences:
        - apiVersion: operators.coreos.com/v1alpha1
          blockOwnerDeletion: false
          controller: true
          kind: ClusterServiceVersion
          name: kubevirt-hyperconverged-operator.v4.14.1
          uid: 7db79d4b-e69e-4af8-9335-6269cf004440
        resourceVersion: "4147472"
        uid: 347306c9-865a-42b8-b2c9-69192b0e350a
      spec:
        conditions:
        - lastTransitionTime: "2023-12-18T18:47:25Z"
          message: ""
          reason: Upgradeable
          status: "True"
          type: Upgradeable
        deployments:
        - hco-operator
        - hco-webhook
        - hyperconverged-cluster-cli-download
        - cluster-network-addons-operator
        - virt-operator
        - ssp-operator
        - cdi-operator
        - hostpath-provisioner-operator
        - mtq-operator
        serviceAccounts:
        - hyperconverged-cluster-operator
        - cluster-network-addons-operator
        - kubevirt-operator
        - ssp-operator
        - cdi-operator
        - hostpath-provisioner-operator
        - mtq-operator
        - cluster-network-addons-operator
        - kubevirt-operator
        - ssp-operator
        - cdi-operator
        - hostpath-provisioner-operator
        - mtq-operator
      status:
        conditions:
        - lastTransitionTime: "2023-12-18T09:41:06Z"
          message: ""
          observedGeneration: 11
          reason: Upgradeable
          status: "True"
          type: Upgradeable
      

      Since observedGeneration lags behind metadata.generation, this check:
      https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/operators/olm/operatorconditions.go#L44C1-L48

      fails, and the upgrade never starts.
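For illustration, the effect of that check can be approximated as follows. This is a sketch in Python, not the OLM source (which is Go). The function name `is_upgradeable`, the handling of a missing condition, and the handling of `status: "False"` are assumptions; the "outdated" message format is copied from the CSV status above.

```python
from typing import Optional, Tuple

def is_upgradeable(generation: int, upgradeable: Optional[dict]) -> Tuple[bool, str]:
    """Approximate gate: an Upgradeable condition only counts if it is current."""
    if upgradeable is None:
        # No Upgradeable condition reported: nothing blocks the upgrade (assumption).
        return True, ""
    if upgradeable.get("observedGeneration", 0) < generation:
        # Status was computed for an older spec, so it cannot be trusted.
        return False, 'the operatorcondition status "%s"="%s" is outdated' % (
            upgradeable["type"], upgradeable["status"])
    if upgradeable["status"] == "False":
        # Message wording here is hypothetical.
        return False, "the operator reports Upgradeable=False: " + upgradeable.get("message", "")
    return True, ""

cond = {"type": "Upgradeable", "status": "True", "observedGeneration": 11}
print(is_upgradeable(18, cond))  # generation from this report: blocked
print(is_upgradeable(11, cond))  # generations in sync: allowed
```

Under these assumptions, a condition that says `"True"` still blocks the upgrade while its observedGeneration is behind, which matches the symptom reported on the v4.15.0 CSV.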

      I suspect (this is only a guess) that this could be a regression introduced by the memory optimization for OCPBUGS-17157 (https://issues.redhat.com/browse/OCPBUGS-17157).

      Version-Release number of selected component (if applicable):

          OCP 4.15.0-ec.3

      How reproducible:

      - Not reproducible (with the same CNV bundles) on OCP v4.14.z.
      - Pretty high (but not 100%) on OCP 4.15.0-ec.3
      

       

       

      Steps to Reproduce:

          1. Try triggering a CNV v4.14.1 -> v4.15.0 upgrade on OCP 4.15.0-ec.3
          

      Actual results:

          OLM does not react to changes to spec.conditions on the pending operator condition, so metadata.generation stays out of sync with status.conditions[*].observedGeneration and the CSV is reported as:
      
          message: 'operator is not upgradeable: the operatorcondition status "Upgradeable"="True"
            is outdated'
          phase: Pending
          reason: OperatorConditionNotUpgradeable
      
      

      Expected results:

          OLM correctly reconciles the operatorCondition and the upgrade starts.
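As a sketch of what a successful reconcile is expected to leave behind, assume that reconciliation mirrors `spec.conditions` into `status.conditions` and stamps them with the current `metadata.generation`. The function `reconcile_status` is hypothetical and written in Python for brevity; it is not OLM's implementation, and it ignores timestamp handling.

```python
import copy

def reconcile_status(obj: dict) -> dict:
    """Return a copy of obj whose status.conditions mirror spec.conditions
    and carry observedGeneration == metadata.generation."""
    out = copy.deepcopy(obj)
    generation = out["metadata"]["generation"]
    # Index existing status conditions by type so they are updated in place.
    by_type = {c["type"]: c for c in out.get("status", {}).get("conditions", [])}
    for spec_cond in out["spec"].get("conditions", []):
        synced = dict(spec_cond)
        synced["observedGeneration"] = generation
        by_type[spec_cond["type"]] = synced
    out.setdefault("status", {})["conditions"] = list(by_type.values())
    return out

# Trimmed-down copy of the stuck object after the manual edit (generation 19).
stuck = {
    "metadata": {"generation": 19},
    "spec": {"conditions": [
        {"type": "Upgradeable", "status": "True", "reason": "Upgradeable", "message": ""},
    ]},
    "status": {"conditions": [
        {"type": "Upgradeable", "status": "True", "reason": "Upgradeable",
         "message": "", "observedGeneration": 11},
    ]},
}

fixed = reconcile_status(stuck)
print(fixed["status"]["conditions"][0]["observedGeneration"])
```

With observedGeneration equal to metadata.generation, the "is outdated" condition on the CSV would no longer apply.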

      Additional info:

          Not reproducible with exactly the same bundle (origin and target) on OCP v4.14.z

            skuznets@redhat.com Steve Kuznetsov (Inactive)
            stirabos Simone Tiraboschi
            Jian Zhang Jian Zhang
            Simone Tiraboschi