OpenShift Bugs / OCPBUGS-77829

[Boxcutter] ClusterExtension stuck on failed version upgrade, cannot upgrade to different version or rollback even after deleting ClusterExtensionRevisions

    • Bug
    • Resolution: Unresolved
    • Major
    • 4.22
    • OLM
    • False
    • Important
    • Rejected

      Description of problem:

      After a failed upgrade attempt, the ClusterExtension remains stuck trying to install the failed version. It cannot be redirected to a different version or rolled back, even after manually deleting its ClusterExtensionRevisions.

      Version-Release number of selected component (if applicable):

          xzha@xzha1-mac olmv1test % oc get clusterversion
      NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.22.0-0.nightly-2026-03-02-153725   True        False         42m     Cluster version is 4.22.0-0.nightly-2026-03-02-153725

      How reproducible:

      Always

      Steps to Reproduce:

      1. Create the ClusterCatalog and the ServiceAccount
      xzha@xzha1-mac ocp-88138 % cat catalog.yaml 
      apiVersion: olm.operatorframework.io/v1
      kind: ClusterCatalog
      metadata:
        name: catalog-88138
        labels:
          example.com/support: "true"
          test-cases: ocp-88138
      spec:
        priority: 1000
        source:
          type: Image
          image:
            ref: quay.io/olmqe/nginxolm-operator-index:nginxolm88138
      xzha@xzha1-mac ocp-88138 % oc apply -f catalog.yaml 
      clustercatalog.olm.operatorframework.io/catalog-88138 created
      xzha@xzha1-mac ocp-88138 % opm alpha list bundles quay.io/olmqe/nginxolm-operator-index:nginxolm88138
      PACKAGE     CHANNEL         BUNDLE             REPLACES           SKIPS              SKIP RANGE  IMAGE
      nginx88138  candidate-v1.0  nginx88138.v1.0.1                                                    quay.io/olmqe/nginxolm-operator-bundle:v1.0.1-nginxolm88138
      nginx88138  candidate-v1.0  nginx88138.v1.0.2  nginx88138.v1.0.1                                 quay.io/olmqe/nginxolm-operator-bundle:v1.0.2-nginx88138
      nginx88138  candidate-v1.0  nginx88138.v1.0.3  nginx88138.v1.0.2  nginx88138.v1.0.1              quay.io/olmqe/nginxolm-operator-bundle:v1.0.3-nginxolm88138
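
The REPLACES and SKIPS columns above define which upgrades the catalog allows. The following is a minimal sketch of how those edges can be read, assuming that with upgradeConstraintPolicy "CatalogProvided" a bundle is a valid successor when it names the installed bundle in REPLACES or SKIPS. The `Bundle` class and the `successors` helper are hypothetical names used for illustration only, not OLM code, and SKIP RANGE is ignored because it is empty in this catalog.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Bundle:
    """One row of the `opm alpha list bundles` output (illustrative type)."""
    name: str
    replaces: Optional[str] = None
    skips: Tuple[str, ...] = ()


# Rows copied from the catalog listing above.
BUNDLES: List[Bundle] = [
    Bundle("nginx88138.v1.0.1"),
    Bundle("nginx88138.v1.0.2", replaces="nginx88138.v1.0.1"),
    Bundle(
        "nginx88138.v1.0.3",
        replaces="nginx88138.v1.0.2",
        skips=("nginx88138.v1.0.1",),
    ),
]


def successors(installed: str, bundles: List[Bundle]) -> List[str]:
    """Return the bundles that list `installed` in REPLACES or SKIPS."""
    return sorted(
        b.name
        for b in bundles
        if b.replaces == installed or installed in b.skips
    )


if __name__ == "__main__":
    for bundle in BUNDLES:
        print(bundle.name, "->", successors(bundle.name, BUNDLES))
```

Under this reading, nginx88138.v1.0.1 can move to either v1.0.2 or v1.0.3, so the catalog itself offers a direct path to 1.0.3 whenever v1.0.1 is still treated as the installed bundle.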
      
      xzha@xzha1-mac ocp-88138 % cat sa.yaml 
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      metadata:
        name: "test-88138-installer-admin-clusterrole"
      rules:
        - apiGroups:
          - "*"
          resources:
          - "*"
          verbs:
          - "*"
      ---
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: "test-88138"
        namespace: "ns-88138"
      ---
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata:
        name: "test-88138-installer-admin-clusterrole-binding"
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: ClusterRole
        name: "test-88138-installer-admin-clusterrole"
      subjects:
        - kind: ServiceAccount
          name: "test-88138"
          namespace: "ns-88138"
      
      2. Create the ClusterExtension
      
      xzha@xzha1-mac ocp-88138 % cat extension.yaml 
      apiVersion: olm.operatorframework.io/v1
      kind: ClusterExtension
      metadata:
        name: "extension-88138"
      spec:
        namespace: "ns-88138"
        serviceAccount:
          name: "test-88138"
        source:
          sourceType: "Catalog"
          catalog:
            packageName: "nginx88138"
            channel: candidate-v1.0
            version: "1.0.1"
            upgradeConstraintPolicy: "CatalogProvided"
      
      xzha@xzha1-mac ocp-88138 % oc get clusterextension extension-88138
      NAME              INSTALLED BUNDLE    VERSION   INSTALLED   PROGRESSING   AGE
      extension-88138   nginx88138.v1.0.1   1.0.1     True        True          49s
      
      xzha@xzha1-mac ocp-88138 % oc get clusterextensionrevision
      NAME                AVAILABLE   PROGRESSING   AGE
      extension-88138-1   True        True          57s
      
      3. Upgrade to 1.0.2
      
      xzha@xzha1-mac ocp-88138 % oc patch ClusterExtension extension-88138 -p '{"spec":{"source":{"catalog":{"version":"1.0.2"}}}}' --type=merge
      clusterextension.olm.operatorframework.io/extension-88138 patched
      
      The upgrade fails:
      
      status:
        activeRevisions:
        - name: extension-88138-1
        - conditions:
          - lastTransitionTime: "2026-03-05T03:02:48Z"
            message: 'Object Deployment.apps/v1 ns-88138/nginx88138-controller-manager:
              ".status.updatedReplicas" != ".status.replicas" expected: 1 got: 2'
            observedGeneration: 2
            reason: ProbeFailure
            status: "False"
            type: Available
          - lastTransitionTime: "2026-03-05T03:02:48Z"
            message: Revision 1.0.2 is rolling out.
            observedGeneration: 2
            reason: RollingOut
            status: "True"
            type: Progressing
          name: extension-88138-2
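
When triaging, it can help to pick out revisions in this state automatically. The sketch below assumes the status shown above has been loaded as a Python dict (for example from `oc get clusterextension extension-88138 -o json`) and flags every active revision whose Available condition is "False" while Progressing is "True". The helper names are hypothetical, and the sample data is a trimmed copy of the status above.

```python
from typing import Any, Dict, List, Optional

# Trimmed copy of .status.activeRevisions from the output above.
ACTIVE_REVISIONS: List[Dict[str, Any]] = [
    {"name": "extension-88138-1"},
    {
        "name": "extension-88138-2",
        "conditions": [
            {"type": "Available", "status": "False", "reason": "ProbeFailure"},
            {"type": "Progressing", "status": "True", "reason": "RollingOut"},
        ],
    },
]


def condition(revision: Dict[str, Any], cond_type: str) -> Optional[Dict[str, Any]]:
    """Return the condition of the given type, or None if it is absent."""
    for cond in revision.get("conditions", []):
        if cond.get("type") == cond_type:
            return cond
    return None


def rolling_out_unavailable(revision: Dict[str, Any]) -> bool:
    """True when the revision reports Available=False and Progressing=True."""
    available = condition(revision, "Available")
    progressing = condition(revision, "Progressing")
    return (
        available is not None
        and available.get("status") == "False"
        and progressing is not None
        and progressing.get("status") == "True"
    )


def blocked_revisions(revisions: List[Dict[str, Any]]) -> List[str]:
    """Names of the revisions that are rolling out but not available."""
    return [r["name"] for r in revisions if rolling_out_unavailable(r)]


if __name__ == "__main__":
    print(blocked_revisions(ACTIVE_REVISIONS))
```

Revisions that report no conditions, such as extension-88138-1 in this status, are not flagged.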
      
      xzha@xzha1-mac ocp-88138 % oc get clusterextensionrevision
      NAME                AVAILABLE   PROGRESSING   AGE
      extension-88138-1   False       True          115s
      extension-88138-2   False       True          13s
      
      4. Try to upgrade to 1.0.3
      
      xzha@xzha1-mac ocp-88138 % oc patch ClusterExtension extension-88138 -p '{"spec":{"source":{"catalog":{"version":"1.0.3"}}}}' --type=merge
      clusterextension.olm.operatorframework.io/extension-88138 patched
      
      The ClusterExtension still tries to install 1.0.2, even after deleting ClusterExtensionRevision extension-88138-2.
      
      
      5. Try to roll back to 1.0.1
      xzha@xzha1-mac ocp-88138 % oc patch ClusterExtension extension-88138 -p '{"spec":{"source":{"catalog":{"version":"1.0.1"}}}}' --type=merge
      clusterextension.olm.operatorframework.io/extension-88138 patched
      
      The ClusterExtension still tries to install 1.0.2, even after deleting all ClusterExtensionRevisions:
      
      xzha@xzha1-mac ocp-88138 % oc delete clusterextensionrevision extension-88138-1 
      xzha@xzha1-mac ocp-88138 % oc delete clusterextensionrevision extension-88138-2
      
      
      xzha@xzha1-mac ocp-88138 % oc get clusterextensionrevision                     
      NAME                AVAILABLE   PROGRESSING   AGE
      extension-88138-1   False       True          6s
      xzha@xzha1-mac ocp-88138 % oc get clusterextensionrevision  extension-88138-1 -o yaml
      apiVersion: olm.operatorframework.io/v1
      kind: ClusterExtensionRevision
      metadata:
        annotations:
          olm.operatorframework.io/bundle-name: nginx88138.v1.0.2
          olm.operatorframework.io/bundle-reference: quay.io/olmqe/nginxolm-operator-bundle:v1.0.2-nginx88138
          olm.operatorframework.io/bundle-version: 1.0.2
          olm.operatorframework.io/package-name: nginx88138
          olm.operatorframework.io/service-account-name: test-88138
          olm.operatorframework.io/service-account-namespace: ns-88138
        creationTimestamp: "2026-03-05T03:05:23Z"
        finalizers:
        - olm.operatorframework.io/teardown
        generation: 1
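
The mismatch in step 5 can be checked mechanically: the ClusterExtension spec asks for 1.0.1, while the recreated revision carries the bundle-version annotation 1.0.2. The sketch below assumes both objects have been loaded as Python dicts (for example with `oc get ... -o json`). The helper names are hypothetical, the sample data is trimmed from the output above, and the comparison is an exact string match, so it would not handle a version range in the spec.

```python
from typing import Any, Dict, List, Optional, Tuple

BUNDLE_VERSION_ANNOTATION = "olm.operatorframework.io/bundle-version"

# Trimmed ClusterExtension as it looks after the patch in step 5.
EXTENSION: Dict[str, Any] = {
    "metadata": {"name": "extension-88138"},
    "spec": {"source": {"catalog": {"packageName": "nginx88138", "version": "1.0.1"}}},
}

# Trimmed copy of the recreated ClusterExtensionRevision shown above.
REVISION: Dict[str, Any] = {
    "metadata": {
        "name": "extension-88138-1",
        "annotations": {BUNDLE_VERSION_ANNOTATION: "1.0.2"},
    },
}


def requested_version(cluster_extension: Dict[str, Any]) -> Optional[str]:
    """Version requested in .spec.source.catalog.version, if any."""
    catalog = cluster_extension.get("spec", {}).get("source", {}).get("catalog", {})
    return catalog.get("version")


def revision_version(revision: Dict[str, Any]) -> Optional[str]:
    """Bundle version recorded in the revision's annotations, if any."""
    annotations = revision.get("metadata", {}).get("annotations", {})
    return annotations.get(BUNDLE_VERSION_ANNOTATION)


def mismatched_revisions(
    cluster_extension: Dict[str, Any], revisions: List[Dict[str, Any]]
) -> List[Tuple[str, Optional[str]]]:
    """Revisions whose bundle version differs from the requested version."""
    wanted = requested_version(cluster_extension)
    return [
        (rev["metadata"]["name"], revision_version(rev))
        for rev in revisions
        if revision_version(rev) != wanted
    ]


if __name__ == "__main__":
    print(mismatched_revisions(EXTENSION, [REVISION]))
```

With the data from this report, the check reports extension-88138-1 at version 1.0.2 against a requested 1.0.1, which matches the stuck behaviour described in steps 4 and 5.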
      
      
           

      Actual results:

          The ClusterExtension remains stuck trying to install the failed version 1.0.2.

      Expected results:

          The ClusterExtension can be upgraded to another version or rolled back.

      Additional info:

          

              rh-ee-cchantse Catherine Chan-Tse
              rhn-support-xzha Xia Zhao
              Xia Zhao Xia Zhao