  1. OpenShift Bugs
  2. OCPBUGS-33057

[CNF-IBU]: Stale backup CR not getting deleted during IBU Abort/Finalize operation


    • Type: Bug
    • Resolution: Not a Bug
    • Icon: Undefined Undefined
    • None
    • 4.16.0
    • LCA operator
    • None
    • Moderate
    • No
    • False

      Description of problem:

          A stale backup CR (acm-klusterlet) is not deleted during the IBU Abort/Finalize operation on the target SNO cluster.

      Version-Release number of selected component (if applicable):

      TALM bundle: 4.16.0-23
      Seed SNO Cluster: OCP 4.16.0-0.nightly-2024-04-16-195622
      Target SNO cluster before upgrade : OCP 4.14.22
      Target Hub cluster: OCP 4.16.0-0.nightly-2024-04-16-195622
      ACM: 2.10.2
      GitOps: 1.12.0
      LCA operator bundle container version: 4.16.0-36
      LCA Operator version: 4.16.0-31
      Recert image version: 4.16.0-8

      How reproducible:

      Observed twice; likely to occur frequently.

      Steps to Reproduce:

      [1] Make sure all policies report compliant on the target SNO cluster, except the site-specific config policy (which carries the destination [new labels] extra manifests).
      [2] Apply the IBU Prep CGU via TALM from the target hub cluster.
      [3] Make sure the IBU CR reports that the Prep stage completed successfully.
      [4] Import a stale backup CR (acm-klusterlet) whose cluster ID label differs from the actual target SNO cluster ID.
      [5] Apply the IBU finalize policy from the target hub cluster to set the IBU CR back to the Idle stage.
      [6] Observed: the stale backup CR (acm-klusterlet) was not removed from the openshift-adp namespace on the target SNO cluster, although the LCA operator logged "All Backup CRs have been deleted successfully":
      {{
      2024-04-26T19:06:59Z    INFO    controllers.ImageBasedUpgrade    Cleaning up DeleteBackupRequest and Backup CRs
      2024-04-26T19:07:00Z    INFO    controllers.ImageBasedUpgrade.BackupRestore    All DeleteBackupRequest CRs have been deleted
      2024-04-26T19:07:00Z    INFO    controllers.ImageBasedUpgrade.BackupRestore    All Backup CRs have been deleted successfully
      }}
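
The mismatch introduced in step [4] can be checked directly by comparing the backup's `config.openshift.io/clusterID` label with the cluster's own ID. The following is a minimal sketch, not LCA's own logic: the function name `backup_belongs_to_cluster` is hypothetical, and it assumes the label is the only marker tying a Backup CR to a cluster.

```shell
# Hypothetical helper, not part of LCA: reports whether a Backup CR's
# clusterID label matches the ID of the cluster it currently sits on.
# Prints "match", "stale", or "unknown" (when either value is empty).
backup_belongs_to_cluster() {
  local label_id="$1" cluster_id="$2"
  if [ -z "$label_id" ] || [ -z "$cluster_id" ]; then
    echo "unknown"
  elif [ "$label_id" = "$cluster_id" ]; then
    echo "match"
  else
    echo "stale"
  fi
}

# Example wiring on the target SNO cluster (requires oc and a kubeconfig):
#   label_id=$(oc -n openshift-adp get backups.velero.io acm-klusterlet \
#     -o jsonpath='{.metadata.labels.config\.openshift\.io/clusterID}')
#   cluster_id=$(oc get clusterversion version -o jsonpath='{.spec.clusterID}')
#   backup_belongs_to_cluster "$label_id" "$cluster_id"
```

With the two IDs recorded under Additional info (label `6b0b914f-bb0a-4f27-914c-e3dce9a67e91`, cluster `c5adae22-4627-494f-9e3e-b78b6a1c6b20`), the helper prints `stale`.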

      Actual results:

      [kni@registry.kni-qe-53 ~]$ oc get ibu
      NAME      AGE   DESIRED STAGE   STATE   DETAILS
      upgrade   64m   Idle            Idle    Idle
      [kni@registry.kni-qe-53 ~]$ oc -n openshift-adp get backups
      NAME             AGE
      acm-klusterlet   21m
      [kni@registry.kni-qe-53 ~]$ 
      

      Expected results:

      No backup CR exists in the openshift-adp namespace on the target SNO cluster after the IBU CR is set back to the Idle stage.
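
The expected result can be turned into a scriptable check. This is a sketch under stated assumptions: the helper names `count_backups` and `expect_no_backups` are hypothetical, and the input is assumed to be the one-name-per-line output of `oc -n openshift-adp get backups.velero.io -o name`, which prints nothing on stdout when no Backup CRs exist.

```shell
# Hypothetical helpers: read `oc ... get backups.velero.io -o name` output on stdin.

# Prints the number of non-empty lines, i.e. the number of Backup CRs listed.
count_backups() {
  grep -c . || true   # grep exits 1 on zero matches but still prints "0"
}

# Prints PASS when no Backup CR is listed, FAIL with the count otherwise.
expect_no_backups() {
  local n
  n=$(count_backups)
  if [ "$n" -eq 0 ]; then
    echo "PASS: no Backup CRs remain"
  else
    echo "FAIL: $n Backup CR(s) remain"
  fi
}

# Example wiring (requires oc and a kubeconfig):
#   oc -n openshift-adp get backups.velero.io -o name | expect_no_backups
```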

      Additional info:

      //SUT logs during the test //
      [kni@registry.kni-qe-53 ~]$ oc get ibu
      NAME      AGE   DESIRED STAGE   STATE       DETAILS
      upgrade   44m   Prep            Completed   Prep completed successfully
      [kni@registry.kni-qe-53 ~]$ oc -n openshift-adp get backups
      NAME             AGE
      acm-klusterlet   104s
      [kni@registry.kni-qe-53 ~]$ oc -n openshift-adp get backups -o yaml | grep -i clusterid
            config.openshift.io/clusterID: 6b0b914f-bb0a-4f27-914c-e3dce9a67e91
      [kni@registry.kni-qe-53 ~]$ 
      [kni@registry.kni-qe-53 ~]$ oc get no
      NAME                              STATUS   ROLES                         AGE     VERSION
      helix57.lab.eng.rdu2.redhat.com   Ready    control-plane,master,worker   4h48m   v1.27.12+7bee54d
      [kni@registry.kni-qe-53 ~]$ oc get clusterversions.config.openshift.io -o yaml | grep -i clusterid
          clusterID: c5adae22-4627-494f-9e3e-b78b6a1c6b20
      [kni@registry.kni-qe-53 ~]$ oc get no
      NAME                              STATUS   ROLES                         AGE     VERSION
      helix57.lab.eng.rdu2.redhat.com   Ready    control-plane,master,worker   4h49m   v1.27.12+7bee54d
      [kni@registry.kni-qe-53 ~]$ oc -n openshift-adp get backups
      NAME             AGE
      acm-klusterlet   3m17s
      [kni@registry.kni-qe-53 ~]$ oc -n openshift-adp get backups -o yaml
      apiVersion: v1
      items:
      - apiVersion: velero.io/v1
        kind: Backup
        metadata:
          annotations:
            lca.openshift.io/apply-label: rbac.authorization.k8s.io/v1/clusterroles/klusterlet,apps/v1/deployments/open-cluster-management-agent/klusterlet,v1/secrets/open-cluster-management-agent/bootstrap-hub-kubeconfig,rbac.authorization.k8s.io/v1/clusterroles/klusterlet,v1/serviceaccounts/open-cluster-management-agent/klusterlet,rbac.authorization.k8s.io/v1/clusterroles/open-cluster-management:klusterlet-admin-aggregate-clusterrole,rbac.authorization.k8s.io/v1/clusterrolebindings/klusterlet,operator.open-cluster-management.io/v1/klusterlets/klusterlet,apiextensions.k8s.io/v1/customresourcedefinitions/klusterlets.operator.open-cluster-management.io,v1/secrets/open-cluster-management-agent/open-cluster-management-image-pull-credentials
            velero.io/resource-timeout: 10m0s
            velero.io/source-cluster-k8s-gitversion: v1.27.11+ec42b99
            velero.io/source-cluster-k8s-major-version: "1"
            velero.io/source-cluster-k8s-minor-version: "27"
          creationTimestamp: "2024-04-26T19:04:06Z"
          generation: 1
          labels:
            config.openshift.io/clusterID: 6b0b914f-bb0a-4f27-914c-e3dce9a67e91
            velero.io/storage-location: dataprotectionapplication-1
          name: acm-klusterlet
          namespace: openshift-adp
          resourceVersion: "67188"
          uid: 47c4c5d2-30f1-45b4-94b9-58ec98bba2cb
        spec:
          csiSnapshotTimeout: 10m0s
          defaultVolumesToFsBackup: false
          hooks: {}
          includedClusterScopedResources:
          - klusterlets.operator.open-cluster-management.io
          - klusterlet
          - clusterrole
          - clusterrolebinding
          includedNamespaceScopedResources:
          - deployments
          - serviceaccounts
          - secrets
          includedNamespaces:
          - open-cluster-management-agent
          itemOperationTimeout: 4h0m0s
          labelSelector:
            matchLabels:
              lca.openshift.io/backup: acm-klusterlet
          metadata: {}
          snapshotMoveData: false
          storageLocation: dataprotectionapplication-1
          ttl: 720h0m0s
        status:
          completionTimestamp: "2024-04-21T20:49:41Z"
          expiration: "2024-05-21T20:49:40Z"
          formatVersion: 1.1.0
          phase: Completed
          progress:
            itemsBackedUp: 10
            totalItems: 10
          startTimestamp: "2024-04-21T20:49:41Z"
          version: 1
      kind: List
      metadata:
        resourceVersion: ""
      [kni@registry.kni-qe-53 ~]$
      [kni@registry.kni-qe-53 ~]$ oc get ibu
      NAME      AGE   DESIRED STAGE   STATE   DETAILS
      upgrade   64m   Idle            Idle    Idle
      [kni@registry.kni-qe-53 ~]$ oc -n openshift-adp get backups
      NAME             AGE
      acm-klusterlet   21m
      [kni@registry.kni-qe-53 ~]$ 
      
      ==============Complete LCA Operator logs during Abort/Finalize Operation =========================
      2024-04-26T19:06:56Z    INFO    controllers.ImageBasedUpgrade    Start reconciling IBU    {"name": {"name":"upgrade"}}
      2024-04-26T19:06:56Z    INFO    controllers.ImageBasedUpgrade    Loaded IBU    {"name": {"name":"upgrade"}, "version": "67679", "desired stage": "Idle"}
      INFO[2737] Executing /usr/bin/env with args [-- rpm-ostree status --json] 
      2024-04-26T19:06:56Z    INFO    controllers.ImageBasedUpgrade    Starting handleAbort
      2024-04-26T19:06:56Z    INFO    controllers.ImageBasedUpgrade    Terminating precaching worker thread, will wait up to 30 seconds
      2024-04-26T19:06:56Z    INFO    controllers.ImageBasedUpgrade    Cleaning up stateroot
      INFO[2737] Executing /usr/bin/env with args [-- rpm-ostree status --json] 
      INFO[2738] Executing /usr/bin/env with args [-- rpm-ostree status --json] 
      INFO[2738] Executing /usr/bin/env with args [-- ostree admin undeploy 1] 
      INFO[2741] Executing /usr/bin/env with args [-- bash -c unshare -m /bin/sh -c "mount -o remount,rw /sysroot && rm -rf /ostree/deploy/rhcos_4.16.0_0.nightly_2024_04_16_195622"] 
      2024-04-26T19:06:59Z    INFO    controllers.ImageBasedUpgrade    Cleaning up precache
      2024-04-26T19:06:59Z    INFO    controllers.ImageBasedUpgrade    Cleaning up DeleteBackupRequest and Backup CRs
      2024-04-26T19:07:00Z    INFO    controllers.ImageBasedUpgrade.BackupRestore    All DeleteBackupRequest CRs have been deleted
      2024-04-26T19:07:00Z    INFO    controllers.ImageBasedUpgrade.BackupRestore    All Backup CRs have been deleted successfully
      2024-04-26T19:07:00Z    INFO    controllers.ImageBasedUpgrade    Cleaning up IBU files
      2024-04-26T19:07:00Z    INFO    controllers.ImageBasedUpgrade    Finished handleAbort successfully
      2024-04-26T19:07:00Z    INFO    controllers.ImageBasedUpgrade    Finish reconciling IBU    {"name": {"name":"upgrade"}, "requeueRightAway": false}
      2024-04-26T19:07:00Z    INFO    controllers.ImageBasedUpgrade    Start reconciling IBU    {"name": {"name":"upgrade"}}
      2024-04-26T19:07:00Z    INFO    controllers.ImageBasedUpgrade    Loaded IBU    {"name": {"name":"upgrade"}, "version": "67714", "desired stage": "Idle"}
      INFO[2741] Executing /usr/bin/env with args [-- rpm-ostree status --json] 
      2024-04-26T19:07:00Z    INFO    controllers.ImageBasedUpgrade    Finish reconciling IBU    {"name": {"name":"upgrade"}, "requeueRightAway": false}
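
Because the operator logs a success message even though a Backup CR remains, the log line alone should not be taken as proof of cleanup. A small sketch for flagging the message in a saved log follows; the function name `log_claims_backups_deleted` is hypothetical, and the log text is assumed to arrive on stdin.

```shell
# Hypothetical helper: reads LCA operator log text on stdin and prints
# "yes" if it contains the Backup cleanup success message, else "no".
log_claims_backups_deleted() {
  if grep -q 'All Backup CRs have been deleted successfully'; then
    echo "yes"
  else
    echo "no"
  fi
}
```

A "yes" here is worth cross-checking against the actual contents of the openshift-adp namespace, for example with `oc -n openshift-adp get backups`.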
       

       

            jche@redhat.com Jun Chen
            rh-ee-pmohanra Periyamaruthu Mohanraj
            Periyamaruthu Mohanraj Periyamaruthu Mohanraj