  1. OpenShift Bugs
  2. OCPBUGS-12849

volumesnapshotcontents wrongly removed when cinder fails to remove the snapshot



      Description of problem:

      As expected, Cinder refuses to delete a snapshot while volumes still depend on it. From cinder-volumes.log:

      2023-04-27 11:17:19.249 56 ERROR cinder.volume.manager [req-def615eb-f53c-4171-822a-61ff03f97e9c 4737331929164ac4a09bdb0a7c493f44 f4e3bb18afb640e4b7c9e5307491e740 - default default] Delete snapshot failed, due to snapshot busy.: cinder.exception.SnapshotIsBusy: deleting snapshot snapshot-760cc840-b1d0-4230-aeea-2d93329ba032 that has dependent volumes

      However, when the volumesnapshot object is deleted in OCP, the volumesnapshotcontent is removed as well, even though the Cinder deletion failed. This leaves the resources in OCP and OSP inconsistent.

      As a result of that inconsistency, volumes that have snapshots whose volumesnapshotcontent object is missing from OCP cannot be deleted, so the PVs based on the orphaned snapshot are stuck and cannot be removed from OCP or OSP.

      Version-Release number of selected component (if applicable):

      4.13.0-0.nightly-2023-04-21-084440
      RHOS-16.2-RHEL-8-20221201.n.1

      How reproducible: Always.

      Steps to Reproduce:

      Manual reproduction:

      1. Load test-1.yaml and check that the pod, pvc and pv are created.
      2. Scale down the deployment to 0.
      3. Load test-2.yaml and check that the pod, pvc, vs and vsc are successfully created.
      4. Remove the snapshot: oc delete vs/snapshot-pvc-0
      5. Check that the vsc no longer exists in OCP while the snapshot still exists in OpenStack:
      $ oc get vs
      NAME             READYTOUSE   SOURCEPVC   SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS         SNAPSHOTCONTENT                                    CREATIONTIME   AGE                  
      snapshot-pvc-0   true         pvc-0                               1Gi           snapshot-class-demo   snapcontent-417a3943-5171-4e8f-8363-1f40ca2a54c0   2m26s          11m                  
      $ oc get vsc
      NAME                                               READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                     VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT   VOLUMESNAPSHOTNAMESPACE   AGE
      snapcontent-417a3943-5171-4e8f-8363-1f40ca2a54c0   true         1073741824    Delete           cinder.csi.openstack.org   snapshot-class-demo   snapshot-pvc-0   test                      11m
      $ oc delete vs/snapshot-pvc-0
      volumesnapshot.snapshot.storage.k8s.io "snapshot-pvc-0" deleted
      $ oc get vsc
      No resources found
      $ openstack volume snapshot list
      +--------------------------------------+-----------------------------------------------+----------------------------------------+-----------+------+                                         
      | ID                                   | Name                                          | Description                            | Status    | Size |                                         
      +--------------------------------------+-----------------------------------------------+----------------------------------------+-----------+------+                                         
      | 760cc840-b1d0-4230-aeea-2d93329ba032 | snapshot-417a3943-5171-4e8f-8363-1f40ca2a54c0 | Created by OpenStack Cinder CSI driver | available |    1 |                                         
      +--------------------------------------+-----------------------------------------------+----------------------------------------+-----------+------+                      
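The mismatch shown in the output above can be detected mechanically. The following is a minimal Python sketch, not part of the original report. It assumes the naming pattern visible in this output, where the Cinder snapshot `snapshot-<uuid>` corresponds to the volumesnapshotcontent `snapcontent-<uuid>`; snapshots created any other way may not follow it. The function name `find_orphaned_snapshots` is hypothetical, and the inputs are plain name lists that would have to be collected from `oc get vsc` and `openstack volume snapshot list`.

```python
from typing import Iterable, List

# Prefixes as they appear in this report's output (an assumption, not a documented contract).
SNAPSHOT_PREFIX = "snapshot-"      # Cinder snapshot name prefix
CONTENT_PREFIX = "snapcontent-"    # volumesnapshotcontent name prefix


def find_orphaned_snapshots(cinder_names: Iterable[str],
                            vsc_names: Iterable[str]) -> List[str]:
    """Return Cinder snapshot names that have no matching volumesnapshotcontent."""
    known_ids = {
        name[len(CONTENT_PREFIX):]
        for name in vsc_names
        if name.startswith(CONTENT_PREFIX)
    }
    orphans = []
    for name in cinder_names:
        if not name.startswith(SNAPSHOT_PREFIX):
            continue  # does not follow the assumed naming pattern; skip it
        if name[len(SNAPSHOT_PREFIX):] not in known_ids:
            orphans.append(name)
    return orphans


# Names taken from the output in this report.
cinder = ["snapshot-417a3943-5171-4e8f-8363-1f40ca2a54c0"]
before_delete = find_orphaned_snapshots(
    cinder, ["snapcontent-417a3943-5171-4e8f-8363-1f40ca2a54c0"]
)
after_delete = find_orphaned_snapshots(cinder, [])  # "No resources found"
print(before_delete)
print(after_delete)
```

With the names from this report, the snapshot is reported as orphaned only once the volumesnapshotcontent has disappeared from OCP.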

      Attaching must-gather link in a private comment.

      This issue causes the following 4 test cases from the cindercsi test suite in origin to fail:

      • "External Storage [Driver: cinder.csi.openstack.org] [Testpattern: Dynamic Snapshot (delete policy)] snapshottable[Feature:VolumeSnapshotDataSource] volume snapshot controller should check snapshot fields, check restore correctly works after modifying source data, check deletion (persistent)"
      • "External Storage [Driver: cinder.csi.openstack.org] [Testpattern: Dynamic Snapshot (retain policy)] snapshottable[Feature:VolumeSnapshotDataSource] volume snapshot controller should check snapshot fields, check restore correctly works after modifying source data, check deletion (persistent)"
      • "External Storage [Driver: cinder.csi.openstack.org] [Testpattern: Pre-provisioned Snapshot (delete policy)] snapshottable[Feature:VolumeSnapshotDataSource] volume snapshot controller should check snapshot fields, check restore correctly works after modifying source data, check deletion (persistent)"
      • "External Storage [Driver: cinder.csi.openstack.org] [Testpattern: Pre-provisioned Snapshot (retain policy)] snapshottable[Feature:VolumeSnapshotDataSource] volume snapshot controller should check snapshot fields, check restore correctly works after modifying source data, check deletion (persistent)"

      During resource teardown, the PVCs created from orphaned snapshots remain stuck in the system.

       

      Actual results:

      The volumesnapshotcontent no longer exists in OCP, but the snapshot remains in OpenStack.
      The cindercsi test cases from the origin repo listed above fail.

      Expected results:

      The volumesnapshotcontent is kept in OCP until the snapshot has been successfully removed from OpenStack.
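The expected behaviour can be illustrated with a small simulation. This is a sketch under stated assumptions, not the actual snapshot controller or Cinder CSI driver code: `SnapshotBusy`, `reconcile_delete` and `make_backend` are hypothetical names, and the set `contents` stands in for the volumesnapshotcontent objects stored in OCP.

```python
class SnapshotBusy(Exception):
    """Stand-in for the backend refusing to delete a snapshot with dependent volumes."""


def reconcile_delete(content_name, contents, backend_delete):
    """Remove content_name from contents only after backend_delete succeeds.

    Returns True when the object was removed, False when it was kept for a retry.
    """
    try:
        backend_delete(content_name)
    except SnapshotBusy:
        return False  # keep the object so a later retry can still find it
    contents.discard(content_name)
    return True


def make_backend(busy):
    """Build a fake backend that rejects deletion of any name listed in busy."""
    def delete(name):
        if name in busy:
            raise SnapshotBusy(name)
    return delete


name = "snapcontent-417a3943-5171-4e8f-8363-1f40ca2a54c0"
contents = {name}
busy = {name}  # the snapshot still has dependent volumes

first_attempt = reconcile_delete(name, contents, make_backend(busy))
kept_after_failure = name in contents

busy.clear()  # dependent volumes are gone, so deletion can now succeed
second_attempt = reconcile_delete(name, contents, make_backend(busy))
print(first_attempt, kept_after_failure, second_attempt, contents)
```

The point of the sketch is the ordering: the OCP-side object is dropped only after the backend confirms the deletion, so a failed attempt leaves something to retry against.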

      Additional info:

      must-gather attached in a private comment.

            emacchi@redhat.com Emilien Macchi
            rlobillo Ramón Lobillo
            Itshak Brown Itshak Brown