-
Bug
-
Resolution: Won't Do
-
Normal
-
None
-
4.13, 4.12.z, 4.11.z, 4.10.z
-
+
-
Important
-
No
-
Rejected
-
False
-
Description of problem:
As expected, Cinder does not allow a snapshot to be removed while volumes depend on it. From cinder-volumes.log:
2023-04-27 11:17:19.249 56 ERROR cinder.volume.manager [req-def615eb-f53c-4171-822a-61ff03f97e9c 4737331929164ac4a09bdb0a7c493f44 f4e3bb18afb640e4b7c9e5307491e740 - default default] Delete snapshot failed, due to snapshot busy.: cinder.exception.SnapshotIsBusy: deleting snapshot snapshot-760cc840-b1d0-4230-aeea-2d93329ba032 that has dependent volumes
However, when the volumesnapshot object is deleted in OCP, the volumesnapshotcontent is removed as well, even though the snapshot still exists in OSP. This leaves the resources in OCP and OSP inconsistent.
One consequence is that volumes created from a snapshot whose volumesnapshotcontent object no longer exists in OCP cannot be deleted, so the PVs based on the orphaned snapshot are stuck and cannot be removed from OCP/OSP.
Version-Release number of selected component (if applicable):
4.13.0-0.nightly-2023-04-21-084440 RHOS-16.2-RHEL-8-20221201.n.1
How reproducible: Always.
Steps to Reproduce:
Manual reproduction:
- load test-1.yaml and check that pod, pvc and pv are created.
- scale down the deployment to 0
- load test-2.yaml and check that the pod, pvc, vs and vsc are successfully created.
- Remove the snapshot: oc delete vs/snapshot-pvc-0
- Check that the vsc no longer exists in OCP but the snapshot still exists in openstack:
$ oc get vs
NAME             READYTOUSE   SOURCEPVC   SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS         SNAPSHOTCONTENT                                    CREATIONTIME   AGE
snapshot-pvc-0   true         pvc-0                               1Gi           snapshot-class-demo   snapcontent-417a3943-5171-4e8f-8363-1f40ca2a54c0   2m26s          11m

$ oc get vsc
NAME                                               READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                     VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT   VOLUMESNAPSHOTNAMESPACE   AGE
snapcontent-417a3943-5171-4e8f-8363-1f40ca2a54c0   true         1073741824    Delete           cinder.csi.openstack.org   snapshot-class-demo   snapshot-pvc-0   test                      11m

$ oc delete vs/snapshot-pvc-0
volumesnapshot.snapshot.storage.k8s.io "snapshot-pvc-0" deleted

$ oc get vsc
No resources found

$ openstack volume snapshot list
+--------------------------------------+-----------------------------------------------+----------------------------------------+-----------+------+
| ID                                   | Name                                          | Description                            | Status    | Size |
+--------------------------------------+-----------------------------------------------+----------------------------------------+-----------+------+
| 760cc840-b1d0-4230-aeea-2d93329ba032 | snapshot-417a3943-5171-4e8f-8363-1f40ca2a54c0 | Created by OpenStack Cinder CSI driver | available |    1 |
+--------------------------------------+-----------------------------------------------+----------------------------------------+-----------+------+
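The orphan check above can be scripted. The following is a minimal sketch, not part of the driver or of any OCP tooling. It assumes, based only on the output above, that a volumesnapshotcontent named snapcontent-<uuid> corresponds to a Cinder snapshot named snapshot-<uuid>. The function name find_orphaned_snapshots is hypothetical, and the input lists are expected to be collected separately (for example from `oc get vsc` and `openstack volume snapshot list`).

```python
import re

# UUID shape as it appears in the names shown in the transcript above.
UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
VSC_RE = re.compile(r"snapcontent-(" + UUID + r")")
SNAP_RE = re.compile(r"snapshot-(" + UUID + r")")


def find_orphaned_snapshots(vsc_names, cinder_snapshot_names):
    """Return Cinder snapshot names that have no matching volumesnapshotcontent.

    Assumption (inferred from the transcript, not from driver documentation):
    'snapcontent-<uuid>' in OCP maps to 'snapshot-<uuid>' in Cinder.
    """
    known = set()
    for name in vsc_names:
        match = VSC_RE.fullmatch(name)
        if match:
            known.add(match.group(1))

    orphans = []
    for name in cinder_snapshot_names:
        match = SNAP_RE.fullmatch(name)
        # Snapshots that do not follow the CSI naming pattern are ignored.
        if match and match.group(1) not in known:
            orphans.append(name)
    return orphans
```

With the state shown above (no vsc left, one snapshot still in openstack), the function would report snapshot-417a3943-5171-4e8f-8363-1f40ca2a54c0 as orphaned.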
Attaching must-gather link in private comment.
This issue causes the following 4 test cases from the cindercsi test suite in origin to fail:
- "External Storage [Driver: cinder.csi.openstack.org] [Testpattern: Dynamic Snapshot (delete policy)] snapshottable[Feature:VolumeSnapshotDataSource] volume snapshot controller should check snapshot fields, check restore correctly works after modifying source data, check deletion (persistent)"
- "External Storage [Driver: cinder.csi.openstack.org] [Testpattern: Dynamic Snapshot (retain policy)] snapshottable[Feature:VolumeSnapshotDataSource] volume snapshot controller should check snapshot fields, check restore correctly works after modifying source data, check deletion (persistent)"
- "External Storage [Driver: cinder.csi.openstack.org] [Testpattern: Pre-provisioned Snapshot (delete policy)] snapshottable[Feature:VolumeSnapshotDataSource] volume snapshot controller should check snapshot fields, check restore correctly works after modifying source data, check deletion (persistent)"
- "External Storage [Driver: cinder.csi.openstack.org] [Testpattern: Pre-provisioned Snapshot (retain policy)] snapshottable[Feature:VolumeSnapshotDataSource] volume snapshot controller should check snapshot fields, check restore correctly works after modifying source data, check deletion (persistent)"
During resource teardown, the PVCs created from orphaned snapshots remain stuck in the system.
Actual results:
The volumesnapshotcontent no longer exists in OCP, but the snapshot remains in openstack. The cindercsi test cases from the origin repo fail.
Expected results:
The volumesnapshotcontent is kept in OCP until the snapshot has been successfully removed from openstack.
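The expected behaviour can be illustrated with a small model. This is a sketch under stated assumptions, not the snapshot controller's or the Cinder CSI driver's actual code: FakeCinder, reconcile_delete and the local SnapshotIsBusy class are hypothetical stand-ins, and the dictionary of contents stands in for the volumesnapshotcontent objects in OCP.

```python
class SnapshotIsBusy(Exception):
    """Local stand-in for the 'snapshot busy' error seen in the Cinder log."""


class FakeCinder:
    """Hypothetical in-memory backend: snapshot id -> set of dependent volumes."""

    def __init__(self):
        self.snapshots = {}

    def delete_snapshot(self, snap_id):
        # Refuse deletion while any volume still depends on the snapshot.
        if self.snapshots.get(snap_id):
            raise SnapshotIsBusy(snap_id)
        self.snapshots.pop(snap_id, None)


def reconcile_delete(contents, backend, content_name):
    """Remove the content object only after the backend deletion succeeds.

    Returns True when the content was removed, False when it was kept so
    that a later reconcile can retry.
    """
    snap_id = contents[content_name]
    try:
        backend.delete_snapshot(snap_id)
    except SnapshotIsBusy:
        return False  # keep the content object; backend snapshot still exists
    del contents[content_name]
    return True
```

In this model, a busy snapshot leaves the content object in place, and the object disappears only once the dependent volumes are gone and the backend deletion succeeds.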
Additional info:
must-gather attached on private comment