Type: Bug
Resolution: Done-Errata
Priority: Major
Status: CLOSED
Severity: Important
Description of problem:
If your default storage class does not support snapshots, the boot source images created by the DataImportCron in the openshift-virtualization-os-images namespace are imported as DVs/PVCs.
When you switch the default storage class to OCS, you can re-import the images by deleting the old DVs. The DV/PVC is then re-imported, a VolumeSnapshot object is created, and the DV/PVC is removed automatically.
Alex (akalenyu@redhat.com) looked at it and sees 2 issues:
Issue 1: Snapshots are being made out of the previous storage class (when changing the SC from HPP->OCS)
Issue 2: When deleting the old storage class DVs, there may be a race where the snapshot got created but the DV didn't get recreated
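For context: whether a storage class counts as snapshot-capable comes down to whether a VolumeSnapshotClass exists whose driver matches the class's provisioner. A minimal way to check (the jsonpath expression is illustrative, not taken from this report):
$ oc get storageclass -o jsonpath='{.items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")].metadata.name}'
$ oc get volumesnapshotclass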
Version-Release number of selected component (if applicable):
4.14
How reproducible:
Always
Steps to Reproduce:
1. Have a non-snapshotable default storage class (HPP)
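If HPP is not already the default, it can be marked default the same way OCS is in step 3; a sketch, assuming the HPP storage class is named hostpath-csi-basic (names vary per deployment):
$ oc patch storageclass hostpath-csi-basic -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'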
2. See that DVs/PVCs were imported
$ oc get dv -A
NAMESPACE                            NAME                          PHASE       PROGRESS   RESTARTS   AGE
openshift-virtualization-os-images   centos-stream8-b9b768dcd73b   Succeeded   100.0%                18h
openshift-virtualization-os-images   centos-stream9-362e1f1d9f11   Succeeded   100.0%                18h
openshift-virtualization-os-images   centos7-680e9b4e0fba          Succeeded   100.0%                18h
openshift-virtualization-os-images   fedora-f7cc15256f08           Succeeded   100.0%                18h
openshift-virtualization-os-images   rhel8-0da894200daa            Succeeded   100.0%                18h
openshift-virtualization-os-images   rhel9-b006ef7856b6            Succeeded   100.0%                18h
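At this stage the DataImportCron-managed DataSources should reference these PVCs. A quick way to confirm, assuming the DataSource for the rhel9 cron is named rhel9 (inferred from the cron name, not shown in this report):
$ oc get datasource -n openshift-virtualization-os-images rhel9 -o jsonpath='{.spec.source}'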
3. Make HPP non-default, make OCS default
$ oc patch storageclass ocs-storagecluster-ceph-rbd -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
4. Delete one DV
$ oc delete dv -n openshift-virtualization-os-images rhel9-b006ef7856b6
datavolume.cdi.kubevirt.io "rhel9-b006ef7856b6" deleted
5. The DV didn't get recreated (but should have been); a VolumeSnapshot was created, but it is not Ready
$ oc get VolumeSnapshot -A
NAMESPACE                            NAME                 READYTOUSE   SOURCEPVC            SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS                            SNAPSHOTCONTENT   CREATIONTIME   AGE
openshift-virtualization-os-images   rhel9-b006ef7856b6   false        rhel9-b006ef7856b6                                         ocs-storagecluster-rbdplugin-snapclass                                    13s
$ oc get VolumeSnapshot -n openshift-virtualization-os-images rhel9-b006ef7856b6 -oyaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  annotations:
    cdi.kubevirt.io/storage.import.lastUseTime: "2023-07-27T14:31:32.631870881Z"
  creationTimestamp: "2023-07-27T14:31:32Z"
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
  generation: 1
  labels:
    app: containerized-data-importer
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.14.0
    cdi.kubevirt.io: ""
    cdi.kubevirt.io/dataImportCron: rhel9-image-cron
  name: rhel9-b006ef7856b6
  namespace: openshift-virtualization-os-images
  resourceVersion: "1182048"
  uid: d69181d0-4195-4b3f-91b4-ba3631f05249
spec:
  source:
    persistentVolumeClaimName: rhel9-b006ef7856b6
  volumeSnapshotClassName: ocs-storagecluster-rbdplugin-snapclass
status:
  error:
    message: 'Failed to create snapshot content with error snapshot controller failed
      to update rhel9-b006ef7856b6 on API server: cannot get claim from snapshot'
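The "cannot get claim from snapshot" error is consistent with Issue 2: the snapshot object was created, but its source PVC was already deleted and the DV was never recreated. Checking for the PVC is expected to return NotFound here (a sketch):
$ oc get pvc -n openshift-virtualization-os-images rhel9-b006ef7856b6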
6. See that 2 minutes later other VolumeSnapshots were created, even though their old DVs had not been deleted yet
$ oc get VolumeSnapshot -A
NAMESPACE                            NAME                          READYTOUSE   SOURCEPVC                     SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS                            SNAPSHOTCONTENT                                    CREATIONTIME   AGE
openshift-virtualization-os-images   centos-stream8-b9b768dcd73b   false        centos-stream8-b9b768dcd73b                                         ocs-storagecluster-rbdplugin-snapclass   snapcontent-8455f2ea-0d70-4998-9fa5-bbc42133b1f5                  23s
openshift-virtualization-os-images   centos-stream9-362e1f1d9f11   false        centos-stream9-362e1f1d9f11                                         ocs-storagecluster-rbdplugin-snapclass   snapcontent-3eec6ff1-f73f-493f-b61b-58abfeec5b65                  23s
openshift-virtualization-os-images   centos7-680e9b4e0fba          false        centos7-680e9b4e0fba                                                ocs-storagecluster-rbdplugin-snapclass   snapcontent-76229453-37ff-40f6-8ce0-94e15a5b912c                  23s
openshift-virtualization-os-images   fedora-f7cc15256f08           false        fedora-f7cc15256f08                                                 ocs-storagecluster-rbdplugin-snapclass   snapcontent-94d05d80-20f5-4861-a7af-344f19842a61                  23s
openshift-virtualization-os-images   rhel8-0da894200daa            false        rhel8-0da894200daa                                                  ocs-storagecluster-rbdplugin-snapclass   snapcontent-df7f9a06-4a2e-41b1-8f04-a16758daf4e8                  23s
openshift-virtualization-os-images   rhel9-b006ef7856b6            false        rhel9-b006ef7856b6                                                  ocs-storagecluster-rbdplugin-snapclass                                                                     2m47s
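These snapshots target PVCs that still exist on HPP, since their DVs were never deleted; listing the DVs is expected to still show them (a sketch):
$ oc get dv -n openshift-virtualization-os-images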
7. See the yaml of another VolumeSnapshot, whose DV/PVC wasn't deleted and is still backed by the non-snapshotable HPP:
spec:
  source:
    persistentVolumeClaimName: centos-stream8-b9b768dcd73b
  volumeSnapshotClassName: ocs-storagecluster-rbdplugin-snapclass
status:
  boundVolumeSnapshotContentName: snapcontent-8455f2ea-0d70-4998-9fa5-bbc42133b1f5
  error:
    message: 'Failed to check and update snapshot content: failed to take snapshot
      of the volume pvc-e59ee8cd-57d0-4ecf-906f-0ab7a1f8ba72: "rpc error: code = Internal
      desc = panic runtime error: invalid memory address or nil pointer dereference"'
    time: "2023-07-27T14:33:56Z"
  readyToUse: false
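This is Issue 1 in action: the RBD snapshot class is pointed at an HPP-backed PVC, and the CSI driver panics. To confirm which storage class actually backs the source PVC, something like:
$ oc get pvc -n openshift-virtualization-os-images centos-stream8-b9b768dcd73b -o jsonpath='{.spec.storageClassName}'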
8. To fix the broken VolumeSnapshot of the first deleted DV, delete that VolumeSnapshot:
$ oc delete VolumeSnapshot -n openshift-virtualization-os-images rhel9-b006ef7856b6
volumesnapshot.snapshot.storage.k8s.io "rhel9-b006ef7856b6" deleted
9. This triggers the DV/PVC to re-import on OCS and a VolumeSnapshot to be created that becomes ReadyToUse; the DV/PVC is then deleted automatically.
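To verify the end state, check that the new VolumeSnapshot is ready and that the DataSource now references a snapshot rather than a PVC (DataSource name rhel9 inferred as above; a sketch):
$ oc get VolumeSnapshot -n openshift-virtualization-os-images rhel9-b006ef7856b6 -o jsonpath='{.status.readyToUse}'
$ oc get datasource -n openshift-virtualization-os-images rhel9 -o jsonpath='{.spec.source}'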
Actual results:
Re-importing requires extra manual steps: the broken VolumeSnapshot has to be deleted before the DV/PVC is re-imported.
Expected results:
Re-importing should happen automatically once the default storage class is switched and the old DVs are deleted.
is duplicated by: CNV-31675 [2228606] DataSource becomes not-ready if default StorageClass modified without deleting PVCs (Closed)