-
Bug
-
Resolution: Done-Errata
-
Major
-
CNV v4.14.3
-
None
-
13
-
False
-
-
False
-
CNV v4.17.0.rhel9-498
-
---
-
---
-
-
Storage Core Sprint 254, Storage Core Sprint 256, Storage Core Sprint 257, Storage Core Sprint 258, Storage Core Sprint 259, CNV Storage 260, Storage Core Sprint 262, Storage Core Sprint 263, CNV Storage 264
-
No
Host-assisted volume cloning does not appear to handle sparse images correctly: the resulting image is fully allocated (its holes are filled with zeros), which wastes storage space. The following are the steps to reproduce the issue:
Create a volume using ocs-storagecluster-ceph-rbd-virtualization storageclass in Filesystem mode:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myvol
  namespace: kubevirt-example
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: ocs-storagecluster-ceph-rbd-virtualization
  volumeMode: Filesystem
Mount the volume in a pod and create a sparse disk.img file on the volume with 1 GiB of data and a 5 GiB virtual size:
$ dd if=/dev/random of=disk.img bs=1024M count=1
$ truncate -s 5G disk.img
Check that the file is a sparse file:
$ ls -ls --block-size=M disk.img
1025M -rw-r--r--. 1 root root 5120M Mar 9 13:59 disk.img
Check the PVC disk usage in Ceph:
$ rbd du ocs-storagecluster-cephblockpool/csi-vol-bb781ad4-6ce4-4a66-8645-878c7d56f2c4
NAME                                          PROVISIONED  USED
csi-vol-bb781ad4-6ce4-4a66-8645-878c7d56f2c4       50 GiB  1.2 GiB
Next, clone the volume to a Ceph RBD volume in Block mode by applying this DataVolume resource:
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: myvol-converted
spec:
  source:
    pvc:
      name: myvol
      namespace: kubevirt-example
  storage:
    accessModes:
      - ReadWriteMany
    storageClassName: ocs-storagecluster-ceph-rbd-virtualization
    volumeMode: Block
Follow the logs of the pods that perform the volume conversion:
$ stern .
+ cdi-upload-tmp-pvc-fbe662e2-f95a-4149-9549-4f7abed41072 › cdi-upload-server
cdi-upload-tmp-pvc-fbe662e2-f95a-4149-9549-4f7abed41072 cdi-upload-server I0309 14:03:15.326876 1 uploadserver.go:74] Running server on 0.0.0.0:8443
- toolbox-container-6f56f456dd-fntvb › toolbox-container
cdi-upload-tmp-pvc-fbe662e2-f95a-4149-9549-4f7abed41072 cdi-upload-server I0309 14:03:56.552251 1 uploadserver.go:389] Content type header is "filesystem-clone"
cdi-upload-tmp-pvc-fbe662e2-f95a-4149-9549-4f7abed41072 cdi-upload-server I0309 14:03:56.552488 1 uploadserver.go:493] Untaring 5368709120 bytes to /dev/cdi-block-volume
+ 9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod › cdi-clone-source
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source VOLUME_MODE=filesystem
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source MOUNT_POINT=/var/run/cdi/clone/source
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source /var/run/cdi/clone/source /
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source UPLOAD_BYTES=5368729600
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:03:56.238237 10 clone-source.go:220] content-type is "filesystem-clone"
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:03:56.238291 10 clone-source.go:221] mount is "/var/run/cdi/clone/source"
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:03:56.238295 10 clone-source.go:222] upload-bytes is 5368729600
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:03:56.238310 10 clone-source.go:239] Starting cloner target
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:03:56.238359 10 clone-source.go:177] Executing [/usr/bin/tar cv -S disk.img]
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:03:56.539802 10 clone-source.go:251] Set header to filesystem-clone
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:03:57.238841 10 prometheus.go:75] 1.14
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:03:58.238876 10 prometheus.go:75] 2.65
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:03:59.239334 10 prometheus.go:75] 4.28
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:00.239646 10 prometheus.go:75] 4.46
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:01.240023 10 prometheus.go:75] 4.50
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:02.241024 10 prometheus.go:75] 5.03
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:03.241686 10 prometheus.go:75] 6.73
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:04.242148 10 prometheus.go:75] 8.02
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:05.243077 10 prometheus.go:75] 9.13
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:06.243825 10 prometheus.go:75] 10.59
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:07.243902 10 prometheus.go:75] 11.64
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:08.244745 10 prometheus.go:75] 12.29
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:09.244822 10 prometheus.go:75] 13.71
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:10.244881 10 prometheus.go:75] 15.05
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:11.245753 10 prometheus.go:75] 16.16
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:12.246252 10 prometheus.go:75] 17.71
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:13.247379 10 prometheus.go:75] 18.81
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:14.247841 10 prometheus.go:75] 19.92
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:14.292635 10 clone-source.go:127] Wrote 1073745920 bytes
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:15.248266 10 prometheus.go:75] 100.00
cdi-upload-tmp-pvc-fbe662e2-f95a-4149-9549-4f7abed41072 cdi-upload-server I0309 14:04:46.778822 1 uploadserver.go:502] Written 5368709120
cdi-upload-tmp-pvc-fbe662e2-f95a-4149-9549-4f7abed41072 cdi-upload-server I0309 14:04:46.778866 1 uploadserver.go:416] Wrote data to /dev/cdi-block-volume
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:46.779122 10 clone-source.go:269] Response body:
cdi-upload-tmp-pvc-fbe662e2-f95a-4149-9549-4f7abed41072 cdi-upload-server I0309 14:04:46.778949 1 uploadserver.go:203] Shutting down http server after successful upload
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source I0309 14:04:46.779151 10 clone-source.go:271] clone complete
9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod cdi-clone-source /
cdi-upload-tmp-pvc-fbe662e2-f95a-4149-9549-4f7abed41072 cdi-upload-server I0309 14:04:46.779311 1 uploadserver.go:103] UploadServer successfully exited
- 9644db8b-9b66-40ce-b245-f514cd6027e0-source-pod › cdi-clone-source
- cdi-upload-tmp-pvc-fbe662e2-f95a-4149-9549-4f7abed41072 › cdi-upload-server
The resulting clone PVC volume definition:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    cdi.kubevirt.io/cloneFallbackReason: The volume modes of source and target are incompatible
    cdi.kubevirt.io/clonePhase: Succeeded
    cdi.kubevirt.io/cloneType: copy
    cdi.kubevirt.io/storage.condition.running: "false"
    cdi.kubevirt.io/storage.condition.running.message: Clone Complete
    cdi.kubevirt.io/storage.condition.running.reason: Completed
    cdi.kubevirt.io/storage.contentType: kubevirt
    cdi.kubevirt.io/storage.pod.restarts: "0"
    cdi.kubevirt.io/storage.populator.progress: 100.0%
    cdi.kubevirt.io/storage.preallocation.requested: "false"
    cdi.kubevirt.io/storage.usePopulator: "true"
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: openshift-storage.rbd.csi.ceph.com
    volume.kubernetes.io/storage-provisioner: openshift-storage.rbd.csi.ceph.com
  creationTimestamp: "2024-03-09T13:27:05Z"
  finalizers:
    - kubernetes.io/pvc-protection
  labels:
    app: containerized-data-importer
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.14.3
  name: myvol-converted
  namespace: kubevirt-example
  ownerReferences:
    - apiVersion: cdi.kubevirt.io/v1beta1
      blockOwnerDeletion: true
      controller: true
      kind: DataVolume
      name: myvol-converted
      uid: db156582-fee8-476b-9ac7-92e367ebc925
  resourceVersion: "1011506"
  uid: 2375704c-b13f-4da0-b49d-91cac6a6d247
spec:
  accessModes:
    - ReadWriteMany
  dataSource:
    apiGroup: cdi.kubevirt.io
    kind: VolumeCloneSource
    name: volume-clone-source-db156582-fee8-476b-9ac7-92e367ebc925
  dataSourceRef:
    apiGroup: cdi.kubevirt.io
    kind: VolumeCloneSource
    name: volume-clone-source-db156582-fee8-476b-9ac7-92e367ebc925
  resources:
    requests:
      storage: "53687091200"
  storageClassName: ocs-storagecluster-ceph-rbd-virtualization
  volumeMode: Block
  volumeName: pvc-0c4dc9cc-7629-45f5-8e70-47a4875147b3
status:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 50Gi
  phase: Bound
Check the clone PVC disk usage in Ceph:
$ rbd du ocs-storagecluster-cephblockpool/csi-vol-56ac4e33-6fcf-4aa9-93c2-2c1337f4c86c
NAME                                          PROVISIONED  USED
csi-vol-56ac4e33-6fcf-4aa9-93c2-2c1337f4c86c       50 GiB  5 GiB
Note that the above result demonstrates two issues:
- The virtual size of the original image was 5 GiB, so the resulting block-mode PVC was expected to be provisioned at 5 GiB. Instead, a 50 GiB PVC was provisioned. While running the test I did not notice any virtual size detection being performed during the clone that would have detected the 5 GiB.
- The original volume used 1.2 GiB of disk space, so the resulting volume is expected to use roughly the same amount. Instead, the sparse image was expanded to its full virtual size of 5 GiB (the sketch after this list shows how the sparseness of the source file can be checked programmatically).
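For reference, here is a minimal Go sketch (an illustration only, not CDI code; the disk.img path is taken from the reproduction steps above) of how the apparent size of the source file can be compared with the space actually allocated for it, which is the same information that ls -ls reports above:

package main

// Illustration only: compare a file's apparent (virtual) size with the space
// actually allocated for it. A sparse-aware clone would be expected to
// preserve roughly this ratio on the target volume.

import (
	"fmt"
	"os"
	"syscall"
)

func main() {
	fi, err := os.Stat("disk.img") // path from the reproduction steps
	if err != nil {
		panic(err)
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		panic("not a Unix stat result")
	}
	apparent := fi.Size()        // virtual size of the file (5 GiB in the example)
	allocated := st.Blocks * 512 // Stat_t.Blocks counts 512-byte units on Linux
	fmt.Printf("apparent: %d bytes, allocated: %d bytes, sparse: %v\n",
		apparent, allocated, allocated < apparent)
}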
The clone-source pod uses the "/usr/bin/tar cv -S disk.img" command to stream the sparse image, so the problem is likely on the upload server side. The upload server uses io.Copy(), which does not handle sparse files correctly, as described in the Stack Overflow entry "Sparse files are huge with io.Copy()". The sparse image is most likely expanded to its full virtual size by io.Copy().
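As an illustration of what a sparse-aware copy would have to do, here is a minimal Go sketch that detects all-zero blocks and seeks over them instead of writing them. This is not the actual upload-server code; the sparseCopy helper, the 64 KiB block size, and the restored.img target path are made up for the example:

package main

// Illustration only: a zero-detecting copy that leaves holes where the source
// stream contains runs of zeros. io.Copy, in contrast, writes every byte it
// reads, including the zeros that "tar -S" re-creates on extraction.

import (
	"bytes"
	"io"
	"os"
)

func sparseCopy(dst *os.File, src io.Reader) (int64, error) {
	buf := make([]byte, 64*1024)  // arbitrary block size for the example
	zero := make([]byte, 64*1024) // reference block of zeros
	var offset int64
	for {
		n, err := io.ReadFull(src, buf)
		if n > 0 {
			if bytes.Equal(buf[:n], zero[:n]) {
				// All zeros: skip ahead and leave a hole instead of writing.
				if _, serr := dst.Seek(int64(n), io.SeekCurrent); serr != nil {
					return offset, serr
				}
			} else if _, werr := dst.Write(buf[:n]); werr != nil {
				return offset, werr
			}
			offset += int64(n)
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			// Extend the file to its full size so a trailing hole is kept.
			return offset, dst.Truncate(offset)
		}
		if err != nil {
			return offset, err
		}
	}
}

func main() {
	// Hypothetical usage: restore a sparse image from a stream on stdin.
	dst, err := os.Create("restored.img")
	if err != nil {
		panic(err)
	}
	defer dst.Close()
	if _, err := sparseCopy(dst, os.Stdin); err != nil {
		panic(err)
	}
}

Note that for a block-device target such as /dev/cdi-block-volume, skipping writes only preserves sparseness if the skipped ranges are already zeroed or discarded on the underlying storage, so an actual fix would probably need to discard or zero out those ranges rather than merely seek past them.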
To turn the expanded image back to a sparse one, one can use the rbd sparsify command:
$ rbd sparsify ocs-storagecluster-cephblockpool/csi-vol-56ac4e33-6fcf-4aa9-93c2-2c1337f4c86c
After sparsifying the volume, its disk usage went down from 5 GiB to 1 GiB. The resulting 1 GiB is smaller than the original 1.2 GiB, probably because the original volume also includes file system overhead:
$ rbd du ocs-storagecluster-cephblockpool/csi-vol-56ac4e33-6fcf-4aa9-93c2-2c1337f4c86c
NAME                                          PROVISIONED  USED
csi-vol-56ac4e33-6fcf-4aa9-93c2-2c1337f4c86c       50 GiB  1 GiB