OpenShift Virtualization / CNV-19211

[2098657] VM workload - PVC Filesystem write performance is 95% lower compared to Block.



      Description of problem:
      I recently encountered a mind-boggling performance issue on our CI when comparing a VM I/O
      workload writing to a PVC configured with volumeMode Filesystem versus Block, like so:


      kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        annotations:
          cdi.kubevirt.io/storage.preallocation: "true"
        name: vdbench-pvc-claim
        namespace: benchmark-runner
      spec:
        storageClassName: ocs-storagecluster-ceph-rbd
        accessModes: [ "ReadWriteOnce" ]
        volumeMode: Filesystem # or set to Block
        resources:
          requests:
            storage: 64Gi

      After some investigation, we found that this happens because on block devices we automatically set
      io=native, while on filesystems we do not. Now, if we use a filesystem within a DataVolume like so:


      dataVolumeTemplates:
      - apiVersion: cdi.kubevirt.io/v1
        kind: DataVolume
        metadata:
          annotations:
            kubevirt.io/provisionOnNode: worker-0
          name: workload-disk
        spec:
          pvc:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 65Gi
            storageClassName: ocs-storagecluster-ceph-rbd
            volumeMode: Filesystem
          source:
            blank: {}

      we will still experience the 95% degradation compared to Block. But if we add "preallocation: true" like so:


      dataVolumeTemplates:
      - apiVersion: cdi.kubevirt.io/v1
        kind: DataVolume
        metadata:
          annotations:
            kubevirt.io/provisionOnNode: worker-0
          name: workload-disk
        spec:
          preallocation: true
          pvc:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 65Gi
            storageClassName: ocs-storagecluster-ceph-rbd
            volumeMode: Filesystem
          source:
            blank: {}

      it turns out that using "preallocation" (which was created as a tool to improve performance on
      thin-provisioned devices) magically causes QEMU to be run with io=native on the filesystem
      (https://github.com/kubevirt/kubevirt/blob/main/pkg/virt-launcher/virtwrap/converter/converter.go#L480).
      That workaround is only applicable to DataVolumes.

      As for the PVC scenario, that is a little more complicated: the workaround there is to manually
      create a fully preallocated disk.img in the root directory of the PVC. CNV correctly detects that
      it was preallocated and attaches it to the VM with io=native.
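      The detection mentioned above can be approximated by comparing the file's allocated blocks to its apparent size: a fully preallocated disk.img has at least as many bytes allocated on disk as its logical length, while a sparse image does not. A hedged Go sketch of that check (assumed logic, not the exact CNV implementation; Linux-only, error handling elided):

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// isFullyPreallocated reports whether the file's on-disk allocation
// covers its apparent size, i.e. the image is not sparse. This is a
// sketch of the kind of check described above, not KubeVirt's code.
func isFullyPreallocated(path string) (bool, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return false, err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return false, fmt.Errorf("no unix stat for %s", path)
	}
	// st.Blocks counts 512-byte units actually allocated on disk.
	return st.Blocks*512 >= fi.Size(), nil
}

func main() {
	// Sparse image: apparent size 1 MiB, (almost) nothing allocated.
	sparse, _ := os.Create("/tmp/sparse.img")
	sparse.Truncate(1 << 20)
	sparse.Close()

	// Preallocated image: 1 MiB of data actually written and synced.
	full, _ := os.Create("/tmp/full.img")
	full.Write(make([]byte, 1<<20))
	full.Sync()
	full.Close()

	s, _ := isFullyPreallocated("/tmp/sparse.img")
	f, _ := isFullyPreallocated("/tmp/full.img")
	fmt.Println(s, f) // typically prints: false true
}
```

      A truncate-created sparse file reports near-zero allocated blocks, which is why only the fully written image passes the check.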

      However, both of the above workarounds are far from user-friendly. Only a few people know that
      using Filesystem mode causes such severe performance degradation, and even fewer know how to
      address it, which is why I suggest the following:

      1. For DataVolumes - preallocation should be set to true by default.
      2. For PVCs - we should implement a way to set io=native.

              akalenyu Alex Kalenyuk
              bbenshab@redhat.com Boaz Ben Shabat
              Kevin Alon Goldblatt Kevin Alon Goldblatt