RHEL-82906: --migrate-disks-detect-zeroes doesn't take effect for disk migration

    • Bug
    • Resolution: Unresolved
    • Normal
    • rhel-9.6
    • qemu-kvm
    • Critical
    • rhel-sst-virtualization-storage
    • ssg_virtualization
    • virt-storage Sprint 4
    • x86_64

      What were you trying to do that didn't work?
      --migrate-disks-detect-zeroes doesn't take effect for disk migration

      Please provide the package NVR for which the bug is seen:
      libvirt-10.10.0-7.el9.x86_64
      qemu-kvm-9.1.0-15.el9.x86_64
      kernel-5.14.0-569.el9.x86_64

      How reproducible is this bug?:
      100%

      Steps to reproduce:

      1. Set up migration env and start vm on src host with a disk.
      
      # qemu-img create -f raw /var/lib/libvirt/images/raw.img 1G
      Formatting '/var/lib/libvirt/images/raw.img', fmt=raw size=1073741824
      
      # virsh edit vm1
          <disk type='file' device='disk'>
            <driver name='qemu' type='raw' />
            <source file='/var/lib/libvirt/images/raw.img'>
            </source>
            <target dev='sda' bus='scsi'/>
          </disk>
      
      # virsh start vm1
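      
      Optionally, as a quick sanity check that the disk is attached (not strictly part of the reproducer):
      # virsh domblklist vm1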
      
      2. In the guest, format sda and write some data to the device.
      # mkfs.xfs /dev/sda
      # dd if=/dev/random of=/dev/sda bs=1048576 count=100   
      100+0 records in
      100+0 records out
      104857600 bytes (105 MB, 100 MiB) copied, 0.471111 s, 223 MB/s
      
      3. Check the current disk usage on src host.
      # qemu-img info /var/lib/libvirt/images/raw.img -U
      image: /var/lib/libvirt/images/raw.img
      file format: raw
      virtual size: 1 GiB (1073741824 bytes)
      disk size: 168 MiB
      ...
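      
      As a rough cross-check (assuming the image sits on a local filesystem), the same
      allocation can be read from the filesystem and compared with the apparent size:
      
      # ls -lh /var/lib/libvirt/images/raw.img    # apparent size: 1.0G
      # du -h  /var/lib/libvirt/images/raw.img    # allocated size: roughly the 168 MiB above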
      
      4. Create a disk on the target host.
      # qemu-img create -f raw /var/lib/libvirt/images/raw.img 1G
      # qemu-img info /var/lib/libvirt/images/raw.img -U
      image: /var/lib/libvirt/images/raw.img
      file format: raw
      virtual size: 1 GiB (1073741824 bytes)
      disk size: 1 MiB
      ...
      
      5. Create migratable xml.
      # virsh dumpxml vm1 --migratable > mig.xml
      # vim mig.xml
      ...
          <disk type='file' device='disk'>
            <driver name='qemu' type='raw' />
            <source file='/var/lib/libvirt/images/raw.img'>
            </source>
            <target dev='sda' bus='scsi'/>
          </disk>
      ...
      
      
      6. Migrate vm with --migrate-disks-detect-zeroes option.
      # virsh migrate vm1 qemu+tcp://targethost/system --verbose --live --p2p --copy-storage-all --xml mig.xml --migrate-disks-detect-zeroes sda
      Migration: [100.00 %]
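      
      During step 6, while the storage copy is still in progress, the detect-zeroes setting
      that libvirt applies to the migration NBD node can be checked on the source side
      (a sketch; the node name matches the QMP logs quoted in the comments, and the node
      goes away once the copy finishes):
      
      # virsh qemu-monitor-command vm1 --pretty '{"execute":"query-named-block-nodes"}' \
            | grep -E '"node-name"|"detect_zeroes"'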
      
      7. Check disk usage on target host.
      # qemu-img info /var/lib/libvirt/images/raw.img -U
      image: /var/lib/libvirt/images/raw.img
      file format: raw
      virtual size: 1 GiB (1073741824 bytes)
      disk size: 1 GiB
      ...
      
      
      

      Expected results:
      Sparsity is retained on the destination disk when the --migrate-disks-detect-zeroes option is used.

      Actual results:
      --migrate-disks-detect-zeroes doesn't take effect; the destination image ends up fully allocated (1 GiB).

      Additional info:

      Tested the following combinations:

          disk format on src host         disk format on target host    test result
               rbd/raw/qcow2          +             rbd                     FAIL
               rbd/raw/qcow2          +             raw                     FAIL
               rbd/raw/qcow2          +             qcow2                   PASS
      
      

        1. virtqemud.log-source-9.5
          2.26 MB
        2. virtqemud.log-source-9.6
          3.74 MB
        3. virtqemud.log-target-9.5
          3.11 MB
        4. virtqemud.log-target-9.6
          2.39 MB

            [RHEL-82906] --migrate-disks-detect-zeroes doesn't take effect for disk migration

            Aihua Liang added a comment -

            The wrong branch of the repo was tested, so I corrected it and retested; the latest results are below:

            Image created by          Used as            discard  detect-zeroes  virtual size  disk size
            qemu-img                  target/nbd image   /        /              5G            1.96G
            qemu-img                  target/nbd image   unmap    /              5G            1.96G
            qemu-img                  target/nbd image   unmap    unmap          5G            1.97G
            qemu-img                  target/raw image   unmap    unmap          5G            15MB
            qemu-img+full allocation  target/nbd image   /        /              5G            5G
            qemu-img+full allocation  target/nbd image   unmap    /              5G            5G
            qemu-img+full allocation  target/nbd image   unmap    unmap          5G            5G
            qemu-img+full allocation  target/raw image   /        /              5G            5G
            qemu-img+full allocation  target/raw image   unmap    /              5G            1.97G
            qemu-img+full allocation  target/raw image   unmap    unmap          5G            15MB

            Two issues still exist:

            1. detect-zeroes doesn't take effect for nbd image
            2. discard=unmap doesn't take effect for full-allocation nbd image.

            Test version: qemu-img version 9.2.94 (v2.4.0-79866-g9387722a2f)
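
            For reference, the "full allocation" rows above presumably refer to images created
            with preallocation enabled, e.g. (an assumption about the exact invocation used):

            # qemu-img create -f raw -o preallocation=full /var/lib/libvirt/images/raw.img 5G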


            Aihua Liang added a comment -

            Tested with Eric's repo: most of the combinations meet the expectation, except the preallocation=full ones; for details, see RHEL-88005.

            Image created by          Used as                 discard  detect-zeroes  virtual size  disk size
            qemu-img                  target/nbd image        /        /              5G            1.96G
            qemu-img                  target/nbd image        unmap    /              5G            1.96G
            qemu-img                  target/nbd image        unmap    unmap          5G            15MB
            qemu-img+full allocation  target/nbd image        /        /              5G            1.96G
            qemu-img+full allocation  target/nbd image        unmap    /              5G            1.96G
            qemu-img+full allocation  target/nbd image        unmap    unmap          5G            13.6MB
            qemu-img+full allocation  target/local raw image  /        /              5G            1.96GB
            qemu-img                  src/local raw image     unmap    unmap          5G            157MB


            Eric Blake added a comment -

            upstream v2 posted, this time tested with NBD: https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg02940.html


            Aihua Liang added a comment -

            Still hitting this issue with upstream patch 20250411010732.358817-8-eblake@redhat.com. For details, see RHEL-87340.


            Eric Blake added a comment -

            The two lines that jumped out to me in Peter's review are:
            -blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/raw.img","node-name":"libvirt-1-storage","read-only":false}'
            -device '{"driver":"scsi-hd","bus":"scsi0.0","channel":0,"scsi-id":0,"lun":0,"device_id":"drive-scsi0-0-0-0","drive":"libvirt-1-storage","id":"scsi0-0-0-0"}'
            where there is no "discard":"unmap" on the destination (therefore, the destination inherits the default "discard":"ignore" which forces write zeroes to allocate rather than punching holes), and:
            {"execute":"blockdev-mirror","arguments":

            {"job-id":"drive-scsi0-0-0-0","device":"libvirt-1-storage","target":"migration-sda-storage","speed":9223372036853727232,"sync":"full","auto-finalize":true,"auto-dismiss":false}

            ,"id":"libvirt-473"}
            For the latter, I have an upstream patch series proposed:
            https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg01654.html
            which teaches blockdev-mirror to first check whether the destination reads as all zeroes; if so, it no longer sends a write zero command over the wire.  That way, regardless of whether the destination is allowed to punch holes, the fact that redundant write zero requests are no longer sent over NBD means that the destination is no longer fully allocated.
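
            For comparison, a destination -blockdev that does allow hole punching would only
            differ in the added "discard" option (a hypothetical line, not taken from the logs):

            -blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/raw.img","node-name":"libvirt-1-storage","read-only":false,"discard":"unmap"}'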

             


            Peter Krempa added a comment -

            I was able to reproduce it with an upstream setup and isolated it to the qemu version on the destination.

            I've bisected it to:

            commit d05ae948cc887054495977855b0859d0d4ab2613
            Author: Nir Soffer <nsoffer@redhat.com>
            Date:   Fri Jun 28 23:20:58 2024 +0300
            
                Consider discard option when writing zeros
                
                When opening an image with discard=off, we punch hole in the image when
                writing zeroes, making the image sparse. This breaks users that want to
                ensure that writes cannot fail with ENOSPACE by using fully allocated
                images[1].
                
                bdrv_co_pwrite_zeroes() correctly disables BDRV_REQ_MAY_UNMAP if we
                opened the child without discard=unmap or discard=on. But we don't go
                through this function when accessing the top node. Move the check down
                to bdrv_co_do_pwrite_zeroes() which seems to be used in all code paths.
                
                This change implements the documented behavior, punching holes only when
                opening the image with discard=on or discard=unmap. This may not be the
                best default but can improve it later.
                
                The test depends on a file system supporting discard, deallocating the
                entire file when punching hole with the length of the entire file.
                Tested with xfs, ext4, and tmpfs.
                
                [1] https://lists.nongnu.org/archive/html/qemu-discuss/2024-06/msg00003.html
                
                Signed-off-by: Nir Soffer <nsoffer@redhat.com>
                Message-id: 20240628202058.1964986-3-nsoffer@redhat.com
                Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
            

            The justification on qemu's side about "punching holes" seems to make sense, but the image that was given to qemu was already sparse, so there was no hole that should have been punched; thus it is a regression in behaviour regardless.

            If you decide that the qemu behaviour is correct despite causing the regression and that libvirt should implement a workaround (most likely using blockdev-reopen after the migration), please re-assign appropriately.

            The workaround is simple: enable 'discard', as the option doesn't make much sense without it anyway.
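
            In domain XML terms that amounts to adding the discard attribute to the disk's
            driver element (a minimal sketch based on the reproducer's disk definition,
            applied on the destination side, e.g. via the --xml file passed to virsh migrate):

            <disk type='file' device='disk'>
              <driver name='qemu' type='raw' discard='unmap'/>
              <source file='/var/lib/libvirt/images/raw.img'/>
              <target dev='sda' bus='scsi'/>
            </disk>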


            Peter Krempa added a comment -

            Thank you for the logs, rhn-support-lcheng!

            TL;DR: The libvirt setup is identical in the 9.5 and 9.6 cases: the source VM setup, the NBD client setup for the migration, and the migration itself.

            0) High level description of the migration

            Libvirt starts the destination as it normally would for a VM, except that incoming migration mode is enabled. In terms of block storage, the disks configured via the command line are exposed via NBD. The source then connects to them and mirrors the selected disks. If the target storage exists on the destination, libvirt uses it without touching it.
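
            As a side note, the export offered by the destination can be listed from the
            outside while its NBD server is up, e.g. with libnbd's nbdinfo (a hypothetical
            check; the port is the one visible in the QMP snippets below, 49153 in these logs):

            # nbdinfo --list nbd://<destination-host>:49153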

            1) setup of the blockdev on destination of the migration

            rhel-9.5:

            -blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/raw.img","node-name":"libvirt-1-storage","read-only":false}'
            -device '{"driver":"scsi-hd","bus":"scsi0.0","channel":0,"scsi-id":0,"lun":0,"device_id":"drive-scsi0-0-0-0","drive":"libvirt-1-storage","id":"scsi0-0-0-0"}'
            
            {"execute":"block-export-add","arguments":{"type":"nbd","id":"libvirt-nbd-libvirt-1-storage","node-name":"libvirt-1-storage","writable":true,"name":"drive-scsi0-0-0-0"},"id":"libvirt-446"}
            

            rhel-9.6:

            -blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/raw.img","node-name":"libvirt-1-storage","read-only":false}'
            -device '{"driver":"scsi-hd","bus":"scsi0.0","channel":0,"scsi-id":0,"lun":0,"device_id":"drive-scsi0-0-0-0","drive":"libvirt-1-storage","id":"scsi0-0-0-0"}'
            
            {"execute":"block-export-add","arguments":{"type":"nbd","id":"libvirt-nbd-libvirt-1-storage","node-name":"libvirt-1-storage","writable":true,"name":"drive-scsi0-0-0-0"},"id":"libvirt-463"}
            

            2) source setup - mirroring

            rhel-9.5:

            {
              "execute": "blockdev-add",
              "arguments": {
                "driver": "nbd",
                "server": {
                  "type": "inet",
                  "host": "XXX1",
                  "port": "49153"
                },
                "export": "drive-scsi0-0-0-0",
                "node-name": "migration-sda-storage",
                "read-only": false,
                "discard": "unmap",
                "detect-zeroes": "unmap"
              },
              "id": "libvirt-455"
            }
            
            {"execute":"blockdev-mirror","arguments":{"job-id":"drive-scsi0-0-0-0","device":"libvirt-1-storage","target":"migration-sda-storage","speed":9223372036853727232,"sync":"full","auto-finalize":true,"auto-dismiss":false},"id":"libvirt-456"}
            

            rhel-9.6:

            {
              "execute": "blockdev-add",
              "arguments": {
                "driver": "nbd",
                "server": {
                  "type": "inet",
                  "host": "XXX2",
                  "port": "49153"
                },
                "export": "drive-scsi0-0-0-0",
                "node-name": "migration-sda-storage",
                "read-only": false,
                "discard": "unmap",
                "detect-zeroes": "unmap"
              },
              "id": "libvirt-472"
            }
            
            {"execute":"blockdev-mirror","arguments":{"job-id":"drive-scsi0-0-0-0","device":"libvirt-1-storage","target":"migration-sda-storage","speed":9223372036853727232,"sync":"full","auto-finalize":true,"auto-dismiss":false},"id":"libvirt-473"}
            

            3) source setup

            For reference the source configures the disk as:

            rhel-9.6

            -blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/raw.img","node-name":"libvirt-1-storage","read-only":false}' 
            -device '{"driver":"scsi-hd","bus":"scsi0.0","channel":0,"scsi-id":0,"lun":0,"device_id":"drive-scsi0-0-0-0","drive":"libvirt-1-storage","id":"scsi0-0-0-0"}'
            

            The rhel-9.5 log didn't contain this, but it is identical to how the destination configures it, and it shouldn't matter for the block job anyway.

            The above log snippets show that libvirt sets up the migration identically between the versions. I'll next try reproducing it locally, albeit with upstream code.


            Liping Cheng added a comment (edited) -

            On RHEL 9.5, this issue cannot be reproduced.
            The test logs are virtqemud.log-source-9.5 and virtqemud.log-target-9.5.

            The source host and target host have the same version.
            libvirt-10.5.0-7.5.el9_5.x86_64
            qemu-kvm-9.0.0-10.el9_5.2.x86_64
            kernel-5.14.0-503.31.1.el9_5.x86_64

            The test steps are the same as for RHEL 9.6.
            
            Final check raw image info on dst host:
             # qemu-img info /var/lib/libvirt/images/raw.img
            image: /var/lib/libvirt/images/raw.img
            file format: raw
            virtual size: 1 GiB (1073741824 bytes)
            disk size: 0 B
            Child node '/file':
            ...
             


            Liping Cheng added a comment -

            On RHEL 9.6, this issue can be reproduced.
            The test logs are virtqemud.log-source-9.6 and virtqemud.log-target-9.6.

            The source host and target host have the same version.
            libvirt-10.10.0-7.1.el9_6.x86_64
            qemu-kvm-9.1.0-15.el9_6.1.x86_64
            kernel-5.14.0-570.4.1.el9_6.x86_64

            Test steps:
            
            Create raw image on src and dst host:
            # qemu-img create -f raw /var/lib/libvirt/images/raw.img 1G
            
            Guest xml without "discard='unmap'":
            <disk type='file' device='disk'>
               <driver name='qemu' type='raw'/>
               <source file='/var/lib/libvirt/images/raw.img'/>
               <target dev='sda' bus='scsi'/>
               <address type='drive' controller='0' bus='0' target='0' unit='0'/>
            </disk>
            
            Zero out the sda block device in guest.
            # dd if=/dev/zero of=/dev/sda bs=512 count=2097152
            
            Check raw image info on src host.
            # qemu-img info /var/lib/libvirt/images/raw.img
            image: /var/lib/libvirt/images/raw.img
            file format: raw
            virtual size: 1 GiB (1073741824 bytes)
            disk size: 1 GiB
            ...
            
            Migrate VM.
            # virsh migrate vm1 qemu+tcp://XXX/system --verbose --live --p2p --copy-storage-all --migrate-disks-detect-zeroes sda
            
            Check raw image info on dst host:
            # qemu-img info /var/lib/libvirt/images/raw.img
            image: /var/lib/libvirt/images/raw.img
            file format: raw
            virtual size: 1 GiB (1073741824 bytes)
            disk size: 1 GiB
            ...
            
            


            Peter Krempa added a comment -

            (In reply to pkrempa@redhat.com from comment-26873876)

            The bug was described in this comment: https://issues.redhat.com/browse/RHEL-61176?focusedId=25758705&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-25758705

            rhn-support-lcheng, can you please attach the libvirt version on the source of the migration as well as the virtqemud debug log from the source?

            Ideally for both the broken and the working case (and please include the exact libvirt versions on both sides).


              Eric Blake (eblake_redhat)
              Liping Cheng (rhn-support-lcheng)
              virt-maint
              Aihua Liang