-
Bug
-
Resolution: Unresolved
-
Undefined
-
None
-
rhel-9.5
-
Yes
-
None
-
Regression
-
1
-
rhel-sst-virtualization-storage
-
ssg_virtualization
-
3
-
False
-
-
None
-
Test Sprint 2- for RHEL test
-
None
-
None
-
-
x86_64
-
None
What were you trying to do that didn't work?
It reports "internal error: unable to execute QEMU command 'blockdev-del': Block device libvirt-1-storage is in use" in virtqemud.log after executing "virsh blockjob --pivot". And 'dd' and 'fdisk -l' commands hang in the guest after blockcopy and block job pivoting.
Please provide the package NVR for which bug is seen:
libvirt-10.3.0-2.el9.x86_64
qemu-kvm-9.0.0-3.el9.x86_64
kernel-5.14.0-452.el9.x86_64
How reproducible:
100%
Steps to reproduce
1. Prepare two simulated vhost-vdpa disks on host:
modprobe vhost_vdpa
modprobe vdpa_sim_blk
vdpa dev add mgmtdev vdpasim_blk name vdpa0
vdpa dev add mgmtdev vdpasim_blk name vdpa1
2. Define a guest with vhostvdpa disk and shared memory.
<memoryBacking>
<access mode='shared'/>
</memoryBacking>
...
<disk type='vhostvdpa' device='disk'>
<driver name='qemu' type='raw' cache='none' io='threads' copy_on_read='on' discard='unmap' detect_zeroes='on'/>
<source dev='/dev/vhost-vdpa-0'/>
<target dev='vdb' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
</disk>
3. Start the vm and check the qemu cmd
-add-fd set=0,fd=20,opaque=libvirt-1-storage-vdpa -blockdev \{"driver":"virtio-blk-vhost-vdpa","path":"/dev/fdset/0","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap","cache":{"direct":true,"no-flush":false}} -blockdev \{"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","detect-zeroes":"on","cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"} -blockdev \{"driver":"copy-on-read","node-name":"libvirt-CoR-vdb","file":"libvirt-1-format","discard":"unmap"} -device \{"driver":"virtio-blk-pci","bus":"pci.7","addr":"0x0","drive":"libvirt-CoR-vdb","id":"virtio-disk1","write-cache":"on"}
4. Exec virsh blockcopy
- cat /tmp/xml_utils_temp_rwbbxhvf.xml
<disk type="vhostvdpa" device="disk"><target dev="vdb" bus="virtio" /><source dev="/dev/vhost-vdpa-1" /><driver name="qemu" type="raw" cache="none" io="threads" copy_on_read="on" discard="unmap" detect_zeroes="on" /></disk> - virsh blockcopy avocado-vt-vm1 vdb --xml /tmp/xml_utils_temp_rwbbxhvf.xml --transient-job --wait --verbose
5. virsh blockjob avocado-vt-vm1 vdb --pivot
6. login to vm and run 'dd' or 'fdisk -l'
- dd if=/dev/zero of=/dev/vdb bs=10
Expected results
There is no error during blockcopy; 'dd' and 'fdisk -l' commands completed without error.
Actual results
1. Got errors like below after 'virsh blockjob avocado-vt-vm1 vdb --pivot' in step 5:
- virsh blockcopy avocado-vt-vm1 vdb --xml /tmp/disk.xml --transient-job --wait --verbose
Block Copy: [100.00 %]
Now in mirroring phase - virsh blockjob avocado-vt-vm1 vdb --pivot
# grep -i erro /var/lib/avocado/job-results/job-2024-05-31T10.38-af32774/test-results/1-type_specific.io-github-autotest-libvirt.virtual_disks.vhostvdpa.define_start_multi_vms_with_same_disk.update_arp_rule/libvirtd.log 2024-06-03 07:39:44.775+0000: 11482: info : qemuMonitorJSONIOProcessLine:210 : QEMU_MONITOR_RECV_REPLY: mon=0x7f55e4043180 reply={"id": "libvirt-457", "error": {"class": "GenericError", "desc": "Block device libvirt-1-format is in use"}} 2024-06-03 07:39:44.776+0000: 8053: error : qemuMonitorJSONCheckErrorFull:366 : internal error: unable to execute QEMU command 'blockdev-del': Block device libvirt-1-format is in use 2024-06-03 07:39:44.777+0000: 11482: info : qemuMonitorJSONIOProcessLine:210 : QEMU_MONITOR_RECV_REPLY: mon=0x7f55e4043180 reply={"id": "libvirt-458", "error": {"class": "GenericError", "desc": "Block device libvirt-1-storage is in use"}} 2024-06-03 07:39:44.778+0000: 8053: error : qemuMonitorJSONCheckErrorFull:366 : internal error: unable to execute QEMU command 'blockdev-del': Block device libvirt-1-storage is in use
2. 'dd' command hangs in step 6
[root@localhost ~]# dd if=/dev/zero of=/dev/vdb bs=10 [ 246.952532] INFO: task dd:1314 blocked for more than 122 seconds. [ 246.952956] Not tainted 5.14.0-452.el9.x86_64 #1 [ 246.953164] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 246.953543] task:dd state:D stack:0 pid:1314 tgid:1314 ppid:1287 flags:0x00000002 [ 246.953814] Call Trace: [ 246.953975] <TASK> [ 246.954134] __schedule+0x21b/0x550 [ 246.954546] schedule+0x2e/0xd0 [ 246.954749] io_schedule+0x42/0x70 [ 246.954954] bit_wait_io+0xd/0x60 [ 246.955212] __wait_on_bit+0x48/0x150 [ 246.955459] ? __pfx_bit_wait_io+0x10/0x10 [ 246.955702] out_of_line_wait_on_bit+0x92/0xb0 [ 246.955909] ? __pfx_wake_bit_function+0x10/0x10 [ 246.956156] __block_write_begin_int+0x1fd/0x370 [ 246.956468] iomap_write_begin+0x1cb/0x2b0 [ 246.956716] iomap_write_iter+0xfb/0x250 [ 246.956927] ? do_user_addr_fault+0x1d6/0x6a0 [ 246.957156] iomap_file_buffered_write+0x83/0xe0 [ 246.957430] blkdev_write_iter+0x1b9/0x270 [ 246.957669] vfs_write+0x2cb/0x410 [ 246.957924] ksys_write+0x5f/0xe0 [ 246.958133] do_syscall_64+0x5c/0xf0 [ 246.958395] ? exc_page_fault+0x62/0x150 [ 246.958638] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 246.958871] RIP: 0033:0x7f7f520fda57 [ 246.959443] RSP: 002b:00007ffd6b53bd38 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 246.959673] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 00007f7f520fda57 [ 246.959857] RDX: 000000000000000a RSI: 000055d4cb140000 RDI: 0000000000000001 [ 246.960003] RBP: 000055d4cb140000 R08: 000055d4cb140000 R09: 0000000000000077 [ 246.960128] R10: 000000000000006b R11: 0000000000000246 R12: 0000000000000000 [ 246.960253] R13: 0000000000000000 R14: 0000000000000000 R15: 000055d4cb140000 [ 246.960417] </TASK>
Additional info:
reproduced on RHEL9.4:
libvirt-10.0.0-6.3.el9_4.x86_64
qemu-kvm-8.2.0-11.el9_4.2.x86_64
kernel-5.14.0-427.1.1.el9_4.x86_64