-
Bug
-
Resolution: Unresolved
-
Normal
-
rhel-10.0.beta
-
None
-
Moderate
-
rhel-sst-virtualization-storage
-
ssg_virtualization
-
8
-
QE ack
-
False
-
-
None
-
Red Hat Enterprise Linux
-
None
-
-
x86_64
-
Windows
-
None
Clone from RHEL-49790 [Planning]: virtio_error() deadlocks by taking BQL in iothread
RHEL-30894 describes a case where a virtio-blk iothread sees an invalid virtqueue state on shutdown. It then calls virtio_error(), which attempts to take the BQL. This deadlocks because the main thread is holding the BQL while shutting down the virtio-blk device and waiting for the iothread:
Thread 3 (Thread 0x7f609fe006c0 (LWP 20651) "qemu-system-x86"):
#0  futex_wait (futex_word=0x5612d225bb28 <bql>, expected=2, private=0) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0x5612d225bb28 <bql>, private=0) at lowlevellock.c:49
#2  0x00007f60abc48b01 in lll_mutex_lock_optimized (mutex=0x5612d225bb28 <bql>) at pthread_mutex_lock.c:48
#3  ___pthread_mutex_lock (mutex=0x5612d225bb28 <bql>) at pthread_mutex_lock.c:93
#4  0x00005612d11334e5 in qemu_mutex_lock_impl (mutex=0x5612d225bb28 <bql>, file=0x5612d1249b9b "../system/physmem.c", line=2696) at ../util/qemu-thread-posix.c:94
#5  0x00005612d0a6c117 in bql_lock_impl (file=0x5612d1249b9b "../system/physmem.c", line=2696) at ../system/cpus.c:536
#6  0x00005612d0db3d32 in prepare_mmio_access (mr=0x5613023a7360) at ../system/physmem.c:2696
#7  0x00005612d0db6a97 in address_space_stl_internal (as=0x56130226e690, addr=4276093720, val=0, attrs=..., result=0x0, endian=DEVICE_LITTLE_ENDIAN) at ../system/memory_ldst.c.inc:318
#8  0x00005612d0db6c07 in address_space_stl_le (as=0x56130226e690, addr=4276093720, val=0, attrs=..., result=0x0) at ../system/memory_ldst.c.inc:357
#9  0x00005612d0914bdf in pci_msi_trigger (dev=0x56130226e450, msg=...) at ../hw/pci/pci.c:364
#10 0x00005612d090963b in msi_send_message (dev=0x56130226e450, msg=...) at ../hw/pci/msi.c:380
#11 0x00005612d090b3fd in msix_notify (dev=0x56130226e450, vector=0) at ../hw/pci/msix.c:542
#12 0x00005612d0a09156 in virtio_pci_notify (d=0x56130226e450, vector=0) at ../hw/virtio/virtio-pci.c:77
#13 0x00005612d0d5366b in virtio_notify_vector (vdev=0x561302276870, vector=0) at ../hw/virtio/virtio.c:2014
#14 0x00005612d0d54e4d in virtio_notify_config (vdev=0x561302276870) at ../hw/virtio/virtio.c:2544
#15 0x00005612d0d4f740 in virtio_error (vdev=0x561302276870, fmt=0x5612d123f005 "Guest says index %u is available") at ../hw/virtio/virtio.c:3725
#16 0x00005612d0d5aa20 in virtqueue_get_head (vq=0x561302281790, idx=397, head=0x7f609fdf545c) at ../hw/virtio/virtio.c:1048
#17 0x00005612d0d523b5 in virtqueue_split_pop (vq=0x561302281790, sz=240) at ../hw/virtio/virtio.c:1541
#18 0x00005612d0d519a9 in virtqueue_pop (vq=0x561302281790, sz=240) at ../hw/virtio/virtio.c:1796
#19 0x00005612d0cdafc6 in virtio_blk_get_request (s=0x561302276870, vq=0x561302281790) at ../hw/block/virtio-blk.c:177
#20 0x00005612d0cdaea7 in virtio_blk_handle_vq (s=0x561302276870, vq=0x561302281790) at ../hw/block/virtio-blk.c:988
#21 0x00005612d0ce1163 in virtio_blk_handle_output (vdev=0x561302276870, vq=0x561302281790) at ../hw/block/virtio-blk.c:1022
#22 0x00005612d0d5784f in virtio_queue_notify_vq (vq=0x561302281790) at ../hw/virtio/virtio.c:2299
#23 0x00005612d0d57641 in virtio_queue_host_notifier_aio_poll_ready (n=0x561302281804) at ../hw/virtio/virtio.c:3591
#24 0x00005612d112de6c in aio_dispatch_handler (ctx=0x5613010264a0, node=0x7f5e74006680) at ../util/aio-posix.c:356
#25 0x00005612d112da05 in aio_dispatch_ready_handlers (ctx=0x5613010264a0, ready_list=0x7f609fdfb7b0) at ../util/aio-posix.c:401
#26 0x00005612d112d6bf in aio_poll (ctx=0x5613010264a0, blocking=true) at ../util/aio-posix.c:723
#27 0x00005612d0efc3ec in iothread_run (opaque=0x5613003c2800) at ../iothread.c:63
#28 0x00005612d1134ad4 in qemu_thread_start (args=0x561301026b80) at ../util/qemu-thread-posix.c:541
#29 0x00007f60abc45507 in start_thread (arg=<optimized out>) at pthread_create.c:447
#30 0x00007f60abcc940c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
Thread 1 (Thread 0x7f60a4e0bd40 (LWP 20649) "qemu-system-x86"):
#0  0x00007f60abcbbbb0 in __GI_ppoll (fds=0x5613003ddaf0, nfds=1, timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:42
#1  0x00005612d115f1c0 in qemu_poll_ns (fds=0x5613003ddaf0, nfds=1, timeout=-1) at ../util/qemu-timer.c:339
#2  0x00005612d112f2cc in fdmon_poll_wait (ctx=0x561300d46c60, ready_list=0x7ffd35a29210, timeout=-1) at ../util/fdmon-poll.c:79
#3  0x00005612d112d3be in aio_poll (ctx=0x561300d46c60, blocking=true) at ../util/aio-posix.c:670
#4  0x00005612d1164f54 in aio_wait_bh_oneshot (ctx=0x5613010264a0, cb=0x5612d0ce2770 <virtio_blk_ioeventfd_stop_vq_bh>, opaque=0x561302281790) at ../util/aio-wait.c:85
#5  0x00005612d0ce0ec2 in virtio_blk_stop_ioeventfd (vdev=0x561302276870) at ../hw/block/virtio-blk.c:1777
#6  0x00005612d0a078a8 in virtio_bus_stop_ioeventfd (bus=0x5613022767f0) at ../hw/virtio/virtio-bus.c:259
#7  0x00005612d0a0beac in virtio_pci_stop_ioeventfd (proxy=0x56130226e450) at ../hw/virtio/virtio-pci.c:380
#8  0x00005612d0a09d33 in virtio_pci_vmstate_change (d=0x56130226e450, running=false) at ../hw/virtio/virtio-pci.c:1355
#9  0x00005612d0d569f9 in virtio_vmstate_change (opaque=0x561302276870, running=false, state=RUN_STATE_SHUTDOWN) at ../hw/virtio/virtio.c:3243
#10 0x00005612d0a7e141 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../system/runstate.c:380
#11 0x00005612d0a6b8fd in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../system/cpus.c:301
#12 0x00005612d0a6b870 in vm_shutdown () at ../system/cpus.c:319
#13 0x00005612d0a7ef12 in qemu_cleanup (status=0) at ../system/runstate.c:876
#14 0x00005612d103540f in qemu_default_main () at ../system/main.c:38
#15 0x00005612d1035448 in main (argc=57, argv=0x7ffd35a295b8) at ../system/main.c:48
I reproduced the problem with a slightly simplified (but likely not minimal yet) command line compared to the report in RHEL-30894:
./x86_64-softmmu/qemu-system-x86_64 \
    -name SUTINT9402564 \
    -cpu Broadwell-noTSX,vmx=on,hv_stimer,hv_synic,hv_time,hv_vpindex,hv_relaxed,hv_spinlocks=0xfff,hv_vapic,hv_frequencies,hv_runtime,hv_tlbflush,hv_reenlightenment,hv_stimer_direct,hv_ipi,hv-vendor-id=KVMtest \
    -enable-kvm -nodefaults \
    -m 8G -smp 6,cores=6 \
    -k en-us -boot menu=on \
    -uuid 65d4d34b-dc6f-4a52-b9f9-8a3fce3af235 \
    -device piix3-usb-uhci,id=usb -device usb-tablet,id=tablet0 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x3 \
    -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x3.0x1 \
    -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x3.0x2 \
    -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x3.0x3 \
    -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x3.0x4 \
    -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/home/kwolf/images/Fedora-Cloud-Base-39-1.5.x86_64.qcow2,node-name=system_file \
    -blockdev driver=qcow2,node-name=drive_system_disk,file=system_file \
    -object iothread,id=thread0 \
    -device virtio-blk-pci,iothread=thread0,drive=drive_system_disk,id=virtio-disk0,bootindex=1,bus=pci.4,disable-legacy=on,disable-modern=off,iommu_platform=on,ats=on \
    -vga std -monitor stdio \
    -blockdev node-name=file_ovmf_code,driver=file,filename=pc-bios/edk2-x86_64-secure-code.fd,auto-read-only=on,discard=unmap -blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
    -blockdev node-name=file_ovmf_vars,driver=file,filename=pc-bios/edk2-i386-vars.fd,auto-read-only=on,discard=unmap -blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
    -machine q35,kernel-irqchip=split,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
    -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=off,eim=on
To trigger the problem, the IOMMU must also be enabled on the guest kernel command line with intel_iommu=on.
- clones
-
RHEL-49790 virtio_error() deadlocks by taking BQL in iothread [rhel-9.5]
- Planning
- is triggered by
-
RHEL-30894 Guest power off failed with iommu_platform=on and enable iothread [rhel-9.5]
- Planning