-
Bug
-
Resolution: Unresolved
-
Critical
-
rhel-10.1
-
qemu-kvm-10.0.0-8.el10
-
No
-
Important
-
2
-
rhel-virt-storage
-
ssg_virtualization
-
24
-
5
-
False
-
False
-
-
None
-
virt-storage Sprint 6, Planning backlog
-
Pass
-
RegressionOnly
-
Unspecified
-
Unspecified
-
Unspecified
-
-
All
-
None
What were you trying to do that didn't work?
Based on an upstream report from Andrey Drobyshev
https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg04421.html
What is the impact of this issue to you?
QEMU can deadlock while running block jobs, which can interfere with live migration.
Please provide the package NVR for which the bug is seen:
How reproducible is this bug?:
Sporadic
Steps to reproduce
- Per Andrey's email:
1. Run QEMU:
> SRCDIR=/path/to/srcdir
>
> $SRCDIR/build/qemu-system-x86_64 -enable-kvm \
> -machine q35 -cpu Nehalem \
> -name guest=alma8-vm,debug-threads=on \
> -m 2g -smp 2 \
> -nographic -nodefaults \
> -qmp unix:/var/run/alma8-qmp.sock,server=on,wait=off \
> -serial unix:/var/run/alma8-serial.sock,server=on,wait=off \
> -object iothread,id=iothread0 \
> -blockdev node-name=disk,driver=qcow2,file.driver=file,file.filename=/path/to/img/alma8.qcow2 \
> -device virtio-blk-pci,drive=disk,iothread=iothread0
2. Launch IO (random reads) from within the guest:
> nc -U /var/run/alma8-serial.sock
> ...
> [root@alma8-vm ~]# fio --name=randread --ioengine=libaio --direct=1 --bs=4k --size=1G --numjobs=1 --time_based=1 --runtime=300 --group_reporting --rw=randread --iodepth=1 --filename=/testfile
3. Run snapshot creation and removal of the lower snapshot in a loop (script attached):
> while /bin/true ; do ./remove_lower_snap.sh ; done
- where remove_lower_snap.sh is:
#!/bin/bash
SRCDIR=/path/to/srcdir
STORDIR=/path/to/img
SNAP1=$STORDIR/snap1.qcow2
SNAP2=$STORDIR/snap2.qcow2
QMPSHELL=$SRCDIR/scripts/qmp/qmp-shell
QMPSOCK=/var/run/alma8-qmp.sock

function qmp_filter()
{
    sed -r '/^(Welcome|Connected)/d'
}

function waitjob()
{
    jobid=$1
    while /bin/true ; do
        qbjout=$($QMPSHELL -p $QMPSOCK <<EOF
query-block-jobs
EOF
)
        jobstatus=$(echo "$qbjout" | grep '"status"' | head -1 | awk '{print $2}' | sed 's/[",]//g')
        if [ "x${jobstatus}" == "xready" ] ; then
            echo -e "\n######### Complete job $jobid #########\n"
            $QMPSHELL -p $QMPSOCK <<EOF | qmp_filter
job-complete id=$jobid
EOF
        elif [ "x${jobstatus}" == "xconcluded" ] ; then
            echo -e "\n######### Dismiss job $jobid #########\n"
            $QMPSHELL -p $QMPSOCK <<EOF | qmp_filter
job-dismiss id=$jobid
EOF
        elif [ "x${jobstatus}" == "x" ] ; then
            break
        fi
        sleep 0.5
    done
}

echo -e "\n######### Create snapshot images #########\n"
qemu-img create -f qcow2 $SNAP1 16G
qemu-img create -f qcow2 $SNAP2 16G

echo -e "\n######### Create 1st snapshot #########\n"
$QMPSHELL -p $QMPSOCK <<EOF | qmp_filter
blockdev-add driver=qcow2 node-name=snap1 file={"driver":"file","filename":"$SNAP1"}
blockdev-snapshot node=disk overlay=snap1
EOF

echo -e "\n######### Create 2nd snapshot #########\n"
$QMPSHELL -p $QMPSOCK <<EOF | qmp_filter
blockdev-add driver=qcow2 node-name=snap2 file={"driver":"file","filename":"$SNAP2"}
blockdev-snapshot node=snap1 overlay=snap2
EOF

echo -e "\n######### Commit lower snapshot #########\n"
$QMPSHELL -p $QMPSOCK <<EOF | qmp_filter
block-commit device=snap2 top-node=snap1 base-node=disk auto-finalize=true auto-dismiss=false job-id=commit-snap1
EOF

waitjob commit-snap1

echo -e "\n######### Commit remaining snapshot #########\n"
$QMPSHELL -p $QMPSOCK <<EOF | qmp_filter
block-commit device=snap2 top-node=snap2 base-node=disk auto-finalize=true auto-dismiss=false job-id=commit-snap2
EOF

waitjob commit-snap2

echo -e "\n######### Remove unneeded snapshot nodes #########\n"
$QMPSHELL -p $QMPSOCK <<EOF | qmp_filter
blockdev-del node-name=snap1
blockdev-del node-name=snap2
EOF

echo -e "\n######### Done! #########\n"
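The status parsing that waitjob relies on can be checked without a running guest. The sketch below wraps the same grep/head/awk/sed pipeline in a helper and feeds it a hand-written reply. The helper name `first_job_status` and the sample JSON are assumptions made for illustration; the sample is not captured qmp-shell output, and real replies carry more fields.

```shell
#!/bin/bash
# Hypothetical helper: read pretty-printed query-block-jobs output on stdin
# and print the status of the first job, using the same pipeline as waitjob.
first_job_status()
{
    grep '"status"' | head -1 | awk '{print $2}' | sed 's/[",]//g'
}

# Hand-written sample reply (illustrative assumption, not real output).
sample_reply='{
    "return": [
        {
            "device": "commit-snap1",
            "status": "ready",
            "type": "commit"
        }
    ]
}'

# A job in the "ready" state yields the bare word "ready".
echo "$sample_reply" | first_job_status

# An empty job list yields no output, which is the case where waitjob
# breaks out of its polling loop.
echo '{ "return": [] }' | first_job_status
```

Note that only the first "status" line is inspected, so the loop follows one job at a time; that matches the reproducer, which never runs two block jobs concurrently.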
Expected results
No hang
Actual results
Running the script in a loop can hit a deadlock.
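Because the hang is sporadic, it may help to put a deadline on each iteration so the loop stops by itself once an iteration stalls. A minimal sketch, assuming GNU coreutils `timeout` is available; the helper name `run_with_deadline` and the 120 second limit are arbitrary choices for illustration and are not part of the upstream reproducer.

```shell
#!/bin/bash
# Hypothetical helper: run a command under a deadline given in seconds.
# timeout(1) exits with status 124 when it had to kill the command; in that
# case print a note. The command's (or timeout's) status is returned as-is.
run_with_deadline()
{
    local limit=$1
    shift
    timeout "$limit" "$@"
    local rc=$?
    if [ "$rc" -eq 124 ] ; then
        echo "no completion within ${limit}s: possible hang"
    fi
    return "$rc"
}

# Hypothetical usage against the reproducer (120 is an arbitrary limit):
#   i=0
#   while run_with_deadline 120 ./remove_lower_snap.sh ; do i=$((i + 1)) ; done
#   echo "stopped after $i clean iterations"
```

The deadline only stops the script; the QEMU process is left running, so its threads can still be inspected afterwards, for example with gdb's `thread apply all bt`.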
- links to
-
RHBA-2025:147447 qemu-kvm bug fix and enhancement update