Bug
Resolution: Won't Do
Priority: Normal
Severity: Moderate
Fix version: rhel-9.0.0
SST: rhel-sst-virtualization-networking
Pool team: ssg_virtualization
Doc type: Known Issue
Architecture: x86_64
Description of problem:
During live migration of a VM with a VF, I can not get ping replies from the guest immediately after the VF is hot-unplugged (the VM had not reached its downtime phase at this point).
Version-Release number of selected component (if applicable):
Host:
4.18.0-147.3.1.el8_1.x86_64
qemu-kvm-4.1.0-20.module+el8.1.1+5309+6d656f05.x86_64
Guest:
4.18.0-147.3.1.el8_1.x86_64
How reproducible:
10/10
Steps to Reproduce:
1. On the source host, create an 82599ES VF and set the MAC address of the VF:
ip link set enp6s0f0 vf 0 mac 22:2b:62:bb:a9:82
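To confirm the MAC was applied, the VF entry can be checked on the PF (a verification step not in the original procedure):
ip link show enp6s0f0
The vf 0 line should report MAC 22:2b:62:bb:a9:82.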
2. Start the source guest with the 82599ES VF, with failover enabled:
/usr/libexec/qemu-kvm -name rhel811 -M q35 -enable-kvm \
-monitor stdio \
-nodefaults \
-m 4G \
-boot menu=on \
-cpu Haswell-noTSX-IBRS \
-device pcie-root-port,id=root.1,chassis=1,addr=0x2.0,multifunction=on \
-device pcie-root-port,id=root.2,chassis=2,addr=0x2.1 \
-device pcie-root-port,id=root.3,chassis=3,addr=0x2.2 \
-device pcie-root-port,id=root.4,chassis=4,addr=0x2.3 \
-device pcie-root-port,id=root.5,chassis=5,addr=0x2.4 \
-device pcie-root-port,id=root.6,chassis=6,addr=0x2.5 \
-device pcie-root-port,id=root.7,chassis=7,addr=0x2.6 \
-device pcie-root-port,id=root.8,chassis=8,addr=0x2.7 \
-smp 2,sockets=1,cores=2,threads=2,maxcpus=4 \
-qmp tcp:0:6666,server,nowait \
-blockdev node-name=back_image,driver=file,cache.direct=on,cache.no-flush=off,filename=/nfsmount/migra_test/rhel811_q35.qcow2,aio=threads \
-blockdev node-name=drive-virtio-disk0,driver=qcow2,cache.direct=on,cache.no-flush=off,file=back_image \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=disk0,bus=root.1 \
-device VGA,id=video1,bus=root.2 \
-vnc :0 \
-netdev tap,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:2b:62:bb:a9:82,bus=root.3,failover=on \
-device vfio-pci,host=0000:06:10.0,id=hostdev0,bus=root.4,failover_pair_id=net0 \
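With the monitor on stdio as above, the devices can be double-checked before migrating (an optional sanity check, not part of the original steps):
(qemu) info network
(qemu) info pci
info network should list net0 with its tap backend, and info pci should show the assigned VF (hostdev0).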
3. Check the network info in the source guest:
- ifconfig
enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.73.33.236 netmask 255.255.254.0 broadcast 10.73.33.255
ether 22:2b:62:bb:a9:82 txqueuelen 1000 (Ethernet)
RX packets 28683 bytes 1961744 (1.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 93 bytes 13770 (13.4 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp3s0nsby: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether 22:2b:62:bb:a9:82 txqueuelen 1000 (Ethernet)
RX packets 28345 bytes 1924974 (1.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether 22:2b:62:bb:a9:82 txqueuelen 1000 (Ethernet)
RX packets 339 bytes 36836 (35.9 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 95 bytes 14406 (14.0 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
- ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 22:2b:62:bb:a9:82 brd ff:ff:ff:ff:ff:ff
3: enp3s0nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master enp3s0 state UP mode DEFAULT group default qlen 1000
link/ether 22:2b:62:bb:a9:82 brd ff:ff:ff:ff:ff:ff
4: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master enp3s0 state UP mode DEFAULT group default qlen 1000
link/ether 22:2b:62:bb:a9:82 brd ff:ff:ff:ff:ff:ff
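The failover transitions quoted later in this report can be watched live in the guest (an optional diagnostic, assuming util-linux dmesg):
- dmesg -w | grep -i failover
This catches messages such as the "failover primary slave:enp4s0 unregistered" line referenced in [1] below.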
4. On the target host, create a NetXtreme BCM57810 VF and set the MAC address of the VF:
ip link set enp131s0f0 vf 0 mac 22:2b:62:bb:a9:82
5. Start the target guest in listening mode to wait for the migration from the source guest:
...
-incoming tcp:0:5800 \
6. Keep pinging the VM during the migration:
- ping 10.73.33.236
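To make lost replies visible in real time rather than only as icmp_seq gaps, iputils ping can report outstanding replies (an optional variant of this step):
- ping -O 10.73.33.236
With -O, a "no answer yet" line is printed for each request that has not been answered before the next one is sent.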
7. Migrate the guest from the source host to the target host:
(qemu) migrate -d tcp:10.73.73.73:5800
The guest migrates successfully.
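To correlate the ping gap with the migration phases, the source monitor can be polled (optional):
(qemu) info migrate
The reported status moves from active to completed; per [2] below, ping only recovers once the migration completes.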
8. Check the ping output:
- ping 10.73.33.236
64 bytes from 10.73.33.236: icmp_seq=59 ttl=61 time=3.07 ms
64 bytes from 10.73.33.236: icmp_seq=60 ttl=61 time=4.35 ms
64 bytes from 10.73.33.236: icmp_seq=61 ttl=61 time=2.10 ms
64 bytes from 10.73.33.236: icmp_seq=62 ttl=61 time=4.53 ms[1]
64 bytes from 10.73.33.236: icmp_seq=88 ttl=61 time=7.39 ms[2]
64 bytes from 10.73.33.236: icmp_seq=89 ttl=61 time=4.35 ms
64 bytes from 10.73.33.236: icmp_seq=90 ttl=61 time=5.82 ms
64 bytes from 10.73.33.236: icmp_seq=91 ttl=61 time=4.39 ms
[1]
When "virtio_net virtio1 enp3s0: failover primary slave:enp4s0 unregistered" appears in the source guest's dmesg, ping stops working until the migration completes. In this run, icmp_seq 63 through 87 were lost, roughly 25 seconds of outage at the default 1 s ping interval.
[2]
Once the migration completes, ping works again.
Actual results:
when "virtio_net virtio1 enp3s0: failover primary slave:enp4s0 unregistered" is outputed in source guest vm dmesg,ping will not work until the migration is completed.
Expected results:
Ping should keep working throughout the migration, because the hypervisor is supposed to fail traffic over to the virtio-net standby datapath when the VF is unplugged.
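One way to check whether the standby datapath actually takes over (a diagnostic sketch using the interface names from step 3, not something done in the original report) is to watch the standby interface's TX counters in the guest during the unplug window:
- watch -n1 'ip -s link show enp3s0nsby'
If failover worked as expected, TX packets on enp3s0nsby would start increasing as soon as enp4s0 is unregistered; the outage observed here suggests traffic is not flowing over the standby path during that window.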
Additional info:
(1)
- lspci | grep -i 82599
06:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
06:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
(2)
This problem can also be reproduced with a NetXtreme II BCM57810 VF.
(3)
This problem can also be reproduced in RHEL82-AV.
The test environment info is as follows:
host:
qemu-kvm-4.2.0-4.module+el8.2.0+5220+e82621dc.x86_64
4.18.0-167.el8.x86_64
guest:
4.18.0-167.el8.x86_64