-
Sub-task
-
Resolution: Done-Errata
-
openvswitch3.5-3.5.2-50.el9fdp
-
rhel-9
-
rhel-net-ovs-dpdk
-
ssg_networking
-
OVS/DPDK - Sprint 9 - East, OVS/DPDK - Sprint 10 - East
-
2
Problem Description: Clearly explain the issue.
This issue was identified in an OCP Virt deployment.
A CP node deployed as a virtual machine had no connectivity with another local virtual machine.
The hypervisor networking involved a simple bridging of tap interfaces.
/--------------------\    /-------------------------\    /----------\
| CP virtual machine |    | Linux host              |    | Other VM |
|                    |    |                         |    |          |
|   br-ex -- dpdk0 --+----+-- tap6 -- br0 -- tap3 --+----+-- eth0   |
|                    |    |                         |    |          |
\--------------------/    \-------------------------/    \----------/
In the CP virtual machine, OVS DPDK is installed and configured:
- dpdk0 is a DPDK port backed by the virtio-net PCI device, which must be bound to vfio-pci,
- br-ex is a userspace (netdev) bridge,
- userspace TSO is enabled.
On the hypervisor Linux host, br0 is a standard kernel bridge. Neither DPDK nor OVS is involved on the host.
In Other VM, eth0 is a plain virtio-net PCI device bound to the kernel driver. DPDK is not involved in this virtual machine.
Traffic is sent from br-ex in the CP VM to the eth0 interface in Other VM, on the 192.168.158.0/24 subnet.
Packet captures taken on the hypervisor, covering both received and transmitted packets, show that the IP/TCP traffic sent by the CP node carries incorrect IP and TCP checksums.
SYN packet (all good) sent by the local VM (tap3) to the CP node (tap6):
14:05:40.527081 tap3 P ifindex 14 52:54:00:aa:bb:13 ethertype IPv4 (0x0800), length 80: (tos 0x10, ttl 64, id 51169, offset 0, flags [DF], proto TCP (6), length 60)
192.168.158.28.56904 > 192.168.158.30.22: Flags [S], cksum 0xbdba (incorrect -> 0xb5fd), seq 866637909, win 64240, options [mss 1460,sackOK,TS val 3082041205 ecr 0,nop,wscale 7], length 0
14:05:40.527088 tap6 Out ifindex 21 52:54:00:aa:bb:13 ethertype IPv4 (0x0800), length 80: (tos 0x10, ttl 64, id 51169, offset 0, flags [DF], proto TCP (6), length 60)
192.168.158.28.56904 > 192.168.158.30.22: Flags [S], cksum 0xb5fd (correct), seq 866637909, win 64240, options [mss 1460,sackOK,TS val 3082041205 ecr 0,nop,wscale 7], length 0
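A note on the 0xbdba value reported as "incorrect" on tap3: it appears to be the folded, non-inverted sum of the TCP pseudo-header, which is what a sender leaves in the checksum field when it relies on partial checksum offload. The sketch below recomputes it from the addresses and lengths in the capture. The helper names are hypothetical, and it assumes a 20-byte IP header without options, so the TCP length is 60 - 20 = 40 bytes.

```python
import ipaddress
import struct


def fold16(total: int) -> int:
    """Fold a sum into 16 bits using one's-complement end-around carry."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_pseudo_header_sum(src: str, dst: str, tcp_length: int) -> int:
    """Return the folded, non-inverted sum of the IPv4 TCP pseudo-header."""
    pseudo = (
        ipaddress.IPv4Address(src).packed
        + ipaddress.IPv4Address(dst).packed
        + struct.pack("!BBH", 0, 6, tcp_length)  # zero byte, proto TCP, TCP length
    )
    words = struct.unpack("!6H", pseudo)
    return fold16(sum(words))


# Values taken from the capture: IP total length 60, assumed 20-byte IP header.
partial = tcp_pseudo_header_sum("192.168.158.28", "192.168.158.30", 60 - 20)
print(hex(partial))  # 0xbdba
```

The sum does not depend on the order of source and destination, which would explain why the same 0xbdba shows up in both directions. On the SYN it is completed to 0xb5fd before the packet leaves on tap6; in the SYN+ACK capture it is still 0xbdba on tap3 Out, which is consistent with the TCP checksum never having been completed for traffic sent by the CP node.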
SYN+ACK reply packet (IP checksum is 0) sent by the CP node (tap6) to the local VM (tap3):
14:05:40.528387 tap6 P ifindex 21 52:54:00:aa:bb:0e ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->7d30)!)
192.168.158.30.22 > 192.168.158.28.56904: Flags [S.], cksum 0xbdba (incorrect -> 0x40ab), seq 293020318, ack 866637910, win 65160, options [mss 1460,sackOK,TS val 3141108314 ecr 3082041205,nop,wscale 7], length 0
14:05:40.528394 tap3 Out ifindex 14 52:54:00:aa:bb:0e ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->7d30)!)
192.168.158.30.22 > 192.168.158.28.56904: Flags [S.], cksum 0xbdba (incorrect -> 0x40ab), seq 293020318, ack 866637910, win 65160, options [mss 1460,sackOK,TS val 3141108314 ecr 3082041205,nop,wscale 7], length 0
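The expected IP checksum that tcpdump reports for the SYN+ACK (7d30) can be cross-checked by hand. The sketch below rebuilds the IPv4 header from the fields shown in the capture (tos 0x0, id 0, DF, ttl 64, proto TCP, length 60) with the checksum field left at 0, as it was on the wire. The function name is hypothetical, and a 20-byte header without options is assumed.

```python
import ipaddress
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """One's-complement checksum over an IPv4 header whose checksum field is 0."""
    words = struct.unpack("!%dH" % (len(header) // 2), header)
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Header fields as printed by tcpdump for the SYN+ACK seen on tap6.
header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45,    # version 4, IHL 5 (assumed: no IP options)
    0x00,    # tos 0x0
    60,      # total length
    0,       # id 0
    0x4000,  # flags [DF], fragment offset 0
    64,      # ttl
    6,       # proto TCP
    0,       # checksum field as seen in the capture
    ipaddress.IPv4Address("192.168.158.30").packed,
    ipaddress.IPv4Address("192.168.158.28").packed,
)
expected = ipv4_header_checksum(header)
print(hex(expected))  # 0x7d30
```

The result matches the value tcpdump suggests, so the header itself is well formed and only the checksum field was left unfilled.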
I suspect the problem is only noticed because the backend in the hypervisor is vhost-net rather than vhost-user.
A workaround is to disable all kinds of checksum offload at the virtio level (in the VM XML).
Software Versions: Specify the exact versions in use (e.g.,openvswitch3.1-3.1.0-147.el8fdp).
All versions of OVS are affected.
- links to
-
RHBA-2025:154848
openvswitch3.5 bug fix and enhancement update