FDP-1666 (Fast Datapath Product)

Incorrect IP checksums with virtio ports

    • Bug
    • Resolution: Unresolved
    • Undefined
    • None
    • rhel-9
    • openvswitch3.6
    • None
    • 5
    • False
    • False
    • openvswitch3.6-3.6.0-6.el9fdp
    • rhel-9
    • None
    • rhel-net-ovs-dpdk
    • ssg_networking
    • OVS/DPDK - Sprint 9 - East
    • 1

       Problem Description:

      This issue was identified in an OCP Virt deployment.

      A CP node deployed as a virtual machine had no connectivity with another local virtual machine.
      The hypervisor networking involved a simple bridging of tap interfaces.

      In the CP node itself, the main interface (a virtio-net device) was plugged into an OVS userspace bridge with TSO enabled.

      Capturing the received and transmitted packets on the hypervisor side showed that the IP/TCP traffic sent by the CP node had wrong IP and TCP checksums.

      SYN packet (all good) sent by the local VM (tap3) to the CP node (tap6):

      14:05:40.527081 tap3  P   ifindex 14 52:54:00:aa:bb:13 ethertype IPv4 (0x0800), length 80: (tos 0x10, ttl 64, id 51169, offset 0, flags [DF], proto TCP (6), length 60)                                                                       
          192.168.158.28.56904 > 192.168.158.30.22: Flags [S], cksum 0xbdba (incorrect -> 0xb5fd), seq 866637909, win 64240, options [mss 1460,sackOK,TS val 3082041205 ecr 0,nop,wscale 7], length 0                                               
      14:05:40.527088 tap6  Out ifindex 21 52:54:00:aa:bb:13 ethertype IPv4 (0x0800), length 80: (tos 0x10, ttl 64, id 51169, offset 0, flags [DF], proto TCP (6), length 60)                                                                       
          192.168.158.28.56904 > 192.168.158.30.22: Flags [S], cksum 0xb5fd (correct), seq 866637909, win 64240, options [mss 1460,sackOK,TS val 3082041205 ecr 0,nop,wscale 7], length 0                                                           
      

      SYN+ACK reply packet (IP checksum == 0!) sent by the CP node (tap6) to the local VM (tap3):

      14:05:40.528387 tap6  P   ifindex 21 52:54:00:aa:bb:0e ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->7d30)!)                                                     
          192.168.158.30.22 > 192.168.158.28.56904: Flags [S.], cksum 0xbdba (incorrect -> 0x40ab), seq 293020318, ack 866637910, win 65160, options [mss 1460,sackOK,TS val 3141108314 ecr 3082041205,nop,wscale 7], length 0                      
      14:05:40.528394 tap3  Out ifindex 14 52:54:00:aa:bb:0e ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->7d30)!)                                                     
          192.168.158.30.22 > 192.168.158.28.56904: Flags [S.], cksum 0xbdba (incorrect -> 0x40ab), seq 293020318, ack 866637910, win 65160, options [mss 1460,sackOK,TS val 3141108314 ecr 3082041205,nop,wscale 7], length 0        
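
      As a cross-check of the capture above, here is a minimal Python sketch that recomputes the IPv4 header checksum of the SYN+ACK from the fields tcpdump printed. It assumes a 20-byte header with no IP options, so the TCP segment length is 60 - 20 = 40 bytes. The helper names (`ones_complement_sum`, `ipv4_header`, `ipv4_checksum`, `tcp_pseudo_sum`) are hypothetical and only serve this illustration. It also computes the one's complement sum of the TCP pseudo-header, which turns out to equal the 0xbdba value seen in the TCP checksum field. That would be consistent with a checksum left pending for offload rather than a corrupted one, but this is an inference from the arithmetic, not something the capture proves.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum over big-endian words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ipv4_header(src: str, dst: str, total_len: int, *, tos: int = 0,
                ident: int = 0, flags_frag: int = 0x4000, ttl: int = 64,
                proto: int = 6, csum: int = 0) -> bytes:
    """Build a 20-byte IPv4 header (assumption: no IP options)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, tos, total_len, ident, flags_frag, ttl, proto, csum,
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )


def ipv4_checksum(header: bytes) -> int:
    """Checksum to store in the header; the checksum field must be 0 on input."""
    return ~ones_complement_sum(header) & 0xFFFF


def tcp_pseudo_sum(src: str, dst: str, tcp_len: int) -> int:
    """Non-complemented sum of the TCP pseudo-header (addresses, proto 6, length)."""
    pseudo = struct.pack(
        "!4s4sBBH",
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
        0, 6, tcp_len,
    )
    return ones_complement_sum(pseudo)


# Fields taken from the SYN+ACK capture: tos 0x0, id 0, DF, ttl 64, length 60.
SRC, DST = "192.168.158.30", "192.168.158.28"
header = ipv4_header(SRC, DST, 60)

print(hex(ipv4_checksum(header)))         # 0x7d30, the value tcpdump expected
print(hex(tcp_pseudo_sum(SRC, DST, 40)))  # 0xbdba, the value seen on the wire
```

      The pseudo-header sum is symmetric in source and destination, which also explains why the same 0xbdba appears in both directions in the captures.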
      

      I suspect the problem is only noticed because the backend in the hypervisor is vhost-net rather than vhost-user.

      A workaround is to disable all kinds of checksum offload at the virtio level (in the VM XML).

       Software Versions:

      All versions of OVS are affected.

              rhn-support-dmarchan David Marchand
              Ting Li