Fast Datapath Product / FDP-867

[CX7] Excessive TCP retransmissions and low tput observed when using CX7 card as transmitter


    • Type: Bug
    • Resolution: Not a Bug
    • openvswitch3.3

      Version info:

      openvswitch3.3-3.3.0-8.el10fdp.x86_64.rpm, RHEL-10.0-20241002.2

      [root@wsfd-advnetlab33 ~]# uname -r
      6.11.0-0.rc5.23.el10.x86_64

      CX7 card info:

      [root@wsfd-advnetlab33 ~]# ethtool -i enp175s0f0np0
      driver: mlx5_core
      version: 6.11.0-0.rc5.23.el10.x86_64
      firmware-version: 28.98.2402 (MT_0000000841)

      This problem appears to occur only when the client/DUT in the topology (i.e. the traffic transmitter) uses a CX7 card as the geneve IPv6 endpoint. It happens in both bare-metal and VM tests.

      I originally logged this (or a similar) issue for RHEL-8 and 9 in https://bugzilla.redhat.com/show_bug.cgi?id=2108290 so I suspect this issue could be card/driver related. 

      Same issue being tracked for RHEL-9/OVS 3.1: https://issues.redhat.com/browse/FDP-868

      sos report: https://netqe-infra01.knqe.eng.rdu2.dc.redhat.com/sosreports/sosreport-wsfd-advnetlab33-2024-10-12-kmbtbrz.tar.xz

      ovs logs are attached to this bug.

      Beaker job: https://beaker.engineering.redhat.com/jobs/10004668

      Log showing detailed results using multiple frame sizes and MTUs: https://beaker-archive.prod.engineering.redhat.com/beaker-logs/2024/10/100046/10004668/17202273/185258351/864798914/resultoutputfile.log

      Steps to reproduce:

      Install wireshark and iperf3

      Configuration:

      Server (CX5-EX card as geneve IPv6 endpoint)

      ip link set mtu 1500 dev enp216s0f0np0
      ip addr show enp216s0f0np0
      ip addr add 0/24 dev enp216s0f0np0
      ip addr add 2001:0db8:251::2/64 dev enp216s0f0np0
      ip addr show enp216s0f0np0
      ip link add geneve0 type geneve id 88 remote 2001:0db8:251::1
      ip link set geneve0 up
      ovs-vsctl add-port ovsbr0 geneve0
      sleep 1
      ip link set mtu 1430 dev ovsbr0
      ip link set mtu 1430 dev geneve0
      ovs-vsctl show
      ip addr add 172.31.252.2/24 dev ovsbr0
      ip addr add 2001:0db8:252::2/64 dev ovsbr0
      ip -d link show enp216s0f0np0 
      ip -d link show geneve0
      ip -d link show ovsbr0
      ip addr show enp216s0f0np0
      ip addr show ovsbr0

      Launch iperf3 server

      iperf3 -s &

      Client (DUT using CX7 card as geneve IPv6 endpoint)

      ip link set mtu 1500 dev enp175s0f0np0
      ip addr show enp175s0f0np0
      ip addr add 0/24 dev enp175s0f0np0
      ip addr add 2001:0db8:251::1/64 dev enp175s0f0np0
      ip addr show enp175s0f0np0
      ip link add geneve0 type geneve id 88 remote 2001:0db8:251::2
      ip link set geneve0 up
      ovs-vsctl add-port ovsbr0 geneve0
      sleep 1
      ip link set mtu 1430 dev ovsbr0
      ip link set mtu 1430 dev geneve0
      ovs-vsctl show
      ip addr add 172.31.252.1/24 dev ovsbr0
      ip addr add 2001:0db8:252::1/64 dev ovsbr0
      ip -d link show enp175s0f0np0
      ip -d link show geneve0
      ip -d link show ovsbr0
      ip addr show enp175s0f0np0
      ip addr show ovsbr0
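
      The 1430-byte MTU set on geneve0 and ovsbr0 on both hosts is consistent with the encapsulation overhead of geneve over IPv6 on a 1500-byte underlay. A rough check, assuming a geneve header without options (the header sizes below are the standard lengths, not values taken from this report):

```shell
# Overhead of geneve over IPv6, assuming no geneve options are in use.
underlay_mtu=1500  # MTU set on the physical ports in the steps above
outer_ipv6=40      # fixed IPv6 header
outer_udp=8        # UDP header
geneve_hdr=8       # geneve base header, no options (assumption)
inner_eth=14       # inner Ethernet header carried inside the tunnel

overhead=$((outer_ipv6 + outer_udp + geneve_hdr + inner_eth))
tunnel_mtu=$((underlay_mtu - overhead))
echo "overhead=${overhead} tunnel_mtu=${tunnel_mtu}"
```

      If geneve options were in use, the overhead would be larger and the tunnel MTU would need to be reduced accordingly.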

      Test connectivity:
      ping -c3 172.31.252.2

      Launch tshark capture:

      tshark -i ovsbr0 -f 'src 172.31.252.1 and tcp' -w /home/tshark.pcap &
      sleep 1

      Launch iperf3 client to transmit TCP traffic for 1 second:
      iperf3 -4 -c 172.31.252.2 -l 64 -t 1

      Connecting to host 172.31.252.2, port 5201
      [  5] local 172.31.252.1 port 36796 connected to 172.31.252.2 port 5201
      [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
      [  5]   0.00-1.00   sec   647 KBytes  5.29 Mbits/sec   70   2.69 KBytes
      - - - - - - - - - - - - - - - - - - - - - - - - -
      [ ID] Interval           Transfer     Bitrate         Retr
      [  5]   0.00-1.00   sec   647 KBytes  5.29 Mbits/sec   70             sender
      [  5]   0.00-1.00   sec   409 KBytes  3.34 Mbits/sec                  receiver

      iperf Done.
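
      The sender and receiver totals used later for the goodput figure can be pulled from the iperf3 summary lines. A minimal sketch follows; parse_iperf3_summary is a hypothetical helper, not part of the test suite, and it assumes a single stream with both totals reported in KBytes as in the output above:

```shell
# Hypothetical helper: read iperf3 text output on stdin and print
# "<sender KBytes> <receiver KBytes> <retransmits>".
# Assumes one stream and that both transfer totals use the same unit.
parse_iperf3_summary() {
    awk '
        { sub(/^[ \t]*\[[^]]*\]/, "") }   # drop the leading stream id column
        $NF == "sender"   { sent = $3; retr = $7 }
        $NF == "receiver" { received = $3 }
        END { print sent, received, retr }
    '
}

# Summary lines copied from the run above.
summary=$(parse_iperf3_summary <<'EOF'
[  5]   0.00-1.00   sec   647 KBytes  5.29 Mbits/sec   70             sender
[  5]   0.00-1.00   sec   409 KBytes  3.34 Mbits/sec                  receiver
EOF
)
echo "$summary"
```

      For longer runs iperf3 switches units (MBytes, GBytes), so a real parser would need to normalize them before comparing the two totals.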

      tshark_retransmits=$(tshark -r /home/tshark.pcap -z 'io,stat,0,frame.len && tcp.analysis.retransmission || tcp.analysis.fast_retransmission || tcp.analysis.spurious_retransmission' | grep -A2 Bytes | tail -1 | awk '{print $6}')

      sender_total_frames=$(tshark -r /home/tshark.pcap -z 'io,stat,0,frame.len && ip.src == 172.31.252.1' | grep -A2 Bytes | tail -1 | awk '{print $6}')

      iperf3 retransmits: 70
      tshark_retransmits=65
      sender_total_frames=429

      iperf3 reported retransmit: 16.32% (70/429); the test sets the TCP retransmit threshold at 2%
      tshark reported retransmit: 15.15% (65/429); the test sets the TCP retransmit threshold at 2%
      iperf3 reported "goodput":  63.21% (409/647); the test sets the iperf3 goodput threshold at 90%
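
      The percentages above can be reproduced from the raw counts. The sketch below uses two hypothetical helpers, pct and exceeds, and assumes the test fails when the retransmit rate is above 2% or the goodput is below 90%; the actual test code may do this differently:

```shell
# pct NUM DEN: print NUM/DEN as a percentage with two decimals.
pct() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.2f", (n / d) * 100 }'
}

# exceeds VALUE LIMIT: succeed when VALUE is numerically greater than LIMIT.
exceeds() {
    awk -v v="$1" -v l="$2" 'BEGIN { exit !(v > l) }'
}

# Counts taken from the run above.
iperf3_retr=70; tshark_retr=65; sender_frames=429
sent_kbytes=647; received_kbytes=409

iperf3_pct=$(pct "$iperf3_retr" "$sender_frames")
tshark_pct=$(pct "$tshark_retr" "$sender_frames")
goodput_pct=$(pct "$received_kbytes" "$sent_kbytes")

echo "iperf3 retransmit: ${iperf3_pct}%"
echo "tshark retransmit: ${tshark_pct}%"
echo "goodput:           ${goodput_pct}%"

# Assumed pass/fail rules: retransmits above 2% or goodput below 90% fail.
if exceeds "$iperf3_pct" 2; then echo "FAIL: iperf3 retransmit rate above 2%"; fi
if exceeds "$tshark_pct" 2; then echo "FAIL: tshark retransmit rate above 2%"; fi
if exceeds 90 "$goodput_pct"; then echo "FAIL: goodput below 90%"; fi
```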

              ovsdpdk-triage ovsdpdk triage
              ralongi@redhat.com Rick Alongi