  1. Fast Datapath Product
  2. FDP-868

[CX7] Excessive TCP retransmissions and low tput observed when using CX7 card as transmitter

    • Type: Bug
    • Resolution: Not a Bug
    • openvswitch3.1

      Version info:

      openvswitch3.1-3.1.0-130.el9fdp.x86_64.rpm, RHEL-9.2.0-updates-20241012.1

      [root@wsfd-advnetlab33 ~]# uname -r
      5.14.0-284.89.1.el9_2.x86_64

      CX7 card info:

      [root@wsfd-advnetlab33 ~]# ethtool -i enp175s0f0np0
      driver: mlx5_core
      version: 5.14.0-284.89.1.el9_2.x86_64
      firmware-version: 28.98.2402 (MT_0000000841)

      This problem appears to occur only when the topology client/DUT (i.e., the traffic transmitter) uses a CX7 card as a Geneve IPv6 endpoint. It happens in both bare-metal and VM tests.

      I originally logged this (or a similar) issue for RHEL-8 and RHEL-9 in https://bugzilla.redhat.com/show_bug.cgi?id=2108290, so I suspect it could be card/driver related.

      Same issue being tracked for RHEL-10/OVS 3.3: https://issues.redhat.com/browse/FDP-867

      sos report: https://netqe-infra01.knqe.eng.rdu2.dc.redhat.com/sosreports/sosreport-wsfd-advnetlab33-2024-10-12-kzecbqw.tar.xz

      OVS logs are attached to this bug.

      Beaker job: https://beaker.engineering.redhat.com/jobs/10008169

      Link to log showing detailed results using multiple frame sizes and MTUs: https://beaker.engineering.redhat.com/recipes/17207666/tasks/185293300/results/864964102/logs/resultoutputfile.log

      Steps to reproduce:

      Install wireshark and iperf3

      Configuration:

      Server (CX5-EX card as geneve IPv6 endpoint)

      ip link set mtu 1500 dev enp216s0f0np0
      ip addr show enp216s0f0np0
      ip addr add 0/24 dev enp216s0f0np0
      ip addr add 2001:0db8:251::2/64 dev enp216s0f0np0
      ip addr show enp216s0f0np0
      ip link add geneve0 type geneve id 88 remote 2001:0db8:251::1
      ip link set geneve0 up
      ovs-vsctl add-port ovsbr0 geneve0
      sleep 1
      ip link set mtu 1430 dev ovsbr0
      ip link set mtu 1430 dev geneve0
      ovs-vsctl show
      ip addr add 172.31.252.2/24 dev ovsbr0
      ip addr add 2001:0db8:252::2/64 dev ovsbr0
      ip -d link show enp216s0f0np0 
      ip -d link show geneve0
      ip -d link show ovsbr0
      ip addr show enp216s0f0np0
      ip addr show ovsbr0

      Launch iperf3 server

      iperf3 -s &

      Client (DUT using CX7 card as geneve IPv6 endpoint)

      ip link set mtu 1500 dev enp175s0f0np0
      ip addr show enp175s0f0np0
      ip addr add 0/24 dev enp175s0f0np0
      ip addr add 2001:0db8:251::1/64 dev enp175s0f0np0
      ip addr show enp175s0f0np0
      ip link add geneve0 type geneve id 88 remote 2001:0db8:251::2
      ip link set geneve0 up
      ovs-vsctl add-port ovsbr0 geneve0
      sleep 1
      ip link set mtu 1430 dev ovsbr0
      ip link set mtu 1430 dev geneve0
      ovs-vsctl show
      ip addr add 172.31.252.1/24 dev ovsbr0
      ip addr add 2001:0db8:252::1/64 dev ovsbr0
      ip -d link show enp175s0f0np0
      ip -d link show geneve0
      ip -d link show ovsbr0
      ip addr show enp175s0f0np0
      ip addr show ovsbr0
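
The 1430-byte MTU set on geneve0 and ovsbr0 on both hosts is consistent with the encapsulation overhead of Geneve over IPv6 on a 1500-byte underlay. A minimal sketch of that arithmetic follows, assuming no Geneve options are carried; the variable names are illustrative and do not come from the test script.

```shell
# Overhead of Geneve over IPv6, assuming no Geneve options (illustrative names).
underlay_mtu=1500   # MTU set on the physical ports above
ipv6_hdr=40         # outer IPv6 header
udp_hdr=8           # outer UDP header
geneve_hdr=8        # Geneve base header without options
inner_eth=14        # inner Ethernet header carried inside the tunnel

overhead=$((ipv6_hdr + udp_hdr + geneve_hdr + inner_eth))
tunnel_mtu=$((underlay_mtu - overhead))

echo "overhead=${overhead} tunnel_mtu=${tunnel_mtu}"
```

If Geneve options were in use, the overhead would grow and the tunnel MTU would need to shrink accordingly.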

      Test connectivity:
      ping -c3 172.31.252.2

      Launch tshark capture:

      tshark -i ovsbr0 -f 'src 172.31.252.1 and tcp' -w /home/tshark.pcap &
      sleep 1

      Launch iperf3 client to transmit TCP traffic for 1 second:

      [root@wsfd-advnetlab33 ~]# iperf3 -4 -c 172.31.252.2 -l 64 -t 1
      Connecting to host 172.31.252.2, port 5201
      [  5] local 172.31.252.1 port 41958 connected to 172.31.252.2 port 5201
      [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
      [  5]   0.00-1.00   sec   202 KBytes  1.65 Mbits/sec    3   8.67 KBytes
      - - - - - - - - - - - - - - - - - - - - - - - - -
      [ ID] Interval           Transfer     Bitrate         Retr
      [  5]   0.00-1.00   sec   202 KBytes  1.65 Mbits/sec    3             sender
      [  5]   0.00-1.04   sec  64.0 Bytes   492 bits/sec                  receiver

      iperf Done.

      tshark_retransmits=$(tshark -r /home/tshark.pcap -z 'io,stat,0,frame.len && tcp.analysis.retransmission || tcp.analysis.fast_retransmission || tcp.analysis.spurious_retransmission' | grep -A2 Bytes | tail -1 | awk '{print $6}')

      sender_total_frames=$(tshark -r /home/tshark.pcap -z 'io,stat,0,frame.len && ip.src == 172.31.252.1' | grep -A2 Bytes | tail -1 | awk '{print $6}')

      iperf3 retransmits: 3
      tshark_retransmits=9
      sender_total_frames=37

      iperf3-reported retransmit rate: 8.11% (3/37). The test sets the TCP retransmit threshold at 2%.
      tshark-reported retransmit rate: 24.32% (9/37). The test sets the TCP retransmit threshold at 2%.
      iperf3-reported "goodput": 0.03% (64/202000). The test sets the iperf3 goodput threshold at 90%.
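
The percentages above can be reproduced from the raw counts. The following is a rough sketch only, not the actual test script: the helper names `pct` and `exceeds` are hypothetical, the counts are hard-coded from this run, and the sender byte count is taken as 202000 to match the ratio quoted in the report.

```shell
# Values reported by the run above (hard-coded for illustration).
iperf3_retransmits=3
tshark_retransmits=9
sender_total_frames=37
sender_bytes=202000     # 202 KBytes, taken as 202000 as in the report
receiver_bytes=64

# pct NUM DEN: print NUM/DEN as a percentage with two decimals (hypothetical helper).
pct() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.2f", (d > 0) ? 100 * n / d : 0 }'
}

# exceeds VALUE LIMIT: succeed when VALUE is greater than LIMIT (hypothetical helper).
exceeds() {
    awk -v v="$1" -v l="$2" 'BEGIN { exit !(v > l) }'
}

iperf3_pct=$(pct "$iperf3_retransmits" "$sender_total_frames")
tshark_pct=$(pct "$tshark_retransmits" "$sender_total_frames")
goodput_pct=$(pct "$receiver_bytes" "$sender_bytes")

echo "iperf3 retransmit rate: ${iperf3_pct}%"
echo "tshark retransmit rate: ${tshark_pct}%"
echo "iperf3 goodput: ${goodput_pct}%"

# Thresholds from the report: retransmits at most 2%, goodput at least 90%.
if exceeds "$iperf3_pct" 2; then echo "FAIL: iperf3 retransmit rate above 2%"; fi
if exceeds "$tshark_pct" 2; then echo "FAIL: tshark retransmit rate above 2%"; fi
if exceeds 90 "$goodput_pct"; then echo "FAIL: goodput below 90%"; fi
```

Using awk for the division keeps the fractional part, which plain shell integer arithmetic would drop.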

              Assignee: ovsdpdk triage (ovsdpdk-triage)
              Reporter: Rick Alongi (ralongi@redhat.com)