  1. Fast Datapath Product
  2. FDP-1129

[hwol][mlx5_core] Reply traffic over tunnel is not offloaded


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • None
    • rhel-10
    • openvswitch3.5
    • None
    • False
    • False
    • rhel-10
    • rhel-sst-network-fastdatapath
    • ssg_networking

       Problem Description: Clearly explain the issue.

      On the DUT, an IPv4 VXLAN tunnel is built on an OVS bridge. The peer uses a kernel VXLAN tunnel to receive packets and send replies. With hw-offload enabled, the reply packets can still be captured on the VF representor (vf_rep), which means they are not offloaded.

       Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).

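      Hardware offload is broken for reply traffic arriving over the tunnel: the flow that decapsulates the replies stays in software (not_in_hw), so these packets are not offloaded.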

       Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).

      openvswitch-selinux-extra-policy-1.0-34.el10fdp.noarch
      openvswitch3.5-3.5.0-0.15.el10fdp.x86_64

      Kernel: 6.12.0-46.el10.x86_64

        Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).

      New issue for OVS 3.5 on RHEL 10.

       Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.

      100%

       Reproduction Steps: Provide detailed steps or scripts to replicate the issue.

      On the DUT, run the script below:

       

      #!/bin/bash
      # setup eswitch
      nic_test=myeth_1
      nic_vf1=enp130s0f0v0
      nic_rep0=eth0
      pci_nic="0000:82:00.0"
      pci_vf="0000:82:00.2"
      
      # install scapy
      wget http://netqe-bj.usersys.redhat.com/share/tools/scapy-v2.5.0.zip
      unzip scapy-v2.5.0.zip
      pushd ./scapy-2.5.0
      python setup.py install
      popd
      
      # cleanup
      echo 0 > /sys/bus/pci/devices/$pci_nic/sriov_numvfs
      sleep 1
      ethtool -K ${nic_test} hw-tc-offload on
      devlink dev eswitch set pci/${pci_nic} mode legacy
      devlink dev param set pci/${pci_nic} name flow_steering_mode value smfs cmode runtime
      # set VF mac address
      ip link set $nic_test vf 0 mac 00:de:ad:01:01:01
      # set to switchdev mode
      echo 1 > /sys/bus/pci/devices/$pci_nic/sriov_numvfs
      echo $pci_vf > /sys/bus/pci/drivers/mlx5_core/unbind
      devlink dev eswitch set pci/$pci_nic mode switchdev
      echo $pci_vf > /sys/bus/pci/drivers/mlx5_core/bind
      ip netns add ns1
      ip link set $nic_vf1 netns ns1
      ip netns exec ns1 ip link set $nic_vf1 up
      ip netns exec ns1 ip link set mtu $((1500-50)) dev $nic_vf1
      ip netns exec ns1 ip addr add 192.168.1.1/24 dev $nic_vf1
      sleep 10
      
      ip addr add 10.0.100.1/24 dev ${nic_test}
      # create ovsbr0
      ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
      ovs-vsctl add-br ovsbr0
      ovs-vsctl add-port ovsbr0 $nic_rep0
      ovs-vsctl add-port ovsbr0 vxlan100 -- set interface vxlan100 type=vxlan options:key=100 options:remote_ip=10.0.100.2
      
      # keep ping running in the background so the checks below see live traffic
      ip netns exec ns1 ping 192.168.1.2 &
      
      tc -s filter show dev eth0 ingress
      tc -s filter show dev eth0 egress
      tc -s filter show dev myeth_1 ingress
      tc -s filter show dev myeth_1 egress
      ovs-appctl dpctl/dump-flows -m --names
      timeout 10 tcpdump -nnev -i eth0 -c10 "icmp" 
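
      The flow dump at the end of the script is read by eye. As a minimal sketch, it could be filtered so that only datapath flows lacking `offloaded:yes` are printed. The function name `non_offloaded_flows` is hypothetical, and the filter assumes the output format shown in this report (one flow per line, starting with `ufid:`).

```shell
#!/bin/bash
# Hypothetical helper (not part of OVS): print the ufid of every datapath
# flow read from stdin that is NOT marked "offloaded:yes".
# Assumes one flow per line, as in `ovs-appctl dpctl/dump-flows -m` output.
non_offloaded_flows() {
    grep -E '^[[:space:]]*ufid:' \
        | grep -v 'offloaded:yes' \
        | sed -E 's/^[[:space:]]*ufid:([0-9a-f-]+),.*/\1/'
}

# Usage on the DUT (needs a running OVS):
#   ovs-appctl dpctl/dump-flows -m --names | non_offloaded_flows
```

      An empty result would mean every dumped flow reports `offloaded:yes`; any printed ufid identifies a flow to inspect further.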

      On the peer, run the script below:

       

       

      ip link set ens2f0 up
      nmcli dev set ens2f0 managed no
      sysctl -w net.ipv6.conf.ens2f0.router_solicitations=0
      ip addr flush ens2f0
      ip link set ens2f0 up
      ip link set ens2f0 mtu 9120
      ip addr add 10.0.100.2/24 dev ens2f0
      
      ip link add name vxlan0 type vxlan id 100 local 10.0.100.2 remote 10.0.100.1 dev ens2f0 dstport 4789
      nmcli dev set vxlan0 managed no
      sysctl -w net.ipv6.conf.vxlan0.router_solicitations=0
      ip addr flush vxlan0
      ip link set vxlan0 up
      ip link set vxlan0 mtu 9120
      ip addr add 192.168.1.2/24 dev vxlan0 

       

       

       Expected Behavior: Describe what should happen under normal circumstances.

      On the DUT, tcpdump on the VF representor should not capture ICMP packets, because all packets should be offloaded to hardware.

       Observed Behavior: Explain what actually happens.

      tcpdump on eth0 (the VF representor) captures the ICMP echo reply packets:

       

      [root@dell-per730-51 ~]#  timeout 10 tcpdump -nnev -i eth0 -c10 "icmp"
      dropped privs to tcpdump
      tcpdump: listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
      03:57:32.702847 22:54:fa:fe:43:30 > 00:de:ad:01:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 61906, offset 0, flags [none], proto ICMP (1), length 84)
          192.168.1.2 > 192.168.1.1: ICMP echo reply, id 2547, seq 337, length 64
      03:57:33.703526 22:54:fa:fe:43:30 > 00:de:ad:01:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 62480, offset 0, flags [none], proto ICMP (1), length 84)
          192.168.1.2 > 192.168.1.1: ICMP echo reply, id 2547, seq 338, length 64
      03:57:34.749921 22:54:fa:fe:43:30 > 00:de:ad:01:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 63230, offset 0, flags [none], proto ICMP (1), length 84)
          192.168.1.2 > 192.168.1.1: ICMP echo reply, id 2547, seq 339, length 64
      03:57:35.773860 22:54:fa:fe:43:30 > 00:de:ad:01:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 63909, offset 0, flags [none], proto ICMP (1), length 84)
          192.168.1.2 > 192.168.1.1: ICMP echo reply, id 2547, seq 340, length 64
      03:57:36.797900 22:54:fa:fe:43:30 > 00:de:ad:01:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 64109, offset 0, flags [none], proto ICMP (1), length 84)
          192.168.1.2 > 192.168.1.1: ICMP echo reply, id 2547, seq 341, length 64
      03:57:37.821884 22:54:fa:fe:43:30 > 00:de:ad:01:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 64299, offset 0, flags [none], proto ICMP (1), length 84)
          192.168.1.2 > 192.168.1.1: ICMP echo reply, id 2547, seq 342, length 64
      03:57:38.845866 22:54:fa:fe:43:30 > 00:de:ad:01:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 65202, offset 0, flags [none], proto ICMP (1), length 84)
          192.168.1.2 > 192.168.1.1: ICMP echo reply, id 2547, seq 343, length 64
      03:57:39.869884 22:54:fa:fe:43:30 > 00:de:ad:01:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 65424, offset 0, flags [none], proto ICMP (1), length 84)
          192.168.1.2 > 192.168.1.1: ICMP echo reply, id 2547, seq 344, length 64
      03:57:40.893902 22:54:fa:fe:43:30 > 00:de:ad:01:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 696, offset 0, flags [none], proto ICMP (1), length 84)
          192.168.1.2 > 192.168.1.1: ICMP echo reply, id 2547, seq 345, length 64
      03:57:41.917880 22:54:fa:fe:43:30 > 00:de:ad:01:01:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 752, offset 0, flags [none], proto ICMP (1), length 84)
          192.168.1.2 > 192.168.1.1: ICMP echo reply, id 2547, seq 346, length 64
      10 packets captured
      10 packets received by filter
      0 packets dropped by kernel
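
      The pass/fail criterion above (no ICMP packets visible on the representor) could be checked from the tcpdump summary instead of by reading the capture. The sketch below uses a hypothetical function, `captured_count`, and assumes the summary line has the form "N packets captured" as in the output above. tcpdump normally writes that summary to stderr, so the usage line redirects it.

```shell
#!/bin/bash
# Hypothetical helper: read tcpdump output on stdin and print the number
# from the "N packets captured" summary line (prints 0 if it is absent).
captured_count() {
    awk '/packets? captured/ { n = $1 } END { print n + 0 }'
}

# Usage on the DUT (summary goes to stderr, hence 2>&1):
#   n=$(timeout 10 tcpdump -nn -i eth0 -c10 icmp 2>&1 | captured_count)
#   [ "$n" -eq 0 ] && echo "no ICMP seen on representor" || echo "$n packets seen in software"
```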
       

       

       

       Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.

       

      [root@dell-per730-51 ~]# ovs-appctl dpctl/dump-flows -m --names
      ufid:40c0a7a5-1747-4a8d-a5e4-ffcb4d8ba6a7, recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(eth0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),packet_type(ns=0/0,id=0/0),eth(src=00:de:ad:01:01:01,dst=22:54:fa:fe:43:30),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0x3,ttl=0/0,frag=no), packets:274, bytes:26852, used:0.090s, offloaded:yes, dp:tc, actions:set(tunnel(tun_id=0x64,dst=10.0.100.2,ttl=64,tp_dst=4789,flags(df|key))),vxlan_sys_4789
      ufid:82367acd-c178-428e-b364-68f6975f3aba, recirc_id(0),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x64,src=10.0.100.2,dst=10.0.100.1,ttl=0/0,tp_dst=4789,flags(-df+csum+key)),in_port(vxlan_sys_4789),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),packet_type(ns=0/0,id=0/0),eth(src=22:54:fa:fe:43:30,dst=00:de:ad:01:01:01),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:274, bytes:23016, used:0.470s, dp:tc, actions:eth0
      
       [root@dell-per730-51 ~]# tc -s filter show dev vxlan_sys_4789 ingress
      filter protocol ip pref 2 flower chain 0 
      filter protocol ip pref 2 flower chain 0 handle 0x1 
        dst_mac 00:de:ad:01:01:01
        src_mac 22:54:fa:fe:43:30
        eth_type ipv4
        enc_dst_ip 10.0.100.1
        enc_src_ip 10.0.100.2
        enc_key_id 100
        enc_dst_port 4789
        enc_tos 0
        ip_flags nofrag
        not_in_hw
       action order 1: tunnel_key  unset pipe
        index 4 ref 1 bind 1 installed 1109 sec used 0 sec firstused 1108 sec
       Action statistics:
       Sent 91056 bytes 1084 pkt (dropped 0, overlimits 0 requeues 0) 
       backlog 0b 0p requeues 0
       no_percpu
       action order 2: mirred (Egress Redirect to device eth0) stolen
       index 5 ref 1 bind 1 installed 1109 sec used 0 sec firstused 1108 sec
       Action statistics:
       Sent 91056 bytes 1084 pkt (dropped 0, overlimits 0 requeues 0) 
       backlog 0b 0p requeues 0
       cookie cd7a36828e4278c1f66864b3ba3a5f97
       no_percpu
      filter protocol arp pref 5 flower chain 0 
      filter protocol arp pref 5 flower chain 0 handle 0x1 
        dst_mac 00:de:ad:01:01:01
        src_mac 22:54:fa:fe:43:30
        eth_type arp
        enc_dst_ip 10.0.100.1
        enc_src_ip 10.0.100.2
        enc_key_id 100
        enc_dst_port 4789
        enc_tos 0
        not_in_hw
       action order 1: tunnel_key  unset pipe
        index 1 ref 1 bind 1 installed 1 sec used 1 sec
       Action statistics:
       Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
       backlog 0b 0p requeues 0
       no_percpu
       action order 2: mirred (Egress Redirect to device eth0) stolen
       index 1 ref 1 bind 1 installed 1 sec used 1 sec
       Action statistics:
       Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
       backlog 0b 0p requeues 0
       cookie 18d1309c8b44228f9789b782f81bf582
       no_percpu
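
      Both filters above carry the `not_in_hw` flag. As a rough sketch, the flags in `tc -s filter show` output could be tallied per device with a helper like the one below. `hw_flag_summary` is a hypothetical name, and the parsing assumes each flag appears as the first word on its own line, as in the output above.

```shell
#!/bin/bash
# Hypothetical helper: read `tc -s filter show` output on stdin and print
# how many filters are flagged in_hw versus not_in_hw.
hw_flag_summary() {
    awk '
        $1 == "in_hw"     { hw++ }   # filter installed in hardware
        $1 == "not_in_hw" { sw++ }   # filter handled in software only
        END { printf "in_hw=%d not_in_hw=%d\n", hw, sw }
    '
}

# Usage on the DUT:
#   tc -s filter show dev vxlan_sys_4789 ingress | hw_flag_summary
```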

       

       

       Logs: If you collected logs please provide them (e.g. sos report, /var/log/openvswitch/* , testpmd console)

      dmesg log

      [ +34.024078] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [Feb 3 03:59] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +15.128749] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +11.495074] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [Feb 3 04:00] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +37.887731] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [Feb 3 04:01] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +12.520344] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [Feb 3 04:02] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +19.688171] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +18.199522] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [Feb 3 04:03] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +42.775455] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [Feb 3 04:04] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +12.520000] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [Feb 3 04:05] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +22.760216] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +12.055504] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [Feb 3 04:06] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [ +36.631542] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attributes
      [Feb 3 04:07] mlx5_core 0000:82:00.0 myeth_1: Failed to parse tunnel attribute 
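
      To track whether the driver error keeps recurring while the test runs, the kernel log could be counted rather than scanned by eye. This is only a sketch: `tunnel_parse_failures` is a hypothetical name, and the pattern is taken from the message text shown above.

```shell
#!/bin/bash
# Hypothetical helper: count mlx5_core "Failed to parse tunnel attribute..."
# messages in kernel log text read from stdin.
tunnel_parse_failures() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable under "set -e" while still printing the count.
    grep -c 'mlx5_core .*Failed to parse tunnel attribute' || true
}

# Usage on the DUT:
#   dmesg | tunnel_parse_failures
```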

       

              ovsdpdk-triage ovsdpdk triage
              mhou@redhat.com Minxi Hou