Bug
Resolution: Unresolved
Major
4.19
Quality / Stability / Reliability
Critical
Rejected
Description of problem:
On a 4.19 nightly build, with OVS hardware offload enabled on worker nodes using ConnectX-5/BlueField-2 NICs, pod-to-pod traffic is not offloaded when the pods run on different nodes. A capture on the VF representor shows a large number of packets, and the measured bandwidth is much lower than on earlier releases (e.g. 4.18, 4.17).

[root@openshift-qe-026 offload_test]# oc get pods -o wide
NAME                                              READY   STATUS    RESTARTS   AGE     IP               NODE                                       NOMINATED NODE   READINESS GATES
iperf-rc-t7wxr                                    1/1     Running   0          4m45s   10.130.2.24      openshift-qe-029.lab.eng.rdu2.redhat.com   <none>           <none>
iperf-rc-tbbl6                                    1/1     Running   0          4m45s   10.131.2.9       openshift-qe-025.lab.eng.rdu2.redhat.com   <none>           <none>
iperf-server                                      1/1     Running   0          5m1s    10.131.2.8       openshift-qe-025.lab.eng.rdu2.redhat.com   <none>           <none>
openshift-qe-025labengrdu2redhatcom-debug-hl7k2   1/1     Running   0          4m13s   192.168.111.41   openshift-qe-025.lab.eng.rdu2.redhat.com   <none>           <none>

[root@openshift-qe-026 offload_test]# oc rsh iperf-rc-t7wxr iperf3 -c 10.131.2.8 -i 1 -t 20
Connecting to host 10.131.2.8, port 5201
[ 4] local 10.130.2.24 port 37242 connected to 10.131.2.8 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[ 4]   0.00-1.00   sec  1.13 GBytes  9.67 Gbits/sec  107    939 KBytes
[ 4]   1.00-2.00   sec  1.13 GBytes  9.73 Gbits/sec    8   1.06 MBytes
[ 4]   2.00-3.00   sec  1.16 GBytes  9.94 Gbits/sec   76    906 KBytes
[ 4]   3.00-4.00   sec  1.18 GBytes  10.1 Gbits/sec   24   1.08 MBytes
[ 4]   4.00-5.00   sec  1.12 GBytes  9.64 Gbits/sec   28    948 KBytes
[ 4]   5.00-6.00   sec  1.16 GBytes  9.98 Gbits/sec   24   1.13 MBytes
[ 4]   6.00-7.00   sec  1.16 GBytes  10.0 Gbits/sec   51    965 KBytes
[ 4]   7.00-8.00   sec   982 MBytes  8.24 Gbits/sec   42    586 KBytes
[ 4]   8.00-9.00   sec  1.13 GBytes  9.72 Gbits/sec   23   1.10 MBytes
[ 4]   9.00-10.00  sec  1.12 GBytes  9.59 Gbits/sec   66    877 KBytes
[ 4]  10.00-11.00  sec  1.05 GBytes  9.06 Gbits/sec   33   1019 KBytes
[ 4]  11.00-12.00  sec  1.12 GBytes  9.60 Gbits/sec   58    820 KBytes
[ 4]  12.00-13.00  sec  1.17 GBytes  10.0 Gbits/sec   31    983 KBytes
[ 4]  13.00-14.00  sec  1.18 GBytes  10.1 Gbits/sec   37   1.10 MBytes
[ 4]  14.00-15.00  sec  1.15 GBytes  9.92 Gbits/sec   47    943 KBytes
[ 4]  15.00-16.00  sec  1.13 GBytes  9.68 Gbits/sec   11   1.07 MBytes
[ 4]  16.00-17.00  sec  1.14 GBytes  9.80 Gbits/sec   33    875 KBytes
[ 4]  17.00-18.00  sec  1.15 GBytes  9.85 Gbits/sec   22   1.02 MBytes
[ 4]  18.00-19.00  sec  1.17 GBytes  10.1 Gbits/sec   15   1.11 MBytes
[ 4]  19.00-20.00  sec  1.16 GBytes  10.0 Gbits/sec   61    944 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[ 4]   0.00-20.00  sec  22.7 GBytes  9.74 Gbits/sec  797             sender
[ 4]   0.00-20.00  sec  22.7 GBytes  9.74 Gbits/sec                  receiver

iperf Done.
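For reference, a minimal sketch of how the non-offloaded path shows up on the worker node hosting iperf-server (these are illustrative checks, not output captured from this environment; ens3f0np0_2 is the VF representor found in step 4 of the reproduction below, and the commands assume the usual OVS tooling is present on the node):

# While the iperf test is running, sustained traffic on the VF representor means the flows were not offloaded:
tcpdump -i ens3f0np0_2 -c 100

# Datapath flows that were actually pushed down to the NIC:
ovs-appctl dpctl/dump-flows type=offloaded

# Flows still handled by the OVS kernel datapath (should carry almost no traffic for this test when offload works):
ovs-appctl dpctl/dump-flows type=ovs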
Version-Release number of selected component (if applicable):
4.19
How reproducible:
always
Steps to Reproduce:
1. Install the sriov-network-operator and enable OVS hardware offload.

2. Create the SriovNetworkNodePolicy and the NetworkAttachmentDefinition:

# cat sriovoffloadpolicy.yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: sriovoffloadpolicy
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice
  eSwitchMode: "switchdev"
  nicSelector:
    vendor: "15b3"
    pfNames:
    - ens3f0np0
  nodeSelector:
    feature.node.kubernetes.io/sriov-capable: "true"
  numVfs: 3
  priority: 5
  resourceName: sriovoffloadpolicy

# cat net-attach-def.yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: default
  namespace: offload-testing
  annotations:
    k8s.v1.cni.cncf.io/resourceName: openshift.io/sriovoffloadpolicy
spec:
  config: '{"cniVersion":"0.3.1","name":"ovn-kubernetes","type":"ovn-k8s-cni-overlay","ipam":{},"dns":{}}'

3. Create the iperf pods on the worker nodes:

# cat iperf-server.json
{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "iperf-server",
    "annotations": {
      "v1.multus-cni.io/default-network": "offload-testing/default"
    }
  },
  "spec": {
    "containers": [{
      "name": "iperf-server",
      "image": "quay.io/openshifttest/iperf3@sha256:440c59251338e9fcf0a00d822878862038d3b2e2403c67c940c7781297953614",
      "command": [ "iperf3" ],
      "args": [ "-s" ]
    }]
  }
}

# cat iperf-rc.json
{
  "apiVersion": "v1",
  "kind": "ReplicationController",
  "metadata": {
    "labels": {
      "name": "iperf-rc"
    },
    "name": "iperf-rc"
  },
  "spec": {
    "replicas": 2,
    "template": {
      "metadata": {
        "labels": {
          "name": "iperf-pods"
        },
        "annotations": {
          "v1.multus-cni.io/default-network": "offload-testing/default"
        }
      },
      "spec": {
        "containers": [
          {
            "image": "quay.io/openshifttest/iperf3@sha256:440c59251338e9fcf0a00d822878862038d3b2e2403c67c940c7781297953614",
            "name": "iperf-client",
            "imagePullPolicy": "IfNotPresent",
            "resources": {
              "limits": {
                "memory": "340Mi"
              }
            }
          }
        ]
      }
    }
  }
}

4. Find the VF representor name of the iperf-server pod:

# ovs-vsctl --columns=name find interface external_ids:iface-id=offload-testing_iperf-server
name : ens3f0np0_2

5. Send traffic from an iperf client pod to the iperf server pod, where the two pods run on different nodes:

oc rsh iperf-rc-t7wxr iperf3 -c 10.131.2.8 -i 1 -t 20

6. Capture packets on the iperf server's VF representor:

tcpdump -i ens3f0np0_2 -vvv
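As a sanity check for step 1 (an illustrative check, assuming shell access to the worker node, e.g. via oc debug node/...; ens3f0np0 is the PF named in the policy above), OVS hardware offload and tc offload on the PF can be confirmed roughly like this:

# OVS should be configured with hardware offload enabled (expected: "true"):
ovs-vsctl get Open_vSwitch . other_config:hw-offload

# The PF should have tc hardware offload enabled (expected: hw-tc-offload: on):
ethtool -k ens3f0np0 | grep hw-tc-offload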
Actual results:
A large number of packets are captured on the VF representor and the bandwidth is low; the traffic is not hardware-offloaded.
Expected results:
Only a few packets should be captured on the VF representor and the bandwidth should be noticeably higher, as on 4.18/4.17 where the traffic is offloaded.
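One way to quantify "only a few packets" (an assumed measurement approach, not taken from this report) is to compare the VF representor's interface counters before and after the 20-second iperf run; with working offload only the first packets of each flow hit the representor, so the counter delta should be negligible compared to the ~22 GBytes transferred:

# Run on the node hosting iperf-server, before and after the iperf run, and compare RX/TX packet counts:
ip -s link show ens3f0np0_2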
Additional info: