OpenShift Bugs / OCPBUGS-55988

[OVS HWoffload] traffic cross nodes is not offloaded

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Affects Version: 4.19
    • Component: Networking / SR-IOV
    • Quality / Stability / Reliability
    • Critical
    • Rejected

      Description of problem:

    On a 4.19 nightly build, with OVS hardware offload enabled on worker nodes using ConnectX-5/BlueField-2 NICs, pod-to-pod traffic is not offloaded when the pods are on different nodes. A capture on the VF representor shows a large number of packets, and traffic bandwidth is much lower than on earlier releases (e.g. 4.18, 4.17).
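A quick way to confirm whether datapath flows are actually hardware-offloaded is `ovs-appctl dpctl/dump-flows -m` on the affected node: flows handled in hardware are tagged `offloaded:yes` (`dp:tc`), while software-path flows show `dp:ovs`. A minimal sketch, counting offloaded flows in a hypothetical, abbreviated sample of that output (on a real node, pipe the live command instead of the sample variable):

```shell
# Hypothetical abbreviated sample of `ovs-appctl dpctl/dump-flows -m` output;
# on the worker node, replace the sample with the live command, e.g.:
#   ovs-appctl dpctl/dump-flows -m | grep -c 'offloaded:yes'
sample='recirc_id(0),in_port(2),... actions:3, dp:tc, offloaded:yes
recirc_id(0),in_port(4),... actions:5, dp:ovs'
offloaded=$(printf '%s\n' "$sample" | grep -c 'offloaded:yes')
echo "offloaded flows: $offloaded"
```

On a healthy offload setup, the bulk of the iperf flows should appear with `offloaded:yes`; in this bug they stay on the `dp:ovs` software path.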
      
      [root@openshift-qe-026 offload_test]# oc get pods -o wide
      NAME                                              READY   STATUS    RESTARTS   AGE     IP               NODE                                       NOMINATED NODE   READINESS GATES
      iperf-rc-t7wxr                                    1/1     Running   0          4m45s   10.130.2.24      openshift-qe-029.lab.eng.rdu2.redhat.com   <none>           <none>
      iperf-rc-tbbl6                                    1/1     Running   0          4m45s   10.131.2.9       openshift-qe-025.lab.eng.rdu2.redhat.com   <none>           <none>
      iperf-server                                      1/1     Running   0          5m1s    10.131.2.8       openshift-qe-025.lab.eng.rdu2.redhat.com   <none>           <none>
      openshift-qe-025labengrdu2redhatcom-debug-hl7k2   1/1     Running   0          4m13s   192.168.111.41   openshift-qe-025.lab.eng.rdu2.redhat.com   <none>           <none>
      
      [root@openshift-qe-026 offload_test]# oc rsh iperf-rc-t7wxr iperf3 -c 10.131.2.8 -i 1 -t 20
      Connecting to host 10.131.2.8, port 5201
      [  4] local 10.130.2.24 port 37242 connected to 10.131.2.8 port 5201
      [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
      [  4]   0.00-1.00   sec  1.13 GBytes  9.67 Gbits/sec  107    939 KBytes       
      [  4]   1.00-2.00   sec  1.13 GBytes  9.73 Gbits/sec    8   1.06 MBytes       
      [  4]   2.00-3.00   sec  1.16 GBytes  9.94 Gbits/sec   76    906 KBytes       
      [  4]   3.00-4.00   sec  1.18 GBytes  10.1 Gbits/sec   24   1.08 MBytes       
      [  4]   4.00-5.00   sec  1.12 GBytes  9.64 Gbits/sec   28    948 KBytes       
      [  4]   5.00-6.00   sec  1.16 GBytes  9.98 Gbits/sec   24   1.13 MBytes       
      [  4]   6.00-7.00   sec  1.16 GBytes  10.0 Gbits/sec   51    965 KBytes       
      [  4]   7.00-8.00   sec   982 MBytes  8.24 Gbits/sec   42    586 KBytes       
      [  4]   8.00-9.00   sec  1.13 GBytes  9.72 Gbits/sec   23   1.10 MBytes       
      [  4]   9.00-10.00  sec  1.12 GBytes  9.59 Gbits/sec   66    877 KBytes       
      [  4]  10.00-11.00  sec  1.05 GBytes  9.06 Gbits/sec   33   1019 KBytes       
      [  4]  11.00-12.00  sec  1.12 GBytes  9.60 Gbits/sec   58    820 KBytes       
      [  4]  12.00-13.00  sec  1.17 GBytes  10.0 Gbits/sec   31    983 KBytes       
      [  4]  13.00-14.00  sec  1.18 GBytes  10.1 Gbits/sec   37   1.10 MBytes       
      [  4]  14.00-15.00  sec  1.15 GBytes  9.92 Gbits/sec   47    943 KBytes       
      [  4]  15.00-16.00  sec  1.13 GBytes  9.68 Gbits/sec   11   1.07 MBytes       
      [  4]  16.00-17.00  sec  1.14 GBytes  9.80 Gbits/sec   33    875 KBytes       
      [  4]  17.00-18.00  sec  1.15 GBytes  9.85 Gbits/sec   22   1.02 MBytes       
      [  4]  18.00-19.00  sec  1.17 GBytes  10.1 Gbits/sec   15   1.11 MBytes       
      [  4]  19.00-20.00  sec  1.16 GBytes  10.0 Gbits/sec   61    944 KBytes       
      - - - - - - - - - - - - - - - - - - - - - - - - -
      [ ID] Interval           Transfer     Bandwidth       Retr
      [  4]   0.00-20.00  sec  22.7 GBytes  9.74 Gbits/sec  797             sender
      [  4]   0.00-20.00  sec  22.7 GBytes  9.74 Gbits/sec                  receiver
      
      
      iperf Done.

      Version-Release number of selected component (if applicable):

          4.19

      How reproducible:

      always

      Steps to Reproduce:

      1. install the sriov-operator and enable OVS hwoffload
      2. create a SriovNetworkNodePolicy and a NetworkAttachmentDefinition
      # cat sriovoffloadpolicy.yaml
      apiVersion: sriovnetwork.openshift.io/v1
      kind: SriovNetworkNodePolicy
      metadata:
        name: sriovoffloadpolicy
        namespace: openshift-sriov-network-operator
      spec:
        deviceType: netdevice
        eSwitchMode: "switchdev"
        nicSelector:
          vendor: "15b3"
          pfNames:
          - ens3f0np0
        nodeSelector:
          feature.node.kubernetes.io/sriov-capable: "true"
        numVfs: 3
        priority: 5
        resourceName: sriovoffloadpolicy
      
      # cat net-attach-def.yaml
      apiVersion: "k8s.cni.cncf.io/v1"
      kind: NetworkAttachmentDefinition
      metadata:
        name: default
        namespace: offload-testing
        annotations:
          k8s.v1.cni.cncf.io/resourceName: openshift.io/sriovoffloadpolicy
      spec:
        config: '{"cniVersion":"0.3.1","name":"ovn-kubernetes","type":"ovn-k8s-cni-overlay","ipam":{},"dns":{}}'
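Before creating the test pods, it can help to verify that the policy actually moved the PF eswitch into switchdev mode; `devlink dev eswitch show` reports the current mode. A sketch against a hypothetical output line (the PCI address is an assumption and will differ per node; run the real command on the worker):

```shell
# Hypothetical `devlink dev eswitch show pci/<PF address>` output; on the
# worker node, run the real command with the PF's actual PCI address.
out='pci/0000:5e:00.0: mode switchdev inline-mode none encap-mode basic'
case "$out" in
  *'mode switchdev'*) mode=switchdev ;;
  *)                  mode=legacy ;;
esac
echo "eswitch mode: $mode"
```

If the mode is still `legacy`, hardware offload cannot work regardless of the OVS configuration, so this is worth ruling out first.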
      
      3. create iperf pods on worker nodes
      # cat iperf-server.json
      {
        "kind": "Pod",
        "apiVersion": "v1",
        "metadata": {
          "name": "iperf-server",
          "annotations": {
            "v1.multus-cni.io/default-network": "offload-testing/default"
          }
        },
        "spec": {
          "containers": [{
            "name": "iperf-server",
            "image": "quay.io/openshifttest/iperf3@sha256:440c59251338e9fcf0a00d822878862038d3b2e2403c67c940c7781297953614",
            "command": ["iperf3"],
            "args": ["-s"]
          }]
        }
      }

      # cat iperf-rc.json
      {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {
          "labels": { "name": "iperf-rc" },
          "name": "iperf-rc"
        },
        "spec": {
          "replicas": 2,
          "template": {
            "metadata": {
              "labels": { "name": "iperf-pods" },
              "annotations": {
                "v1.multus-cni.io/default-network": "offload-testing/default"
              }
            },
            "spec": {
              "containers": [
                {
                  "image": "quay.io/openshifttest/iperf3@sha256:440c59251338e9fcf0a00d822878862038d3b2e2403c67c940c7781297953614",
                  "name": "iperf-client",
                  "imagePullPolicy": "IfNotPresent",
                  "resources": {
                    "limits": { "memory": "340Mi" }
                  }
                }
              ]
            }
          }
        }
      }
      
      4. find the VF rep name of the iperf-server pod
      # ovs-vsctl --columns=name find interface external_ids:iface-id=offload-testing_iperf-server
      name                : ens3f0np0_2
      
      5. send traffic from the iperf client pod to the iperf server pod, which are on different nodes
      oc rsh iperf-rc-t7wxr iperf3 -c 10.131.2.8 -i 1 -t 20
      
      6. capture packets on the iperf-server VF rep
      tcpdump -i ens3f0np0_2 -vvv
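The tcpdump check in step 6 can be reduced to a rough pass/fail heuristic: when offload works, only the first packets of each connection should hit the VF representor, so a 20-second iperf run should leave the capture count low. A sketch using a hypothetical summary line (the packet count is invented for illustration; on the node, use the real "packets captured" summary that tcpdump prints on exit):

```shell
# Hypothetical tcpdump exit summary; substitute the real summary printed by
# e.g. `timeout 20 tcpdump -i ens3f0np0_2 -nn` on the VF representor.
summary='412345 packets captured'
count=${summary%% *}
# With offload working, only flow-setup packets reach the representor;
# hundreds of thousands during an iperf run means the traffic is not offloaded.
if [ "$count" -gt 1000 ]; then verdict='not offloaded'; else verdict='offloaded'; fi
echo "$count packets on representor: $verdict"
```

The threshold of 1000 is an arbitrary illustrative cutoff, not a documented limit; the point is the orders-of-magnitude difference between offloaded and non-offloaded captures.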

      Actual results:

          Far too many packets are captured on the VF rep, and bandwidth is much lower than in 4.18/4.17.

      Expected results:

          Only a few packets (flow setup) should be captured on the VF rep, and bandwidth should be correspondingly higher.

      Additional info:

          

              thaller@redhat.com Thomas Haller
              rhn-support-yingwang Ying Wang
              Zhiqiang Fang Zhiqiang Fang