OpenShift Bugs / OCPBUGS-9328

SNO 4.10 -- E810-XXVDA4 NIC facing issues with DPDK testpmd pods


    • Important
    • None
    • Rejected
    • Unspecified
    • 7/6: telco priority set same as OCPBUGSM-45727 - asking if it should be closed (CG, BR)

      Description of problem:

      The customer deployed RHOCP 4.10 on SNO, configured SR-IOV, enabled hugepages, and configured the Performance Addon Operator. They then created a DPDK pod based on RHEL 8.6 with dpdk-21.11-1.el8.x86_64.
      The DPDK ports are bound to the vfio-pci driver, and running the testpmd command produces the following error:

      [root@dpdk-pod /]# dpdk-testpmd -n 4 -a 0000:10:09.0 -a 0000:10:11.0 --socket-mem=8192 -- -i --burst=64 --txd=4096 --rxd=4096 --mbcache=512 --rxq=1 --txq=1 --nb-cores=2 --nb-ports=2 --port-topology=paired --forward-mode=mac --eth-peer=0,04:f4:bc:72:5E:C0 --eth-peer=1,04:f4:bc:72:5E:C1 -a
      EAL: Detected CPU lcores: 64
      EAL: Detected NUMA nodes: 2
      EAL: Detected shared linkage of DPDK
      EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
      EAL: Selected IOVA mode 'VA'
      EAL: No available 2048 kB hugepages reported
      EAL: VFIO support initialized
      EAL: Using IOMMU type 1 (Type 1)
      EAL: Probe PCI driver: net_iavf (8086:1889) device: 0000:10:09.0 (socket 0)
      iavf_init_vf(): VF is still resetting
      iavf_dev_init(): Init vf failed
      EAL: Releasing PCI mapped resource for 0000:10:09.0
      EAL: Calling pci_unmap_resource for 0000:10:09.0 at 0x2200000000
      EAL: Calling pci_unmap_resource for 0000:10:09.0 at 0x2200020000
      EAL: Requested device 0000:10:09.0 cannot be used
      EAL: Using IOMMU type 1 (Type 1)
      EAL: Probe PCI driver: net_iavf (8086:1889) device: 0000:10:11.0 (socket 0)
      iavf_init_vf(): VF is still resetting
      iavf_dev_init(): Init vf failed
      EAL: Releasing PCI mapped resource for 0000:10:11.0
      EAL: Calling pci_unmap_resource for 0000:10:11.0 at 0x2200024000
      EAL: Calling pci_unmap_resource for 0000:10:11.0 at 0x2200044000
      EAL: Requested device 0000:10:11.0 cannot be used
      EAL: Bus (pci) probe failed.
      TELEMETRY: No legacy callbacks, legacy socket not created
      testpmd: No probed ethernet devices
      Interactive-mode selected
      Fail: input rxq (1) can't be greater than max_rx_queues (0) of port 0
      EAL: Error - exiting with code: 1
      Cause: rxq 1 invalid - must be >= 0 && <= 0
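
In the EAL output above, each VF that could not be initialized is reported on a "Requested device ... cannot be used" line. As a minimal sketch, assuming the testpmd output has been saved to a file or is piped in, the affected PCI addresses can be pulled out as follows. The function name `failed_eal_devices` is hypothetical and is not part of DPDK.

```shell
# Hypothetical helper: print the PCI address of every device that EAL
# reported as unusable. Reads the named files, or stdin if none are given.
failed_eal_devices() {
    sed -n 's/.*EAL: Requested device \([0-9a-fA-F:.]*\) cannot be used.*/\1/p' "$@"
}
# Usage (the log file name here is an assumption):
#   failed_eal_devices testpmd.log
```

For the run shown in this report, both 0000:10:09.0 and 0000:10:11.0 would be listed, which matches the later "No probed ethernet devices" message.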

      We suspect that sriov-cni or the sriov-network-operator config daemon is the most likely culprit for the VF MAC being set to zero:

      +++++++++++++++++++++++++++++
      [imm@supportshell-1 03232288]$ cat 0090-dmesg.log | grep -i "enabling device"
      [ 858.244315] iavf 0000:10:09.0: enabling device (0000 -> 0002)
      [ 863.282517] iavf 0000:10:11.0: enabling device (0000 -> 0002)
      [ 2264.288978] vfio-pci 0000:10:09.0: enabling device (0000 -> 0002)
      [ 2266.146441] vfio-pci 0000:10:11.0: enabling device (0000 -> 0002)
      [supportshell-1.sush-001.prod.us-west-2.aws.redhat.com] [06:18:09+0000]
      [imm@supportshell-1 03232288]$ cat 0090-dmesg.log | grep -i "Invalid MAC"
      [ 859.367993] iavf 0000:10:09.0: Invalid MAC address 00:00:00:00:00:00, using random <------
      [ 865.511858] iavf 0000:10:11.0: Invalid MAC address 00:00:00:00:00:00, using random <------
      [supportshell-1.sush-001.prod.us-west-2.aws.redhat.com] [06:18:42+0000]

      [imm@supportshell-1 03232288]$ cat 0090-dmesg.log | grep -A8 "Device is still in reset"
      [ 858.293697] iavf 0000:10:09.0: Device is still in reset (-16), retrying
      [ 859.367993] iavf 0000:10:09.0: Invalid MAC address 00:00:00:00:00:00, using random
      [ 860.150581] iavf 0000:10:09.0: Multiqueue Enabled: Queue pair count = 16
      [ 860.150964] iavf 0000:10:09.0: MAC address: fe:a3:8a:e3:87:38
      [ 860.150964] iavf 0000:10:09.0: GRO is enabled
      [ 860.153474] iavf 0000:10:09.0 ens1f1v0: renamed from eth0
      [ 861.151261] ice 0000:10:00.1 ens1f1: Setting MAC fe:a3:8a:e3:87:38 on VF 0. VF driver will be reinitialized
      [ 861.161115] iavf 0000:10:09.0: Reset warning received from the PF
      [ 861.167271] iavf 0000:10:09.0: Scheduling reset task

      [ 863.332396] iavf 0000:10:11.0: Device is still in reset (-16), retrying
      [ 863.517457] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
      [ 863.881948] IPv6: ADDRCONF(NETDEV_UP): vethccf36940: link is not ready
      [ 863.888591] IPv6: ADDRCONF(NETDEV_CHANGE): vethccf36940: link becomes ready
      [ 863.895750] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
      [ 863.952556] c31e919454b4977: renamed from vethccf36940
      [ 864.261658] device c31e919454b4977 entered promiscuous mode
      [ 864.405362] iavf 0000:10:11.0: Device is still in reset (-16), retrying
      [ 865.511858] iavf 0000:10:11.0: Invalid MAC address 00:00:00:00:00:00, using random
      [ 866.403872] iavf 0000:10:11.0: Multiqueue Enabled: Queue pair count = 16
      [ 866.411632] iavf 0000:10:11.0: MAC address: 22:c9:01:eb:06:18
      [ 866.417433] iavf 0000:10:11.0: GRO is enabled
      [ 866.432134] iavf 0000:10:11.0 ens1f2v0: renamed from eth0
      [ 867.404508] ice 0000:10:00.2 ens1f2: Setting MAC 22:c9:01:eb:06:18 on VF 0. VF driver will be reinitialized
      [ 867.414386] iavf 0000:10:11.0: Reset warning received from the PF
      [ 867.420544] iavf 0000:10:11.0: Scheduling reset task
      [supportshell-1.sush-001.prod.us-west-2.aws.redhat.com] [06:27:37+0000]

      [imm@supportshell-1 03232288]$ cat 0090-dmesg.log | grep -i "error"
      [ 7.237590] ERST: Error Record Serialization Table (ERST) support is initialized.
      [ 8.562726] i8042: probe of i8042 failed with error -5
      [supportshell-1.sush-001.prod.us-west-2.aws.redhat.com] [05:27:36+0000]
      [imm@supportshell-1 03232288]$ cat 0090-dmesg.log | grep -i "fail"
      [ 8.562726] i8042: probe of i8042 failed with error -5
      [ 54.123443] ice 0000:c1:00.0: Query Port ETS failed
      [ 54.137593] ice 0000:c1:00.0: Failed to set local DCB config -100
      [ 54.152997] ice 0000:c1:00.0: DCB init failed
      [ 62.011983] ice 0000:c1:00.1: Query Port ETS failed
      [ 62.025807] ice 0000:c1:00.1: Failed to set local DCB config -100
      [ 62.040916] ice 0000:c1:00.1: DCB init failed
      [ 69.936658] ice 0000:c1:00.2: Query Port ETS failed
      [ 69.951153] ice 0000:c1:00.2: Failed to set local DCB config -100
      [ 69.966910] ice 0000:c1:00.2: DCB init failed
      [ 77.878915] ice 0000:c1:00.3: Query Port ETS failed
      [ 77.892150] ice 0000:c1:00.3: Failed to set local DCB config -100
      [ 77.906496] ice 0000:c1:00.3: DCB init failed
      [ 87.955084] Failed to associated timeout policy `ovs_test_tp'
      [ 863.096331] ice 0000:10:00.1: VF 0 failed opcode 7, retval: -5
      [supportshell-1.sush-001.prod.us-west-2.aws.redhat.com] [05:29:49+0000]
      [imm@supportshell-1 03232288]$ cat 0090-dmesg.log | grep -i "warn"
      [ 78.631927] iscsid[1368]: iscsid: Warning: InitiatorName file /etc/iscsi/initiatorname.iscsi does not exist or does not contain a properly formatted InitiatorName. If using software iscsi (iscsi_tcp or ib_iser) or partial offload (bnx2i or cxgbi iscsi), you may not be able to log into or discover targets. Please create a file /etc/iscsi/initiatorname.iscsi that contains a sting with the format: InitiatorName=iqn.yyyy-mm.<reversed domain name>[:identifier].
      [ 861.161115] iavf 0000:10:09.0: Reset warning received from the PF <-----
      [ 867.414386] iavf 0000:10:11.0: Reset warning received from the PF <-----
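
The separate greps above can be combined into a single view that keeps log order, which makes the sequence easier to follow: the VF is still in reset, iavf sees a zero MAC and picks a random one, the PF driver (ice) is then asked to set that MAC on the VF, and the VF is reset again. This is a minimal sketch; the function name `vf_reset_timeline` is hypothetical, and the patterns are taken only from messages that already appear in this log.

```shell
# Hypothetical helper: keep only the dmesg lines that describe the
# VF reset / MAC assignment sequence, in their original order.
# Reads the named files, or stdin if none are given.
vf_reset_timeline() {
    grep -E 'Device is still in reset|Invalid MAC address|iavf .*MAC address:|Setting MAC .* on VF|Reset warning received from the PF|Scheduling reset task' "$@"
}
# Usage (file name as used in this report):
#   vf_reset_timeline 0090-dmesg.log
```

The PF-side "Setting MAC ... on VF" lines are logged under the PF's PCI address (for example 0000:10:00.1), so the sketch deliberately does not filter by VF address.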

      Version-Release number of selected component (if applicable):
      DPDK on SNO v4.10

      How reproducible:

      Steps to Reproduce:
      1.
      2.
      3.

      Actual results:
      The customer is seeing these errors:
      iavf_init_vf(): VF is still resetting
      iavf_dev_init(): Init vf failed

      Expected results:
      The DPDK pod should utilize the VF.

      Additional info:
      We suspect this could be a bug in the NIC drivers (ice/iavf), since support for this NIC was only recently introduced in CoreOS.

              sscheink@redhat.com Sebastian Scheinkman
              rhn-support-adubey Akash Dubey
              Zhanqi Zhao Zhanqi Zhao
              Red Hat Employee