  1. OpenShift Bugs
  2. OCPBUGS-13574

Error while creating Pod with SR-IOV CNI plug-in additional network configuration


    • Moderate
    • No
    • CNF Network Sprint 236, CNF Network Sprint 237, CNF Network Sprint 238
    • 3
    • False

      Description of problem:

      I have a customer who is getting the error messages below while creating a Pod with an SR-IOV CNI plug-in additional network configuration.

      Warning FailedCreatePodSandBox <invalid> (x52 over 12m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_ans-hairbox-20230331-193250-120_spk-test-automation_37ade0fe-f740-4b56-bffa-6dd4168973c7_0(afbb203c26379555b9aa383436bef3f3f5c26ddd25a469d51889bc6c4d11f479): error adding pod spk-test-automation_ans-hairbox-20230331-193250-120 to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [spk-test-automation/ans-hairbox-20230331-193250-120/37ade0fe-f740-4b56-bffa-6dd4168973c7:w1-ens5f1-mlx5-netdev-180]: error adding container to network "w1-ens5f1-mlx5-netdev-180": SRIOV-NI failed to load netconf: LoadConf(): failed to detect if VF 0000:b1:05.5 has dpdk driver "lstat /sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/driver: no such file or directory"
      

      Upon checking, I found that the driver entry does not exist under the path /sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/ and that only the following two directories exist there:

      /sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/link:

      /sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/power:
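The same check can be repeated on the node for any VF. The following is a minimal sketch, not the plug-in's own logic: it assumes the standard sysfs layout, where a bound PCI device exposes a `driver` symlink under /sys/bus/pci/devices/<address>/. The function name `vf_driver` and the `SYSFS_PCI_ROOT` override are hypothetical and exist only for illustration and testing.

```shell
# Print the kernel driver bound to a PCI device, or "unbound" if the
# driver symlink is missing (the condition reported in this bug).
# SYSFS_PCI_ROOT is a hypothetical override, not a real sysfs setting.
vf_driver() {
    local addr="$1"
    local root="${SYSFS_PCI_ROOT:-/sys/bus/pci/devices}"
    local link="$root/$addr/driver"
    if [ -L "$link" ]; then
        # The last path component of the symlink target is the driver name.
        basename "$(readlink "$link")"
    else
        echo "unbound"
        return 1
    fi
}
```

On the affected worker, `vf_driver 0000:b1:05.5` would be expected to print `unbound` while the problem is present, since the lstat on the same entry fails.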

      NIC: The customer is using a CX6 DX NIC in the cluster. They also mentioned that the issue is seen primarily with CX6, although on a couple of occasions I have noticed it on another NIC as well (Intel XXV710 in this case).
      [core@worker1 ~]$ lspci -vvv -s 0000:b1:00.1
      b1:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
      Subsystem: Mellanox Technologies Device 0016
      Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
      Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
      Latency: 0
      Interrupt: pin B routed to IRQ 19
      NUMA node: 1
      IOMMU group: 170
      Region 0: Memory at dc000000 (64-bit, prefetchable) [size=32M]
      Expansion ROM at dbd00000 [disabled] [size=1M]
      Capabilities: <access denied>
      Kernel driver in use: mlx5_core
      Kernel modules: mlx5_core

      [core@worker1 ~]$ ethtool -i ens5f1
      driver: mlx5_core
      version: 5.0-0
      firmware-version: 22.29.1016 (MT_0000000359)
      expansion-rom-version:
      bus-info: 0000:b1:00.1
      supports-statistics: yes
      supports-test: yes
      supports-eeprom-access: no
      supports-register-dump: no
      supports-priv-flags: yes
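
To see whether other VFs of the same PF are affected, the driver binding of every VF can be listed in one pass. This is a rough sketch under the assumption that the PF exposes its VFs as `virtfnN` symlinks under /sys/class/net/<pf>/device/, as SR-IOV capable devices normally do. The function name `list_vf_drivers` and the `SYSFS_NET_ROOT` override are hypothetical.

```shell
# List each VF of a PF as "<virtfnN> <PCI address> <driver or unbound>".
# SYSFS_NET_ROOT is a hypothetical override used only for testing.
list_vf_drivers() {
    local pf="$1"
    local root="${SYSFS_NET_ROOT:-/sys/class/net}"
    local vf addr drv
    for vf in "$root/$pf"/device/virtfn*; do
        # Skip the literal pattern when the PF has no VFs configured.
        [ -e "$vf" ] || continue
        # Each virtfnN is a symlink to the VF's PCI device directory.
        addr=$(basename "$(readlink -f "$vf")")
        if [ -L "$vf/driver" ]; then
            drv=$(basename "$(readlink "$vf/driver")")
        else
            drv="unbound"
        fi
        echo "${vf##*/} $addr $drv"
    done
}
```

For the interface shown above, this would be invoked as `list_vf_drivers ens5f1`; any line ending in `unbound` points at a VF that would hit the same lstat failure.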
      Version-Release number of selected component (if applicable):

      OCP version: 4.10.39

      How reproducible:

       

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

      Getting the error message below:
      ~~~
      error adding container to network "w1-ens5f1-mlx5-netdev-180": SRIOV-NI failed to load netconf: LoadConf(): failed to detect if VF 0000:b1:05.5 has dpdk driver "lstat /sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/driver: no such file or directory"
      ~~~
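
When many such events are logged, it can help to pull out just the VF address from each message. The sketch below is an illustration only: it assumes the message keeps the exact wording shown above ("failed to detect if VF ... has dpdk driver"), and the function name `extract_vf_addr` is hypothetical.

```shell
# Read event or log text on stdin and print the PCI address of each VF
# named in a "failed to detect if VF <addr> has dpdk driver" message.
extract_vf_addr() {
    sed -n 's/.*failed to detect if VF \([0-9a-fA-F:.]*\) has dpdk driver.*/\1/p'
}
```

For example, piping the output of `oc get events -n spk-test-automation` through `extract_vf_addr | sort -u` would give the distinct set of affected VFs, assuming the events are still retained.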

      Expected results:

      The Pod should be created without such error messages.

      Additional info:

       

            apanatto@redhat.com Andrea Panattoni
            rhn-support-mmarkand Mridul Markandey
            Zhanqi Zhao Zhanqi Zhao
            Sebastian Scheinkman