-
Bug
-
Resolution: Cannot Reproduce
-
Normal
-
None
-
4.10.z
-
Moderate
-
No
-
CNF Network Sprint 236, CNF Network Sprint 237, CNF Network Sprint 238
-
3
-
False
-
Description of problem:
IHAC who is getting the error below while creating a Pod with an SR-IOV CNI plug-in additional network configuration.
Warning FailedCreatePodSandBox <invalid> (x52 over 12m) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_ans-hairbox-20230331-193250-120_spk-test-automation_37ade0fe-f740-4b56-bffa-6dd4168973c7_0(afbb203c26379555b9aa383436bef3f3f5c26ddd25a469d51889bc6c4d11f479): error adding pod spk-test-automation_ans-hairbox-20230331-193250-120 to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [spk-test-automation/ans-hairbox-20230331-193250-120/37ade0fe-f740-4b56-bffa-6dd4168973c7:w1-ens5f1-mlx5-netdev-180]: error adding container to network "w1-ens5f1-mlx5-netdev-180": SRIOV-NI failed to load netconf: LoadConf(): failed to detect if VF 0000:b1:05.5 has dpdk driver "lstat /sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/driver: no such file or directory"
Upon checking, I found that the driver entry does not exist under /sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/ and that only the following two directories exist there:
/sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/link:
/sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/power:
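The same check can be scripted for every VF of a PF, which may help show whether 0000:b1:05.5 is the only VF without a bound driver. The following is a minimal sketch, not part of the original report. It relies on the standard sysfs layout, in which a bound PCI device has a `driver` symlink and a PF exposes its VFs as `virtfn*` symlinks. The helper names `vf_driver` and `unbound_vfs` and the `sysfs_root` parameter are hypothetical, and using 0000:b1:00.1 as the parent PF of the failing VF is an assumption.

```python
import os
from pathlib import Path
from typing import List, Optional


def vf_driver(pci_addr: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the name of the kernel driver bound to a PCI device, or None.

    A bound device has a 'driver' symlink in its sysfs directory; the
    reported failure is an lstat on exactly that entry.
    """
    link = Path(sysfs_root) / "bus" / "pci" / "devices" / pci_addr / "driver"
    if not link.is_symlink():
        return None
    return os.path.basename(os.readlink(link))


def unbound_vfs(pf_addr: str, sysfs_root: str = "/sys") -> List[str]:
    """List the PCI addresses of the PF's VFs that have no driver bound."""
    pf_dir = Path(sysfs_root) / "bus" / "pci" / "devices" / pf_addr
    unbound = []
    # Each virtfnN entry is a symlink pointing at the VF's device directory.
    for entry in sorted(pf_dir.glob("virtfn*")):
        vf_addr = os.path.basename(os.readlink(entry))
        if vf_driver(vf_addr, sysfs_root) is None:
            unbound.append(vf_addr)
    return unbound


if __name__ == "__main__":
    # Assumption: 0000:b1:00.1 (ens5f1) is the parent PF of the failing VF.
    print(unbound_vfs("0000:b1:00.1"))
```

Run on the affected worker node, for example from a debug shell, the list shows which VFs are in the same state as the one in the error.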
NIC: The customer is using a CX6 DX NIC in the cluster and reports that the issue is primarily seen with CX6. On a couple of occasions, however, I have also noticed it on another NIC (Intel XXV710 in this case).
[core@worker1 ~]$ lspci -vvv -s 0000:b1:00.1
b1:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
Subsystem: Mellanox Technologies Device 0016
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 19
NUMA node: 1
IOMMU group: 170
Region 0: Memory at dc000000 (64-bit, prefetchable) [size=32M]
Expansion ROM at dbd00000 [disabled] [size=1M]
Capabilities: <access denied>
Kernel driver in use: mlx5_core
Kernel modules: mlx5_core
[core@worker1 ~]$ ethtool -i ens5f1
driver: mlx5_core
version: 5.0-0
firmware-version: 22.29.1016 (MT_0000000359)
expansion-rom-version:
bus-info: 0000:b1:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
Version-Release number of selected component (if applicable):
OCP version: 4.10.39
How reproducible:
Steps to Reproduce:
1. 2. 3.
Actual results:
Getting the below error message: ~~~ error adding container to network "w1-ens5f1-mlx5-netdev-180": SRIOV-NI failed to load netconf: LoadConf(): failed to detect if VF 0000:b1:05.5 has dpdk driver "lstat /sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/driver: no such file or directory" ~~~
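When the event repeats many times, it can help to pull the VF address and the missing sysfs path out of each message and see which VFs are affected. This is a rough sketch that assumes the message wording stays exactly as quoted above. The regular expression and the `parse_vf_error` helper are illustrative only and are not part of the SR-IOV CNI.

```python
import re
from typing import Optional, Tuple

# Matches the tail of the message quoted in this report; the wording is
# assumed to be stable and may differ in other SR-IOV CNI versions.
VF_ERROR_RE = re.compile(
    r'failed to detect if VF (\S+) has dpdk driver '
    r'"lstat (\S+): no such file or directory"'
)


def parse_vf_error(message: str) -> Optional[Tuple[str, str]]:
    """Return (vf_pci_address, missing_sysfs_path), or None if no match."""
    match = VF_ERROR_RE.search(message)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    sample = (
        'SRIOV-NI failed to load netconf: LoadConf(): failed to detect if '
        'VF 0000:b1:05.5 has dpdk driver "lstat '
        '/sys/devices/pci0000:b0/0000:b0:02.0/0000:b1:05.5/driver: '
        'no such file or directory"'
    )
    print(parse_vf_error(sample))
```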
Expected results:
The Pod should be created successfully, without these error messages.
Additional info: