-
Bug
-
Resolution: Done-Errata
-
Undefined
-
None
-
4.16
-
None
-
No
-
CNF Network Sprint 251
-
1
-
False
-
Description of problem:
After installing the SR-IOV Network Operator, if a node has a Mellanox SR-IOV NIC, the SriovNetworkNodeState goes to Failed status. The sriov-network-config-daemon logs contain the error:
2024-03-13T12:41:07.158337613Z INFO daemon/daemon.go:546 mellanox plugin OnNodeStateChange()
2024-03-13T12:41:07.158367088Z INFO kernel/kernel.go:582 RunCommand() {"command": "/bin/sh", "args": ["-c", "cat", "/host/sys/kernel/security/lockdown"]}
2024-03-13T12:41:07.164472541Z LEVEL(-2) kernel/kernel.go:582 RunCommand() {"output": "", "error": null}
2024-03-13T12:41:07.164512706Z LEVEL(-2) mellanox/mellanox_plugin.go:82 IsKernelLockdownMode() {"output": "", "error": null}
2024-03-13T12:41:07.164585388Z INFO mellanox/mellanox_plugin.go:156 mellanox-plugin getMlnxNicFwData() {"device": "0000:d8:00.0"}
2024-03-13T12:41:07.164611158Z INFO mellanox/mellanox.go:180 MstConfigReadData() {"device": "0000:d8:00.0"}
2024-03-13T12:41:07.164626797Z INFO mellanox/mellanox.go:78 RunCommand() {"command": "mstconfig", "args": ["-e", "-d", "0000:d8:00.0", "q"]}
2024-03-13T12:41:07.177660098Z LEVEL(-2) mellanox/mellanox.go:78 RunCommand() {"output": "-E- Failed to open the device\n", "error": "exit status 3"}
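One detail in the log may be worth checking: the lockdown `RunCommand()` entry shows `/bin/sh` called with `-c`, `cat`, and the lockdown path as three separate arguments. With POSIX `sh -c`, only the first operand after `-c` is the command string and the next operand becomes `$0`, so `cat` runs without a file argument and reads standard input. That would be consistent with the empty `output` in the following log entries, although this ticket does not confirm it as the root cause. The Python sketch below reproduces the shell behavior against a temporary file. It is an illustration only, not the operator's code; the helper `run_sh` and the sample file content are assumptions.

```python
import os
import shlex
import subprocess
import tempfile


def run_sh(args):
    """Run /bin/sh with the given arguments and an empty stdin; return stdout."""
    result = subprocess.run(
        ["/bin/sh", *args],
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Assumed sample content in the kernel's lockdown format
# (the active mode is the one in square brackets).
with tempfile.NamedTemporaryFile("w", delete=False) as f:
    f.write("none [integrity] confidentiality\n")
    path = f.name

# Same argument shape as in the log: the path becomes $0, cat reads stdin.
split_output = run_sh(["-c", "cat", path])

# Command and path passed as a single command string: cat reads the file.
joined_output = run_sh(["-c", "cat " + shlex.quote(path)])

os.unlink(path)
```

In this sketch `split_output` is empty while `joined_output` holds the file content, so a lockdown check built on the split form would not see the active mode.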
How reproducible:
100%
Steps to Reproduce:
1. Provision a 4.16 cluster with at least one node with Mellanox NICs and SecureBoot enabled
2. Deploy the SR-IOV network operator
3. Run `oc get -n openshift-sriov-network-operator sriovnetworknodestate`
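The check in step 3 can also be scripted. The following is a minimal Python sketch, assuming `oc` is on the PATH and logged in to the cluster, and assuming the sync state is exposed under `.status.syncStatus` (with an optional `.status.lastSyncError`) in the JSON output. The function names and the sample node names `worker-0` and `worker-1` are hypothetical.

```python
import json
import subprocess

NAMESPACE = "openshift-sriov-network-operator"


def fetch_node_states(namespace=NAMESPACE):
    """Return the parsed JSON list of SriovNetworkNodeState objects via `oc`."""
    out = subprocess.run(
        ["oc", "get", "-n", namespace, "sriovnetworknodestate", "-o", "json"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return json.loads(out)


def unsynced_nodes(node_states):
    """Map node name -> (syncStatus, lastSyncError) for nodes not in Succeeded."""
    result = {}
    for item in node_states.get("items", []):
        status = item.get("status", {})
        sync = status.get("syncStatus", "")
        if sync != "Succeeded":
            name = item["metadata"]["name"]
            result[name] = (sync, status.get("lastSyncError", ""))
    return result


# Hand-written sample in the assumed shape; not captured from a real cluster.
sample = {
    "items": [
        {"metadata": {"name": "worker-0"}, "status": {"syncStatus": "Failed"}},
        {"metadata": {"name": "worker-1"}, "status": {"syncStatus": "Succeeded"}},
    ]
}
failed = unsynced_nodes(sample)
```

On a live cluster, `unsynced_nodes(fetch_node_states())` would return an empty mapping once every worker reports Succeeded.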
Actual results:
worker has SYNC STATUS == Failed
Expected results:
all workers have SYNC STATUS == Succeeded
Additional info:
Mellanox NICs need SecureBoot disabled to work correctly, but the operator should raise an error if and only if the user tries to configure the NIC. https://access.redhat.com/solutions/6643261
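The expected behavior described above can be sketched as a small decision function. This is a simplified Python model, not the operator's implementation: `NodeState`, `desired_pci_addresses`, `sync_status`, and `firmware_readable` are hypothetical names introduced only to illustrate the "if and only if" rule.

```python
from dataclasses import dataclass, field


@dataclass
class NodeState:
    """Simplified, hypothetical stand-in for a SriovNetworkNodeState."""

    # PCI addresses of NICs that the user has asked the operator to configure.
    desired_pci_addresses: set = field(default_factory=set)


def sync_status(state, pci_address, firmware_readable):
    """Return the sync status for one NIC under the expected behavior.

    A firmware read failure (for example, with SecureBoot enabled) only
    counts as an error when the user requested configuration of that NIC.
    """
    if firmware_readable:
        return "Succeeded"
    if pci_address in state.desired_pci_addresses:
        return "Failed"
    return "Succeeded"


untouched = NodeState()
configured = NodeState(desired_pci_addresses={"0000:d8:00.0"})

status_untouched = sync_status(untouched, "0000:d8:00.0", firmware_readable=False)
status_configured = sync_status(configured, "0000:d8:00.0", firmware_readable=False)
```

Under this model, a freshly installed operator with no user configuration reports Succeeded even when the firmware cannot be read, and reports Failed only once the user targets that NIC.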
- links to
-
RHEA-2024:0040 OpenShift Container Platform 4.16.z extras update