-
Bug
-
Resolution: Done-Errata
-
Major
-
None
-
4.13.z
-
None
-
No
-
3
-
NHE Sprint 239, NHE Sprint 240
-
2
-
False
-
Description of problem:
If multiple devices are present when a user attempts to switch a BF2 to NIC mode, bf2-switch-mode terminates with exit code 120 without applying any changes, and logs the misleading error message "Device is not a Bluefield2".
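The following is a minimal sketch of one way a script could cope with several candidate devices. It is not taken from bf2-switch-mode.sh. The helper name `find_bf2_devices`, the assumption that the input has the shape of `lspci -D` output (a PCI address followed by a description), the match on the string "BlueField-2", and the sample addresses are all hypothetical and used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of bf2-switch-mode.sh.
# Reads "PCI_ADDRESS description..." lines on stdin and prints the address
# of every function-0 device whose description mentions BlueField-2.
find_bf2_devices() {
    while read -r addr desc; do
        case "$addr" in
            *.0) ;;            # keep only function 0 of each card
            *) continue ;;     # skip additional functions (.1, .2, ...)
        esac
        case "$desc" in
            *BlueField-2*) printf '%s\n' "$addr" ;;
        esac
    done
}

# Illustrative input only; addresses and descriptions are made up.
sample='0000:01:00.0 Ethernet controller: Example SR-IOV NIC
0000:02:00.0 Ethernet controller: Example BlueField-2 controller
0000:02:00.1 Ethernet controller: Example BlueField-2 controller'

# Check each candidate on its own instead of treating the whole
# multi-line list as a single device.
devices=$(printf '%s\n' "$sample" | find_bf2_devices)
for dev in $devices; do
    echo "Would switch device: $dev"
done
```

The point of the sketch is only that each detected address is evaluated separately, so an unrelated SR-IOV device in the list does not cause the BlueField-2 check to fail.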
Version-Release number of selected component (if applicable):
How reproducible:
Every time
Steps to Reproduce:
1. Create a cluster with multiple SR-IOV capable devices.
2. Attempt to switch the BF2 to NIC mode using /bindata/scripts/bf2-switch-mode.sh
Actual results:
May 31 01:59:05 helix10.lab.eng.tlv2.redhat.com bash[8162]: Found device: 0000:5e:00.0
May 31 01:59:05 helix10.lab.eng.tlv2.redhat.com bash[8162]: 0000:d8:00.0
May 31 01:59:05 helix10.lab.eng.tlv2.redhat.com bash[8162]: Device is not a Bluefield2
May 31 01:59:05 helix10.lab.eng.tlv2.redhat.com bash[8162]: time="2023-05-31T01:59:05Z" level=fatal msg="execing command in container: command terminated with exit code 120"
May 31 01:59:05 helix10.lab.eng.tlv2.redhat.com sudo[7656]: pam_unix(sudo:session): session closed for user root
May 31 01:59:05 helix10.lab.eng.tlv2.redhat.com systemd[1]: Finished Switch BlueField2 card to NIC/DPU mode.
Expected results:
Jun 05 09:44:59 worker-226 bash[7668]: Switching to NIC mode.
Jun 05 09:44:59 worker-226 kubenswrapper[5229]: I0605 09:44:59.303114 5229 kubelet.go:2457] "SyncLoop (PLEG): event for pod" pod="openshift-multus>
Jun 05 09:44:59 worker-226 bash[7668]: Device #1:
Jun 05 09:44:59 worker-226 bash[7668]: ----------
Jun 05 09:44:59 worker-226 bash[7668]: Device type: BlueField2
Jun 05 09:44:59 worker-226 bash[7668]: Name: MBF2H332A-AEEO_Ax_Bx
Jun 05 09:44:59 worker-226 bash[7668]: Description: BlueField-2 P-Series DPU 25GbE Dual-Port SFP56; PCIe Gen4 x8; Crypto Enabled; 16GB on-board DD>
Jun 05 09:44:59 worker-226 bash[7668]: Device: 0000:ca:00.0
Jun 05 09:44:59 worker-226 bash[7668]: Configurations: Next Boot New
Jun 05 09:44:59 worker-226 bash[7668]: INTERNAL_CPU_MODEL EMBEDDED_CPU(1) EMBEDDED_CPU(1)
Jun 05 09:44:59 worker-226 bash[7668]: INTERNAL_CPU_PAGE_SUPPLIER ECPF(0) EXT_HOST_PF(1)
Jun 05 09:44:59 worker-226 bash[7668]: INTERNAL_CPU_ESWITCH_MANAGER ECPF(0) EXT_HOST_PF(1)
Jun 05 09:44:59 worker-226 bash[7668]: INTERNAL_CPU_IB_VPORT0 ECPF(0) EXT_HOST_PF(1)
Jun 05 09:44:59 worker-226 bash[7668]: INTERNAL_CPU_OFFLOAD_ENGINE ENABLED(0) DISABLED(1)
Jun 05 09:44:59 worker-226 bash[7668]: Apply new Configuration? (y/n) [n] : y
Jun 05 09:44:59 worker-226 bash[7668]: Applying... Done!
Jun 05 09:44:59 worker-226 bash[7668]: -I- Please reboot machine to load new configurations.
Jun 05 09:44:59 worker-226 bash[7668]: Minimal reset level for device, 0000:ca:00.0:
Jun 05 09:44:59 worker-226 bash[7668]: 3: Driver restart and PCI reset
Jun 05 09:44:59 worker-226 bash[7668]: Continue with reset?[y/N] y
Jun 05 09:44:59 worker-226 bash[7668]: -I- Sending Reset Command To Fw -Done
Jun 05 09:45:06 worker-226 bash[7668]: -I- Stopping Driver -Done
Jun 05 09:45:07 worker-226 bash[7668]: -I- Resetting PCI -Done
Jun 05 09:45:15 worker-226 bash[7668]: -I- Starting Driver -Done
Jun 05 09:45:15 worker-226 bash[7668]: -I- FW was loaded successfully.
Jun 05 09:45:15 worker-226 bash[7668]: Switched to nic mode.
Additional info:
- clones
-
OCPBUGS-14591 bf2-switch-mode.sh fails if multiple devices are detected
- Closed
- depends on
-
OCPBUGS-14591 bf2-switch-mode.sh fails if multiple devices are detected
- Closed
- is cloned by
-
OCPBUGS-15097 bf2-switch-mode.sh fails if multiple devices are detected
- Closed
- links to
-
RHBA-2023:4730 OpenShift Container Platform 4.13.z bug fix update