-
Bug
-
Resolution: Unresolved
-
rhel-9.6
-
rhel-net-drivers-1
Currently, we successfully use the following setup on Marvell DPUs:
- host and DPU run recent rhel-9.6 kernels
- DPU has the following firmware:
  - flash-cn10ka-SDK12.25.01.img (uboot for primary)
  - flash-uefi-cn10ka-12.25.01.img (uefi for secondary)
  Firmware Version: 2025-01-30 22:07:22
  EBF Version: 12.25.01, Branch: /home/fae/PANW-LIO3/SDK122501/work/cn10ka-pcie-ep-release-output/build/marvell-external-fw-SDK12.25.01/firmware/ebf, Built: Thu, 30 Jan 2025 22:06:12 +0000
- we run cp-agent 25.03.0
  - https://github.com/MarvellEmbeddedProcessors/pcie_ep_octeon_target/releases/tag/25.03.0
  - https://github.com/openshift/dpu-operator/blob/44accd0d9ba5cf444a88e7d38825850ba914e220/Dockerfile.mrvlCPAgent.rhel#L23
  - command `octep_cp_agent /usr/bin/cn106xx.cfg -- --dpi_dev 0000:06:00.0 --pem_dev 0001:00:10.0`
We want to update to a newer firmware, so Kiet sent me:
- flash-uefi-cn10ka-SDK12.25.06.img
Firmware Version: 2025-07-30 15:48:38
EBF Version: 12.25.06, Branch: /sdk/SDK12.25.06/cn10ka-vpp-release-output/build/marvell-external-fw-SDK12.25.06/firmware/ebf, Built: Wed, 30 Jul 2025 15:46:57 +0000
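To confirm which firmware a DPU is actually running before and after flashing, the banner lines quoted above can be compared programmatically. The following is only a minimal sketch: it assumes the `EBF Version:` field always has the three-part form seen in this report, and `parse_ebf_version` is a hypothetical helper, not part of any Marvell tool.

```python
import re

# Assumed banner format, based only on the "EBF Version:" lines quoted in this report.
EBF_RE = re.compile(r"EBF Version:\s*(\d+)\.(\d+)\.(\d+)")


def parse_ebf_version(banner: str) -> tuple[int, ...]:
    """Return the EBF version as a tuple of ints so versions compare numerically."""
    match = EBF_RE.search(banner)
    if match is None:
        raise ValueError("no 'EBF Version' field found in banner")
    return tuple(int(part) for part in match.groups())


# The Branch field is left out of these sample strings for brevity.
old_fw = parse_ebf_version("EBF Version: 12.25.01, Built: Thu, 30 Jan 2025 22:06:12 +0000")
new_fw = parse_ebf_version("EBF Version: 12.25.06, Built: Wed, 30 Jul 2025 15:46:57 +0000")

if new_fw > old_fw:
    print("DPU banner reports the newer firmware")
```

Comparing tuples of integers avoids the pitfalls of comparing version strings lexically.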
Using the new firmware also requires a new octep_cp_agent version.
1. With new firmware and cp-agent 25.07.0
When using https://github.com/MarvellEmbeddedProcessors/pcie_ep_octeon_target/releases/tag/25.07.0, `modprobe octeon_ep` fails with:
- modprobe octeon_ep
[15699.824623] octeon_ep: Loading Marvell Octeon EndPoint NIC Driver ...
[15699.824919] octeon_ep 0000:87:00.0: Firmware ready status = 1
[15699.824986] octeon_ep 0000:87:00.0: chip_id = 0xb900
[15699.824988] octeon_ep 0000:87:00.0: Setting up OCTEON CN10KA PF PASS1.0
[15699.824991] octeon_ep 0000:87:00.0: Octeon device using PCIE Port 0
[15699.824994] octeon_ep 0000:87:00.0: SDP_EPF_RINFO[0x209f0]:0x8000800000008
[15699.824997] octeon_ep 0000:87:00.0: SDP_MAC_PF_RING_CTL[0]:0x80001
[15699.824999] octeon_ep 0000:87:00.0: pf_srn=0 rpvf=8 nvfs=8 rppf=8
[15699.825026] Octep ctrl mbox : Init successful.
[15699.825033] octeon_ep 0000:87:00.0: Control plane versions host: 10000, firmware: 10000:10000
[15700.328482] octeon_ep 0000:87:00.0: Failed to get firmware info
[15700.329127] octeon_ep 0000:87:00.0: Cleaning up Octeon Device ...
[15700.329143] Octep ctrl mbox : Uninit successful.
[15700.329145] octeon_ep 0000:87:00.0: CNXKXX: Doing soft reset
[15700.339271] octeon_ep: probe of 0000:87:00.0 failed with error -11
[15700.339345] octeon_ep: Loaded successfully !
In this case, I was running `octep_cp_agent /usr/bin/cn106xx.cfg -- --sdp_rvu_pf 0002:18:00.0,0002:19:00.0 --pem_dev 0001:00:10.0`
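Note that the log for case 1 still ends with `Loaded successfully !` even though the probe failed, so checking for that line alone is not enough to tell whether the device came up. The error value -11 corresponds to -EAGAIN on Linux. A minimal sketch for detecting this failure in captured `dmesg` output is shown below; the helper name `find_probe_failures` is hypothetical, and the regular expression is based only on the log lines quoted in this report.

```python
import re

# Matches the "probe of <BDF> failed with error <N>" line as it appears in the log above.
PROBE_FAIL_RE = re.compile(
    r"octeon_ep: probe of (?P<bdf>[0-9a-fA-F:.]+) failed with error (?P<err>-?\d+)"
)


def find_probe_failures(dmesg_text: str) -> list[tuple[str, int]]:
    """Return (PCI address, error code) for every failed octeon_ep probe."""
    return [
        (m.group("bdf"), int(m.group("err")))
        for m in PROBE_FAIL_RE.finditer(dmesg_text)
    ]


# Excerpt of the log from this report.
sample = """\
[15700.328482] octeon_ep 0000:87:00.0: Failed to get firmware info
[15700.339271] octeon_ep: probe of 0000:87:00.0 failed with error -11
[15700.339345] octeon_ep: Loaded successfully !
"""

for bdf, err in find_probe_failures(sample):
    print(f"probe failed on {bdf} with error {err}")
```

On a live host, the same function could be fed the output of `dmesg` after `modprobe octeon_ep` returns.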
2. With new firmware and cp-agent 25.04.0
With https://github.com/MarvellEmbeddedProcessors/pcie_ep_octeon_target/releases/tag/25.04.0, by contrast, it appears to work at first: `modprobe octeon_ep` connects, and I can manually create VFs via `sriov_numvfs`. However, when running OpenShift with dpu-operator, the host side quickly crashes.
In this case, I was running `octep_cp_agent /usr/bin/cn106xx.cfg -- --sdp_rvu_pf 0002:19:00.0 --pem_dev 0001:00:10.0`
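For reproducing the manual VF creation step mentioned in case 2, the following sketch writes to the standard PCI `sriov_numvfs` sysfs attribute. It is an illustration under assumptions, not the procedure dpu-operator uses: `set_num_vfs` is a hypothetical helper, and the host PF address `0000:87:00.0` in the usage comment is taken from the case 1 log and may differ on other systems.

```python
from pathlib import Path


def set_num_vfs(bdf: str, count: int, sysfs_root: str = "/sys/bus/pci/devices") -> None:
    """Set the number of SR-IOV VFs on the PF at PCI address `bdf`."""
    dev = Path(sysfs_root) / bdf
    total = int((dev / "sriov_totalvfs").read_text())
    if not 0 <= count <= total:
        raise ValueError(f"requested {count} VFs, device supports 0..{total}")

    numvfs = dev / "sriov_numvfs"
    current = int(numvfs.read_text())
    if current == count:
        return  # nothing to do
    if current != 0 and count != 0:
        # The kernel refuses to change one nonzero VF count to another,
        # so disable the existing VFs first.
        numvfs.write_text("0")
    numvfs.write_text(str(count))


# Example (needs root on the host, after `modprobe octeon_ep` succeeded):
# set_num_vfs("0000:87:00.0", 8)
```

The `sysfs_root` parameter exists only so the function can be exercised against a temporary directory instead of the real sysfs tree.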
—
This report covers two cases with different cp-agent versions. The real goal, however, is to successfully use a new firmware with whichever cp-agent version works.