Bug | Resolution: Unresolved | Blocker | rhel-9.2.0 | rhel-sst-network-management | ssg_networking
What were you trying to do that didn't work?
Integrate OpenShift 4.15 worker nodes, running on Dell HGX (XE9680) hardware, with DataDirect Networks (DDN) EXAScaler appliances over NVIDIA/Mellanox InfiniBand cards https://catalog.redhat.com/hardware/components/detail/243877.
DDN has its CSI driver certified for OpenShift and bundled as an Operator https://catalog.redhat.com/software/container-stacks/detail/64a84303aa534dcfb968dffa / https://github.com/redhat-openshift-ecosystem/certified-operators/blob/main/operators/exascaler-csi-driver-operator/2.2.6/metadata/annotations.yaml, and they claim it has also been tested with InfiniBand cards on OpenShift.
According to DDN field engineers, for the DDN CSI driver to work with IB and DDN storage, OpenShift worker nodes need 1) at least two IB interfaces, 2) one IP on each interface, and 3) the DDN storage IP to be reachable simultaneously from both IB interfaces.
As the Kubernetes NMState Operator is the default, declarative way to manage secondary interfaces on OpenShift, we are trying to use it to also manage the IB interfaces.
NMState does support IB interfaces https://nmstate.io/devel/yaml_api.html#ip-over-infiniband-interface, but downstream documentation for this use case appears to be missing on both RHEL and OpenShift.
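For reference, the upstream NMState YAML API linked above models an IP-over-InfiniBand interface roughly as in the sketch below; the interface name, pkey and address here are illustrative placeholders, not values from our environment:
interfaces:
- name: ib0                 # placeholder device name
  type: infiniband
  state: up
  infiniband:
    mode: datagram          # or "connected"
    pkey: "0xffff"          # default pkey
  ipv4:
    enabled: true
    dhcp: false
    address:
    - ip: 192.0.2.10        # placeholder address
      prefix-length: 24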
What is the impact of this issue to you?
Protect the investment in existing hardware, which already works on other Linux and Kubernetes distributions https://www.ddn.com/blog/ddn-expands-support-for-nvidia-technology-to-enable-ai-application-acceleration-for-data-center-infrastructure/
Deliver AI/HPC environments for strategic end customers.
Expand the Red Hat ecosystem with specialized hardware for AI/HPC workloads.
Please provide the package NVR for which the bug is seen:
nmstate-2.2.24-1.el9_2.x86_64
How reproducible is this bug?
Under investigation.
Steps to reproduce
- Have specialized hardware in place, like [Dell HGX (XE9680)|https://www.delltechnologies.com/asset/en-in/products/servers/technical-support/poweredge-xe9680-spec-sheet.pdf] and DataDirect Networks (DDN) EXAScaler appliances
- Install the NVIDIA Network Operator https://docs.nvidia.com/networking/display/kubernetes2410/nvidia+network+operator to manage the specialized drivers for the NVIDIA/Mellanox IB interfaces https://catalog.redhat.com/hardware/components/detail/243877
- Install DDN CSI Drivers https://github.com/DDNStorage/exa-csi-driver/tree/master?tab=readme-ov-file#openshift
- Have the IB switches in place and connectivity established between the InfiniBand interfaces and the target DDN storage
- Set static IPs and routing per OpenShift node as listed below:
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: worker06
spec:
  desiredState:
    interfaces:
    - description: ""
      infiniband:
        mode: datagram
        pkey: "0xffff"
      ipv4:
        address:
        - ip: 100.125.3.4
          prefix-length: 16
        dhcp: false
        enabled: true
      ipv6:
        enabled: false
      name: ibp27s0
      state: up
      type: infiniband
    - description: ""
      infiniband:
        mode: datagram
        pkey: "0xffff"
      ipv4:
        address:
        - ip: 100.125.2.4
          prefix-length: 16
        dhcp: false
        enabled: true
      ipv6:
        enabled: false
      name: ibp157s0
      state: up
      type: infiniband
    routes:
      config:
      - destination: 0.0.0.0/0
        metric: 100
        next-hop-address: 100.125.3.4
        next-hop-interface: ibp27s0
        table-id: 100
      - destination: 0.0.0.0/0
        metric: 101
        next-hop-address: 100.125.2.4
        next-hop-interface: ibp157s0
        table-id: 101
    route-rules:
      config:
      - ip-to: 100.125.0.0/16
        ip-from: 100.125.3.4
        priority: 100
        route-table: 100
      - ip-to: 100.125.0.0/16
        ip-from: 100.125.2.4
        priority: 200
        route-table: 101
  nodeSelector:
    kubernetes.io/hostname: worker06.analog.hgx.core42.hpc
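Note that spec.desiredState above is plain nmstate desired state, so the equivalent configuration can also be exercised directly on a RHEL 9 host carrying the nmstate NVR above (for example with nmstatectl apply), outside of OpenShift, to help isolate where the behaviour differs. A reduced single-interface sketch, reusing the values from the policy above purely as an illustration (not a tested configuration):
interfaces:
- name: ibp27s0
  type: infiniband
  state: up
  infiniband:
    mode: datagram
    pkey: "0xffff"
  ipv4:
    enabled: true
    dhcp: false
    address:
    - ip: 100.125.3.4
      prefix-length: 16
  ipv6:
    enabled: false
routes:
  config:
  - destination: 0.0.0.0/0
    metric: 100
    next-hop-address: 100.125.3.4
    next-hop-interface: ibp27s0
    table-id: 100
route-rules:
  config:
  - ip-from: 100.125.3.4
    ip-to: 100.125.0.0/16
    priority: 100
    route-table: 100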
Expected results
Be able to use Red Hat's recommended approach (the Kubernetes NMState Operator) to manage IPoIB interfaces on OpenShift 4.x worker nodes.
Actual results
No clear documentation for this use case.
It is also worth recapping that 1) these InfiniBand cards are certified https://catalog.redhat.com/hardware/components/detail/243877, and 2) the only documented scenario with InfiniBand interfaces on OpenShift that I can find is the one based on the SR-IOV Operator, which is commonly used with specialized hardware for AI and HPC use cases, but in that case no IPoIB is configured directly on the OCP nodes https://docs.nvidia.com/networking/display/public/sol/rdg+for+accelerating+ai+workloads+in+red+hat+ocp+with+nvidia+dgx+a100+servers+and+nvidia+infiniband+fabric#src-99399137_RDGforAcceleratingAIWorkloadsinRedHatOCPwithNVIDIADGXA100ServersandNVIDIAInfiniBandFabric-LogicalDesign
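For contrast, the SR-IOV-based scenario in that reference hands IB virtual functions into pods rather than configuring IPoIB on the node itself; a rough, illustrative sketch of such a policy is shown below (all names and selector values are placeholders, not from our environment):
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: ib-policy-example        # illustrative name
  namespace: openshift-sriov-network-operator
spec:
  resourceName: mlnx_ib          # illustrative resource name
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 8
  nicSelector:
    vendor: "15b3"               # Mellanox vendor ID
  linkType: ib
  isRdma: true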