Feature
Resolution: Unresolved
Product / Portfolio Work
Feature Overview (aka. Goal Summary)
This feature extends Red Hat OpenShift’s SR-IOV networking capabilities to enable and optimize GPUDirect RDMA for AI/ML distributed training and inferencing workloads. By providing SR-IOV support for accelerated networking between GPUs, data-intensive applications can transfer large volumes of data directly between GPUs and NICs with minimal latency, improving performance, scalability, and overall resource efficiency for AI workloads.
The need is to remove this note:

Goals (aka. expected user outcomes)
Provide GA support, documented in a new GPU section in this documentation:
Requirements (aka. Acceptance Criteria):
SR-IOV Network Operator supporting NVIDIA GPUDirect RDMA
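As a sketch of what this requirement implies on the cluster side, the SR-IOV Network Operator would expose RDMA-capable virtual functions through an SriovNetworkNodePolicy. The resource name, NIC selector values, and VF count below are illustrative assumptions, not a definitive configuration:

```yaml
# Hypothetical example: an SriovNetworkNodePolicy exposing RDMA-capable VFs
# for GPUDirect RDMA. resourceName, nicSelector, and numVfs are illustrative
# and must match the actual cluster hardware.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-rdma
  namespace: openshift-sriov-network-operator
spec:
  resourceName: rdma_vfs            # advertised to pods as openshift.io/rdma_vfs
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 8
  deviceType: netdevice             # kernel driver mode, needed for RDMA
  isRdma: true                      # expose the VFs with RDMA capability
  nicSelector:
    vendor: "15b3"                  # Mellanox/NVIDIA vendor ID (example)
    pfNames: ["ens1f0"]             # example physical function name
```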
Use Cases:
This feature enables a clear NVIDIA support statement for:
- NVIDIA GPUDirect RDMA
- Distributed Red Hat OpenShift AI PyTorch training with support for multi-GPU and multi-node configurations.
- NVIDIA GPUDirect Storage
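To illustrate the workload side of these use cases, a training pod would request a GPU together with an SR-IOV RDMA virtual function so data can move directly between the NIC and GPU memory. The network attachment name, image, and resource names here are hypothetical placeholders:

```yaml
# Hypothetical example: a training pod combining a GPU with an SR-IOV RDMA VF.
# The network name, image, and resource names are assumptions for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: pytorch-trainer
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-rdma-net   # SriovNetwork attachment (example name)
spec:
  containers:
  - name: trainer
    image: example.com/pytorch-training:latest    # placeholder image
    resources:
      limits:
        nvidia.com/gpu: "1"                       # GPU from the NVIDIA device plugin
        openshift.io/rdma_vfs: "1"                # VF pool from the SR-IOV node policy
```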
Documentation Considerations
Documentation should be updated with a new section on NVIDIA GPUs.