Feature
Resolution: Unresolved
Product / Portfolio Work
Feature Overview (aka. Goal Summary)
This feature extends Red Hat OpenShift’s SR-IOV networking capabilities to enable and optimize GPUDirect RDMA for AI/ML distributed training and inferencing workloads. By providing SR-IOV support for accelerated networking between GPUs, data-intensive applications can transfer large volumes of data directly between GPUs and NICs with minimal latency, improving performance, scalability, and overall resource efficiency for AI workloads.
The need is to remove this note:

Goals (aka. expected user outcomes)
Provide GA support, documented in a new GPU section in this documentation:
Requirements (aka. Acceptance Criteria):
SR-IOV Network Operator supporting NVIDIA GPUDirect RDMA
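As a sketch of what this requirement implies on the cluster side, the SR-IOV Network Operator would expose RDMA-capable virtual functions through an SriovNetworkNodePolicy. The resource name, NIC selector values, and VF count below are illustrative assumptions, not a definitive configuration:

```yaml
# Hypothetical example: an SriovNetworkNodePolicy exposing RDMA-capable VFs
# for GPUDirect RDMA. resourceName, nicSelector, and numVfs are illustrative
# and must match the actual cluster hardware.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-rdma
  namespace: openshift-sriov-network-operator
spec:
  resourceName: rdma_vfs            # advertised to pods as openshift.io/rdma_vfs
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 8
  deviceType: netdevice             # kernel driver mode, needed for RDMA
  isRdma: true                      # expose the VFs with RDMA capability
  nicSelector:
    vendor: "15b3"                  # Mellanox/NVIDIA vendor ID (example)
    pfNames: ["ens1f0"]             # example physical function name
```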
Use Cases:
This feature enables a clear NVIDIA support statement for:
- NVIDIA GPUDirect RDMA
- Distributed Red Hat OpenShift AI PyTorch training with support for multi-GPU and multi-node configurations.
- NVIDIA GPUDirect Storage
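To illustrate the workload side of these use cases, a training pod would request a GPU together with an SR-IOV RDMA virtual function so data can move directly between the NIC and GPU memory. The network attachment name, image, and resource names here are hypothetical placeholders:

```yaml
# Hypothetical example: a training pod combining a GPU with an SR-IOV RDMA VF.
# The network name, image, and resource names are assumptions for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: pytorch-trainer
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-rdma-net   # SriovNetwork attachment (example name)
spec:
  containers:
  - name: trainer
    image: example.com/pytorch-training:latest    # placeholder image
    resources:
      limits:
        nvidia.com/gpu: "1"                       # GPU from the NVIDIA device plugin
        openshift.io/rdma_vfs: "1"                # VF pool from the SR-IOV node policy
```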
Documentation Considerations
Documentation should be updated with a new section on NVIDIA GPUs.