Type: Feature
Resolution: Unresolved
Priority: Normal
Fix Version/s: openshift-4.21
Affects Version/s: None
Component/s: ai-ml-workloads, Node
Labels:

Activity Type:
Product / Portfolio Work
Parent Link:
OCPSTRAT-1692 AI Workloads for OpenShift
Hierarchy Progress Bar:

0% To Do, 100% In Progress, 0% Done
Blocked:
False
Blocked Reason:
None
Ready:
False
Size:
None

Target Version:

openshift-4.21
Release Blocker:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
None
PX Priority Data:
None
PX Impact Score:
PX Technical Impact:
None
PX Impact Range:
None
PX Scheduling Request:
None
PX Technical Impact Notes:
None

Intelligence Requested:
Market:

Background:
In OpenShift, the Dynamic Resource Allocation (DRA) framework manages the allocation of specialized hardware resources such as GPUs and NICs. Today, a DRA request must explicitly specify the exact resource type, which limits flexibility and can lead to suboptimal resource utilization.

Enhancement Summary:
KEP-4816 introduces the ability to specify a prioritized list of acceptable device types within a single resource claim. A pod can therefore request several alternative devices, ordered by preference, and the scheduler attempts to allocate the highest-priority alternative that is available. This makes scheduling more flexible and efficient, especially in heterogeneous hardware environments.
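The fallback behavior described above can be illustrated with a toy model. This is only a sketch of the "first available wins" idea, not the scheduler's actual implementation: the function name, the device type labels, and the inventory dictionary are all hypothetical, and the real scheduler evaluates alternatives per node and against device selectors rather than against a flat count.

```python
from typing import Optional


def pick_first_available(preferences: list[str], free_devices: dict[str, int]) -> Optional[str]:
    """Return the highest-priority device type with at least one free unit.

    `preferences` is ordered from most to least preferred.
    `free_devices` maps a (hypothetical) device type label to its free count.
    Returns None when no alternative can be satisfied.
    """
    for device_type in preferences:
        if free_devices.get(device_type, 0) > 0:
            return device_type
    return None


# Hypothetical inventory: the preferred type is exhausted, so the
# next alternative in the list is chosen.
free = {"a100": 0, "v100": 2, "t4": 5}
choice = pick_first_available(["a100", "v100", "t4"], free)
print(choice)
```

If none of the listed alternatives is free, the function returns `None`, which corresponds to a request that cannot be satisfied yet.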

Use Cases in OpenShift

Use Case	Description
Flexible GPU Allocation	Allow AI/ML workloads to utilize any available GPU type (e.g., NVIDIA A100, V100, T4), prioritizing preferred models but falling back to others if necessary.
Mixed Hardware Environments	Enable workloads to run on nodes with varying hardware capabilities by specifying acceptable alternatives.
Resource Optimization	Improve cluster resource utilization by allowing the scheduler to choose from multiple device types, reducing scheduling failures due to strict resource requirements.
Simplified Deployment Configurations	Reduce the need for multiple deployment manifests targeting specific hardware, streamlining application deployment processes.
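For the Flexible GPU Allocation use case, a claim with ordered alternatives might look roughly like the structure built below. This is a sketch under assumptions: the `firstAvailable` field follows the KEP-4816 design, but the `apiVersion` value should be checked against the cluster's Kubernetes version, and the helper name, claim name, and device class names are hypothetical placeholders rather than real driver-provided classes.

```python
import json


def build_prioritized_claim(claim_name, request_name, device_classes):
    """Build a ResourceClaim-like dict with ordered alternatives.

    `device_classes` is a list of (subrequest_name, device_class_name)
    pairs, ordered from most to least preferred.
    """
    subrequests = [
        {"name": sub_name, "deviceClassName": class_name}
        for sub_name, class_name in device_classes
    ]
    return {
        # Assumption: API group/version; verify for the target release.
        "apiVersion": "resource.k8s.io/v1",
        "kind": "ResourceClaim",
        "metadata": {"name": claim_name},
        "spec": {
            "devices": {
                "requests": [
                    {"name": request_name, "firstAvailable": subrequests}
                ]
            }
        },
    }


# Hypothetical device class names, listed in order of preference.
claim = build_prioritized_claim(
    "training-gpu",
    "gpu",
    [
        ("a100", "a100.gpu.example.com"),
        ("v100", "v100.gpu.example.com"),
        ("t4", "t4.gpu.example.com"),
    ],
)
print(json.dumps(claim, indent=2))
```

Because the alternatives live in one claim, a single manifest can cover several hardware types instead of one manifest per GPU model.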

links to

openshift/api#2498: OCPNODE-3895,OCPNODE-3893,OCPNODE-3779: Enable DRA(DynamicResourceAllocation) featuregate by default

Assignee: Gaurav Singh

Reporter: Gaurav Singh

Need Info From: None

Contributors: Sai Ramesh Vanka

Architect: Mrunal Patel

QA Contact: Aruna Naik

Doc Contact: Matthew Werner

Product Operations Engineering Contact: Derrick Ornelas

Votes: 0

Watchers: 6

Created: 2025/04/14 5:46 PM

Updated: 2025/11/21 7:37 PM
