Type: Feature Request
Resolution: Unresolved
Priority: Normal
Component: Product / Portfolio Work
Summary: Speed up AI model loading with a pre-download mechanism for OCI volumes
Description: LLM and GenAI workloads are latency-sensitive and frequently involve large models with long initialization times. These workloads often experience spiky traffic patterns and require responsive autoscaling to maintain performance.

Preloading models into the OCI volume enables inference services to start faster when scaling up, reducing the time between a scaling decision and the ability to serve requests. This is especially valuable in environments using KEDA or any other autoscaler, since the infrastructure can respond to load changes with less delay and avoid cold-start bottlenecks. A minimal sketch of the current behavior follows.
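For context, here is a minimal sketch of mounting a model from an OCI artifact as it works today, assuming the Kubernetes OCI volume source (the `ImageVolume` feature gate available in recent releases); the image references and mount path are hypothetical. Without a pre-download mechanism, the kubelet pulls the artifact only when the pod is scheduled onto the node, which is the cold-start delay described above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference
spec:
  containers:
  - name: server
    image: registry.example.com/inference/server:latest   # hypothetical serving image
    volumeMounts:
    - name: model
      mountPath: /models      # model weights appear here once the artifact is pulled
      readOnly: true
  volumes:
  - name: model
    image:                    # OCI volume source (ImageVolume feature gate)
      reference: registry.example.com/models/llama-7b:v1  # hypothetical model artifact
      pullPolicy: IfNotPresent  # reuse an already-pulled copy on the node if present
```

One interim workaround is a low-priority DaemonSet that mounts the same image volume on every node, so the artifact is already in each node's image store before the autoscaler adds inference replicas; the requested feature would make that pre-pull behavior a first-class mechanism rather than a workaround.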