Type: Story
Resolution: Obsolete
Priority: Normal
Fix Version/s: Jan 13
Affects Version/s: None
Component/s: None
Labels:
- Grooming
- intake-form

Workstream: Inference, RHOAI
Ready: False
Blocked: False
Blocked Reason: None

SFDC Cases Counter:
SFDC Cases Links:
SFDC Cases Open:

Intelligence Requested:
Market:

User Story:
As a PSAP engineer, I want to prepare a POC showing the usefulness of in-place Pod resource resize, which is available behind a feature gate in OCP 4.14.

I'm thinking about this pattern that I observed in the memory usage:

We see that between 20:30 and 20:35, while the model was being loaded into GPU memory, RAM consumption spiked above 10GB and then dropped to roughly 1GB. Given this pattern and the current immutable memory request/limit setting, the memory request must be 11GB; otherwise the model loading might fail.

With in-place Pod resource resize, we can change it to 2GB once the Pod reports as ready to serve requests.
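
As a rough illustration of the resize step, the sketch below builds the body of a patch that lowers a container's memory request once the Pod is ready. It only constructs the JSON document; it does not talk to a cluster. The container name `model-server` and the `2Gi` value are assumptions for the example, and how the patch is submitted (directly against the Pod, or through a dedicated resize subresource) depends on the Kubernetes version and on the feature gate being enabled.

```python
import json
from typing import Optional


def build_memory_resize_patch(container: str, request: str,
                              limit: Optional[str] = None) -> dict:
    """Build a patch body that changes one container's memory resources.

    `container` must match the container name in the Pod spec
    (hypothetical here). `request` and `limit` are Kubernetes quantity
    strings such as "2Gi".
    """
    resources = {"requests": {"memory": request}}
    if limit is not None:
        resources["limits"] = {"memory": limit}
    return {"spec": {"containers": [{"name": container, "resources": resources}]}}


# Assumed values: shrink from the load-time request down to 2Gi.
patch = build_memory_resize_patch("model-server", "2Gi", "2Gi")
print(json.dumps(patch, indent=2))
```

The printed JSON could then be passed to whatever patch mechanism the cluster supports, triggered when the Pod's readiness condition turns true.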

The POC would consist of instantiating multiple models concurrently, showing that without setting request=11GB they fail randomly, and that with in-place resize they no longer fail and make better use of the available resources.
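
To get a feel for the expected gain, here is a back-of-the-envelope sketch, not a measurement. It assumes a hypothetical node with 32GB of allocatable memory, an 11GB request while loading, and a 2GB request once serving, and it counts how many models end up admitted when the scheduler only places a Pod whose load-time request fits.

```python
def models_served(node_gb: int, load_gb: int, steady_gb: int, resize: bool) -> int:
    """Count models admitted on one node, in rounds.

    Each round admits as many new models as fit at their load-time
    request. With `resize`, every admitted model then shrinks its
    request to `steady_gb`; without it, the request stays at `load_gb`.
    """
    if steady_gb <= 0 or load_gb <= 0:
        raise ValueError("memory sizes must be positive")
    used = 0
    served = 0
    while True:
        admitted = (node_gb - used) // load_gb
        if admitted == 0:
            return served
        served += admitted
        used += admitted * (steady_gb if resize else load_gb)


# Hypothetical 32GB node, 11GB while loading, 2GB once ready.
static_count = models_served(32, 11, 2, resize=False)
resized_count = models_served(32, 11, 2, resize=True)
print(static_count, resized_count)
```

Under these assumed numbers the static request admits 2 models while the resized one admits 11, because each model gives back 9GB after loading. Real results would depend on actual node capacity, load-time overlap, and scheduler behavior.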

image-2023-11-15-17-13-52-942.png
208 kB
2023/11/15 4:13 PM

Assignee: Yuchen Fama

Reporter: Kevin Pouget

Votes: 0

Watchers: 2

Created: 2023/11/15 4:19 PM

Updated: 2026/01/30 7:44 PM

Resolved: 2026/01/05 8:42 PM
