Spike
Resolution: Obsolete
Normal
None
None
Use llm-load-test against vLLM to gather comparative performance data across AWS Neuron and Google TPUs for the Llama 3.1 8B and Granite 3 8B models:
https://docs.vllm.ai/en/latest/getting_started/neuron-installation.html
https://docs.vllm.ai/en/latest/getting_started/tpu-installation.html
This is not a product ask (yet); it is forward-looking work to provide product guidance on which accelerator to prioritize for RHEL AI inference-only use cases.
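Once both runs complete, the two result files need to be reduced to comparable numbers. A minimal sketch of that comparison step, assuming a hypothetical per-request JSON schema (`duration_s`, `output_tokens`, `response_time_s`) — llm-load-test's actual output format may differ:

```python
import json


def summarize(results: dict) -> dict:
    """Aggregate per-request records from one load-test run into summary metrics.

    The field names used here are assumptions for illustration, not the
    tool's real schema: `duration_s` is the wall-clock length of the run,
    and each entry in `requests` carries its generated token count and
    end-to-end response time.
    """
    latencies = sorted(r["response_time_s"] for r in results["requests"])
    total_tokens = sum(r["output_tokens"] for r in results["requests"])
    return {
        # Aggregate generation throughput over the whole run.
        "throughput_tok_per_s": total_tokens / results["duration_s"],
        # Median end-to-end request latency.
        "p50_latency_s": latencies[len(latencies) // 2],
    }


def compare(neuron_path: str, tpu_path: str) -> None:
    """Print side-by-side metrics and the Neuron/TPU ratio for each one."""
    with open(neuron_path) as f:
        neuron = summarize(json.load(f))
    with open(tpu_path) as f:
        tpu = summarize(json.load(f))
    for metric in neuron:
        ratio = neuron[metric] / tpu[metric]
        print(f"{metric}: neuron={neuron[metric]:.2f} "
              f"tpu={tpu[metric]:.2f} ratio={ratio:.2f}")
```

Whatever the real schema turns out to be, reducing each run to throughput and a latency percentile per model/accelerator pair gives the single table the prioritization guidance would be based on.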