• Type: Task
    • Resolution: Done
    • Priority: Major
    • Sprint: HAS Sprint 2264

      Task Description (Required)

      https://www.reddit.com/r/LocalLLaMA/comments/18g21af/vllm_vs_llamacpp/

      This thread discusses the differences between vLLM and llama.cpp.

       

      llama.cpp does better when a GPU or VRAM is lacking, but vLLM has better performance because it takes full advantage of the GPU.
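      For context on what adopting vLLM could look like, below is a minimal sketch of its offline Python API; the model name and sampling settings are placeholders, not decisions made in this issue.

      ```python
      # Minimal vLLM usage sketch; model name and sampling settings are placeholders.
      from vllm import LLM, SamplingParams

      llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # placeholder model
      params = SamplingParams(temperature=0.7, max_tokens=256)

      # vLLM batches the prompts and schedules them on the GPU (continuous batching),
      # which is where its throughput advantage over llama.cpp comes from.
      outputs = llm.generate(
          ["Summarize what vLLM is.", "Write a Python hello world."],
          params,
      )
      for out in outputs:
          print(out.outputs[0].text)
      ```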

       

      The current software template uses llama.cpp, but the chatbot and codegen samples generate responses too slowly.

       

      This issue is to investigate whether vLLM would give better performance for the chatbot and codegen samples, and how feasible it would be to adopt it in the AI software template.
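      As a possible first step for the investigation, the same request could be timed against both backends, since the llama.cpp server and vLLM both expose an OpenAI-compatible chat completions endpoint. The URLs, model id, and prompt below are placeholders; this is only a rough sketch, not the final benchmark.

      ```python
      # Rough latency/throughput comparison sketch; endpoints and payload are placeholders.
      import time
      import requests

      ENDPOINTS = {
          "llama.cpp": "http://localhost:8080/v1/chat/completions",  # placeholder URL
          "vLLM": "http://localhost:8000/v1/chat/completions",       # placeholder URL
      }

      payload = {
          "model": "default",  # placeholder model id
          "messages": [{"role": "user",
                        "content": "Write a Python function that reverses a string."}],
          "max_tokens": 256,
      }

      for name, url in ENDPOINTS.items():
          start = time.time()
          resp = requests.post(url, json=payload, timeout=300)
          elapsed = time.time() - start
          usage = resp.json().get("usage", {})
          tokens = usage.get("completion_tokens") or 0
          rate = tokens / elapsed if elapsed > 0 else 0
          print(f"{name}: {elapsed:.1f}s total, ~{rate:.1f} tokens/s")
      ```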

       

      If this requires Change Management, complete the sections below:

      Change Request 

       

      <Select which item is being changed>

       

      [ ]  Add New Tokens

      [ ]  Rotate Tokens

      [ ]  Remove Tokens

      [ ] Others: (specify)

       

        Environment

      <Select which environment the change is being made on.  If both, open a separate issue so changes are tracked in each environment>

       

      [ ]  Stage OR

      [ ]  Prod

       

        Backout Plan

      <State what steps are needed to roll back in case something goes wrong>

       

        Downtime

      <Is there any downtime for these changes?  If so, for how long>

       

        Risk Level

      <How risky is this change?>

       

        Testing

      <How are changes verified?>

       

        Communication

      <How are service owners or consumers notified of these changes?>

              yangcao Stephanie Cao
              eyuen@redhat.com Elson Yuen
              RHIDP - AI
              Votes: 0
              Watchers: 1

                Created:
                Updated:
                Resolved: