Type: Feature
Resolution: Unresolved
The current distribution of vLLM supports NVIDIA GPUs, Intel Gaudi, and AMD ROCm. It would be great to have a version of vLLM capable of running smaller models on a CPU without a GPU.
The initial strategy is limited to x86 support only.
List of models to validate for the initial support (a validation sketch follows the list):
- TinyLlama-1.1B-Chat-v1.0
- Llama-3.2-1B-Instruct
- granite-3.2-2b-instruct
- TinyLlama-1.1B-Chat-v1.0-pruned2.4
- TinyLlama-1.1B-Chat-v1.0-marlin
- TinyLlama-1.1B-Chat-v0.4-pruned50-quant-ds
- facebook/opt-125m
- Qwen2-0.5B-Instruct-AWQ
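As a rough illustration of what per-model validation could look like, below is a minimal offline-inference sketch using vLLM's Python API on the CPU backend. The Hugging Face repo IDs, the VLLM_CPU_KVCACHE_SPACE value, and the prompt are assumptions for illustration, not the actual validation harness:

    import os
    from vllm import LLM, SamplingParams

    # Assumed CPU-backend setting: GiB reserved for the KV cache.
    os.environ.setdefault("VLLM_CPU_KVCACHE_SPACE", "4")

    # Assumed Hugging Face repo IDs for a few of the models listed above.
    MODELS = [
        "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        "ibm-granite/granite-3.2-2b-instruct",
        "facebook/opt-125m",
    ]

    params = SamplingParams(temperature=0.0, max_tokens=32)
    for name in MODELS:
        llm = LLM(model=name)  # CPU-only build picks up the CPU device
        out = llm.generate(["Say hello in one sentence."], params)
        print(name, "->", out[0].outputs[0].text.strip())

In practice one model per process is safer, since vLLM does not fully release engine resources between LLM instances.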
Model performance evaluation resources / guides (a minimal latency probe is sketched after this list):
- https://developers.redhat.com/articles/2025/06/17/how-run-vllm-cpus-openshift-gpu-free-inference
- vLLM (CPU) Performance Evaluation Guide
- Performance Evaluation Guide For embedding models leveraging vllm bench serve
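Before running the full guides above, a quick single-request latency probe against a locally running server (started with, e.g., vllm serve on one of the models above) can catch obvious problems early. The URL, model name, and prompt below are assumptions for illustration; the endpoint itself is vLLM's standard OpenAI-compatible /v1/completions:

    import time
    import requests

    URL = "http://localhost:8000/v1/completions"  # assumed local server address
    payload = {
        "model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # assumed served model
        "prompt": "Explain AVX-512 in one sentence.",
        "max_tokens": 64,
    }

    start = time.perf_counter()
    resp = requests.post(URL, json=payload, timeout=300)
    elapsed = time.perf_counter() - start
    resp.raise_for_status()

    # vLLM's OpenAI-compatible responses include token usage counts.
    tokens = resp.json()["usage"]["completion_tokens"]
    print(f"{elapsed:.2f}s total, {tokens} tokens, {tokens / elapsed:.1f} tok/s")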
Midstream INFERENG CPU image build:
quay.io/vllm/automation-vllm:cpu-19905651936
In addition to the first delivery in RHAIIS 3.3, which supported AVX2 only, this second delivery should support AVX2, AVX512, and AVX512 AMX in a single build.
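Since one build must cover all three ISA levels, it helps to know which ones a given host actually advertises. A minimal sketch reading the kernel's flag names (avx2, avx512f, amx_tile) from /proc/cpuinfo on Linux; this is illustrative host inspection, not the build's actual dispatch logic:

    def host_isa_flags(path="/proc/cpuinfo"):
        """Report which of the targeted ISA levels the host CPU advertises."""
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    flags = set(line.split(":", 1)[1].split())
                    break
            else:
                return {}
        return {
            "avx2": "avx2" in flags,
            "avx512": "avx512f" in flags,  # AVX-512 Foundation
            "amx": "amx_tile" in flags,    # AMX tile support
        }

    if __name__ == "__main__":
        print(host_isa_flags())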
Relates to:
- AIPCC-7787 Build vllm components and images for CPU-only x86_64 AVX2 systems (Closed)