Type: Story
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: Development Platform
Labels: None

Blocked: False
Blocked Reason: None
Ready: False

Follow-up from ~~AIPCC-11499~~ (CPU .cpu tag support). The vllm.py tag matcher in package_plugins currently groups TPU with CUDA (both use accelerator="cuda"). As we start cutting variant-specific tags (e.g. .tpu, .neuron, .spyre), the matcher needs to support per-variant accelerator suffixes so each variant can resolve its own tags.

Current tag suffixes in use in nm-vllm-ent:

.cpu (v0.14.1+rhai1.cpu, v0.13.0+rhai6.cpu)
.cuda (v0.11.2+rhai0.cuda, v0.11.2rc1+rhai0.cuda)
.rocm (v0.11.2rc1+rhai0.rocm)
.neuron (v0.11.0+rhai0.neuron)

Not yet in use but anticipated:

.tpu
.spyre
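
For illustration, the tag shapes listed above can be split into version, rhai build number, and variant suffix with a small regular expression. This is a minimal sketch, not the code in vllm.py: the pattern, the names TAG_RE and split_tag, and the two variant sets are hypothetical and inferred only from the example tags in this issue.

```python
import re

# Hypothetical pattern inferred from the example tags in this issue:
#   v<version>+rhai<build>.<variant>
TAG_RE = re.compile(
    r"^v(?P<version>[0-9][0-9A-Za-z.]*)"
    r"\+rhai(?P<build>\d+)"
    r"\.(?P<variant>[a-z0-9]+)$"
)

# Suffixes taken from the lists above (assumed complete for this sketch).
IN_USE_VARIANTS = {"cpu", "cuda", "rocm", "neuron"}
ANTICIPATED_VARIANTS = {"tpu", "spyre"}


def split_tag(tag):
    """Return (version, build, variant), or None if the tag has another shape."""
    m = TAG_RE.match(tag)
    if m is None:
        return None
    return m.group("version"), int(m.group("build")), m.group("variant")


if __name__ == "__main__":
    for tag in ("v0.14.1+rhai1.cpu", "v0.11.2rc1+rhai0.rocm", "v0.11.2"):
        print(tag, "->", split_tag(tag))
```

Tags without a variant suffix return None here; how the real matcher should treat them (for example as legacy tags) is outside what this issue specifies.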

The _create_vllm_matcher_with_legacy function should be updated to give each variant its own accelerator value so the ADR-114 matcher can resolve variant-specific tags independently.
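
A rough sketch of the intended behavior, assuming each variant maps to an accelerator value equal to its tag suffix. Everything here is hypothetical: VARIANT_ACCELERATORS, SuffixMatcher, create_matcher, and resolve_tags are illustrative names, and SuffixMatcher is a stand-in that only checks the tag suffix. It does not reproduce the ADR-114 matcher or the signature of _create_vllm_matcher_with_legacy.

```python
from dataclasses import dataclass

# Hypothetical variant -> accelerator table. The issue text says TPU is
# currently grouped with CUDA; here every variant gets its own value.
VARIANT_ACCELERATORS = {
    "cpu": "cpu",
    "cuda": "cuda",
    "rocm": "rocm",
    "neuron": "neuron",
    "tpu": "tpu",
    "spyre": "spyre",
}


@dataclass(frozen=True)
class SuffixMatcher:
    """Stand-in matcher (assumption): accepts tags ending in .<accelerator>."""

    accelerator: str

    def matches(self, tag: str) -> bool:
        return tag.endswith(f".{self.accelerator}")


def create_matcher(variant: str) -> SuffixMatcher:
    """Build a matcher for one variant; unknown variants are rejected."""
    try:
        accelerator = VARIANT_ACCELERATORS[variant]
    except KeyError:
        raise ValueError(f"unknown variant: {variant!r}") from None
    return SuffixMatcher(accelerator)


def resolve_tags(variant, tags):
    """Return only the tags that belong to the given variant, in input order."""
    matcher = create_matcher(variant)
    return [t for t in tags if matcher.matches(t)]


if __name__ == "__main__":
    tags = [
        "v0.14.1+rhai1.cpu",
        "v0.11.2+rhai0.cuda",
        "v0.11.2rc1+rhai0.rocm",
        "v0.11.0+rhai0.neuron",
    ]
    print(resolve_tags("cuda", tags))
    print(resolve_tags("tpu", tags))
```

With separate accelerator values, a TPU build no longer resolves .cuda tags; it resolves nothing until a .tpu tag is cut.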

Related to ~~AIPCC-11499~~ and INFERENG-4612.

Assignee: Unassigned

Reporter: Willy Hardy

Team: Antonio's Team

Votes: 0

Watchers: 1

Created: 2026/03/04 5:45 PM

Updated: 2026/03/04 5:45 PM
