Story
Resolution: Unresolved
Follow-up from AIPCC-11499 (CPU .cpu tag support). The vllm.py tag matcher in package_plugins currently groups TPU with CUDA (both use accelerator="cuda"). As we start cutting variant-specific tags (e.g. .tpu, .neuron, .spyre), the matcher needs to support per-variant accelerator suffixes so each variant can resolve its own tags.
Tag suffixes currently in use in nm-vllm-ent:
- .cpu (v0.14.1+rhai1.cpu, v0.13.0+rhai6.cpu)
- .cuda (v0.11.2+rhai0.cuda, v0.11.2rc1+rhai0.cuda)
- .rocm (v0.11.2rc1+rhai0.rocm)
- .neuron (v0.11.0+rhai0.neuron)
Not yet in use but anticipated:
- .tpu
- .spyre
The _create_vllm_matcher_with_legacy function should be updated to give each variant its own accelerator value so the ADR-114 matcher can resolve variant-specific tags independently.
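A minimal sketch of per-variant suffix resolution follows. It is not the package_plugins implementation. The names TAG_RE, VARIANT_ACCELERATOR, tag_accelerator, and tags_for_variant are hypothetical, and the tag pattern is inferred only from the example tags listed above. It illustrates the intended behavior: each variant maps to its own accelerator value, so a TPU lookup no longer matches .cuda tags.

```python
import re
from typing import List, Optional

# Assumed tag shape, inferred from the examples in this ticket:
#   v<major>.<minor>.<patch>[rcN]+rhai<N>.<accelerator>
TAG_RE = re.compile(
    r"^v\d+\.\d+\.\d+(?:rc\d+)?\+rhai\d+\.(?P<accelerator>[a-z0-9]+)$"
)

# Hypothetical mapping: one accelerator value per variant,
# instead of TPU sharing "cuda".
VARIANT_ACCELERATOR = {
    "cpu": "cpu",
    "cuda": "cuda",
    "rocm": "rocm",
    "neuron": "neuron",
    "tpu": "tpu",
    "spyre": "spyre",
}


def tag_accelerator(tag: str) -> Optional[str]:
    """Return the accelerator suffix of a tag, or None if it has none."""
    match = TAG_RE.match(tag)
    return match.group("accelerator") if match else None


def tags_for_variant(variant: str, tags: List[str]) -> List[str]:
    """Keep only the tags whose suffix matches the variant's accelerator."""
    accelerator = VARIANT_ACCELERATOR[variant]
    return [t for t in tags if tag_accelerator(t) == accelerator]


if __name__ == "__main__":
    tags = [
        "v0.14.1+rhai1.cpu",
        "v0.11.2+rhai0.cuda",
        "v0.11.2rc1+rhai0.rocm",
        "v0.11.0+rhai0.neuron",
    ]
    for variant in ("cpu", "cuda", "tpu"):
        print(variant, tags_for_variant(variant, tags))
```

With this mapping, a variant that has no tags yet (such as tpu) resolves to an empty list rather than picking up .cuda tags.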
Related to AIPCC-11499 and INFERENG-4612.