AI Platform Core Components / AIPCC-11500

builder: vllm.py tag matcher should support per-variant accelerator suffixes (tpu, neuron, spyre)

    • Type: Story
    • Resolution: Unresolved
    • Development Platform

      Follow-up to AIPCC-11499 (support for the .cpu tag). The vllm.py tag matcher in package_plugins currently groups TPU with CUDA: both use accelerator="cuda". As we start cutting variant-specific tags (e.g. .tpu, .neuron, .spyre), the matcher needs to support per-variant accelerator suffixes so that each variant resolves only its own tags.

      Current tag suffixes in use in nm-vllm-ent:

      • .cpu (v0.14.1+rhai1.cpu, v0.13.0+rhai6.cpu)
      • .cuda (v0.11.2+rhai0.cuda, v0.11.2rc1+rhai0.cuda)
      • .rocm (v0.11.2rc1+rhai0.rocm)
      • .neuron (v0.11.0+rhai0.neuron)

      Not yet in use but anticipated:

      • .tpu
      • .spyre
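
      The tag shapes listed above could be split into their parts with a small regular expression. This is only a sketch, assuming every tag follows the pattern v<version>+rhai<N>.<suffix> seen in the examples; TAG_RE, ParsedTag, and parse_tag are hypothetical names and are not taken from package_plugins.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern inferred from the example tags in this issue;
# the real matcher in vllm.py may accept other shapes.
TAG_RE = re.compile(
    r"^v(?P<version>[0-9][0-9A-Za-z.]*)"  # e.g. 0.14.1 or 0.11.2rc1
    r"\+rhai(?P<build>\d+)"               # e.g. +rhai1
    r"\.(?P<suffix>[a-z0-9]+)$"           # e.g. .cpu, .cuda, .neuron
)


class ParsedTag(NamedTuple):
    version: str
    build: int
    suffix: str


def parse_tag(tag: str) -> Optional[ParsedTag]:
    """Return the parts of a tag, or None if it does not fit the assumed pattern."""
    m = TAG_RE.match(tag)
    if m is None:
        return None
    return ParsedTag(m["version"], int(m["build"]), m["suffix"])
```

      Under these assumptions, a tag without an accelerator suffix simply fails to parse, which keeps legacy handling a separate concern.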

      The _create_vllm_matcher_with_legacy function should be updated to give each variant its own accelerator value so the ADR-114 matcher can resolve variant-specific tags independently.
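
      One way to picture the requested change is a per-variant lookup in place of a shared "cuda" value. The sketch below is illustrative only: VARIANT_ACCELERATOR, accelerator_for, and tags_for_variant are hypothetical names, the variant keys are assumed to match the tag suffixes, and neither the real signature of _create_vllm_matcher_with_legacy nor the ADR-114 matcher API appears in this issue.

```python
from typing import Iterable, List

# Hypothetical mapping: each variant gets its own accelerator value,
# which is also used as the tag suffix. Variant names are assumptions.
VARIANT_ACCELERATOR = {
    "cpu": "cpu",
    "cuda": "cuda",
    "rocm": "rocm",
    "neuron": "neuron",
    "tpu": "tpu",      # anticipated; no longer folded into "cuda"
    "spyre": "spyre",  # anticipated
}


def accelerator_for(variant: str) -> str:
    """Look up the accelerator value for a variant, failing loudly if unknown."""
    try:
        return VARIANT_ACCELERATOR[variant]
    except KeyError:
        raise ValueError(f"unknown variant: {variant!r}") from None


def tags_for_variant(tags: Iterable[str], variant: str) -> List[str]:
    """Keep only the tags whose suffix matches the variant's accelerator."""
    suffix = "." + accelerator_for(variant)
    return [t for t in tags if t.endswith(suffix)]
```

      With such a lookup, a TPU build would see no tags until a .tpu tag is cut, instead of silently resolving .cuda tags.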

      Related to AIPCC-11499 and INFERENG-4612.

              Assignee: Unassigned
              Reporter: Willy Hardy (whardy@redhat.com)
              Antonio's Team