Performance and Scale for AI Platforms / PSAP-1596

Inference performance baseline for AMD GPUs with vLLM

    • Product / Portfolio Work
    • Inference, RHOAI
    • 8
    • PSAP - General-12, PSAP - General-13, PSAP - General-14

      User Story:

      Establish inference performance baselines for the load testing scenarios in llm-load-test. Models under consideration: those currently tested in model serving CPT, plus Granite 7B, Granite 3 8B, and anything else RHOAI QE is testing.
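
      As an illustration of what a per-scenario baseline could record, here is a minimal sketch that reduces one load test run to latency percentiles and output-token throughput. The function name, its inputs, and the choice of metrics are assumptions made for this example; they are not taken from llm-load-test or from this ticket.

```python
from statistics import mean, quantiles


def summarize_baseline(latencies_s, output_tokens, wall_clock_s):
    """Reduce one load-test scenario to a few baseline metrics.

    All inputs are hypothetical: per-request end-to-end latencies in
    seconds, per-request generated token counts, and the total
    wall-clock duration of the run in seconds.
    """
    if len(latencies_s) < 2 or len(latencies_s) != len(output_tokens):
        raise ValueError("need at least two requests with matching token counts")
    if wall_clock_s <= 0:
        raise ValueError("wall_clock_s must be positive")

    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(latencies_s, n=100, method="inclusive")
    return {
        "requests": len(latencies_s),
        "latency_mean_s": mean(latencies_s),
        "latency_p50_s": cuts[49],
        "latency_p95_s": cuts[94],
        "latency_p99_s": cuts[98],
        "throughput_tokens_per_s": sum(output_tokens) / wall_clock_s,
    }


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the summary.
    print(summarize_baseline([1.0, 2.0, 3.0, 4.0, 5.0], [10] * 5, 10.0))
```

      Recording the same summary for each model and scenario would give a like-for-like table for the performance test report; which percentiles and throughput definitions to use remains a decision for the report itself.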

      Acceptance criteria:

      Performance test report
      Analysis
      Any guidance for docs
      Next steps

              ccamacho@redhat.com Carlos Camacho
              David Whyte-Gray, Nikhil Palaskar
              PerfScale PSAP
