  1. Performance and Scale for AI Platforms
  2. PSAP-1434

(vLLM) Test perf impact of chunked prefill and splitwise prefill/decode disaggregation

    • Story
    • Resolution: Obsolete
    • Normal
    • Jan 13

      User Story:
      As a performance engineer, I want to measure the performance benefit of new vLLM features (chunked prefill, and splitwise prefill & decode disaggregation) so that we have data to share with engineering teams and customers seeking guidance.

      If these features prove beneficial, we can enable them in some of our CPT testing configurations to track their impact.

      Acceptance criteria:

              dagray@redhat.com David Whyte-Gray
