Performance and Scale for AI Platforms / PSAP-1433

RHOAI Model Serving performance Q4 2024


    • Type: Epic
    • Resolution: Unresolved
    • Priority: Major
    • Epic Name: RHOAI Model Serving CPT Q4 2024
    • Labels: Inference, RHOAI
    • Progress: 90% To Do, 10% In Progress, 0% Done

      Epic Goal

      Performance testing for RHOAI model serving is an ongoing effort that includes running the CPT, analyzing the results, expanding our test coverage, and iterating on the tooling involved. This epic captures the model serving performance work that doesn't require a full epic of its own for tracking.

      Why is this important?

      • The underlying components are evolving quickly, and there is a constant stream of new capabilities and configurations for us to test.
      • LLM model serving is currently a top priority for OpenShift AI and the company.
      • These workloads are performance-sensitive and require expensive hardware to run effectively. Many customers are interested in leveraging LLMs for their business use cases, but performance and cost efficiency are critical in doing so.
      • We need to catch any potential regressions in the LLM model serving stack in RHOAI as early as possible.

       

              David Gray (dagray@redhat.com)
              Votes: 0
              Watchers: 2
