Project: Performance and Scale for AI Platforms
Issue: PSAP-1112

Performance and Scale testing for RHOAI releases with KServe stack

    • Type: Epic
    • Resolution: Done
    • Priority: Major
    • RHODS
    • Inference, RHOAI
    • 0% To Do, 0% In Progress, 100% Done


      Epic Goal

      • Performance
        • Run workloads on IBM pre-trained models and the curated Hugging Face open-source models to collect throughput and latency numbers
        • Verify that the performance numbers meet the GA requirements
        • Test GPU sharing techniques, specifically MIG, and produce a best-practices guide on using MIG with watsonx models
      • Scalability
        • Verify that the stack scales as the load increases
        • Test the robustness of the stack at high scale
      • Establish a performance and scale tuning guide for the serving stack
      • Socialize the results
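
As an illustration of the throughput and latency numbers the performance goal refers to, the following is a minimal sketch, not the team's actual tooling. It assumes each completed request is recorded with a start time, an end time, and an output token count; the `RequestSample` layout, the `summarize` helper, and the choice of p50/p95 are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestSample:
    """One completed inference request (hypothetical record layout)."""
    start_s: float       # wall-clock start time, in seconds
    end_s: float         # wall-clock end time, in seconds
    output_tokens: int   # tokens generated for this request


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    """Return throughput and latency figures for one load-test run."""
    if not samples:
        raise ValueError("no samples to summarize")
    latencies = [s.end_s - s.start_s for s in samples]
    # Run duration: first request start to last request end.
    duration = max(s.end_s for s in samples) - min(s.start_s for s in samples)
    if duration <= 0:
        raise ValueError("run duration must be positive")
    return {
        "requests_per_s": len(samples) / duration,
        "tokens_per_s": sum(s.output_tokens for s in samples) / duration,
        "latency_p50_s": percentile(latencies, 50),
        "latency_p95_s": percentile(latencies, 95),
    }
```

Reporting tail latency (p95) next to the median is a common choice for serving workloads, since a mean can hide slow requests under load.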

      Why is this important?

      • Ensure performance and scalability of the model serving stack 

      Scenarios

      1. Model performance on a single GPU
      2. Model performance on multiple GPUs
      3. Model serving stack scalability across multiple GPU nodes

      Acceptance Criteria

      • Test automation in ci-artifacts 
      • Regression analysis in Horreum
      • Published tuning and scale guide including MIG 
      • Blog post(s) for socializing the results
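
The regression-analysis criterion can be pictured with a small threshold check. This is only a sketch of the general idea and does not use or represent Horreum's API; the `find_regressions` function, the metric names, and the 5% default threshold are assumptions chosen for illustration.

```python
from statistics import mean


def find_regressions(baseline_runs, new_run, threshold=0.05, higher_is_better=()):
    """Flag metrics in new_run that are worse than the baseline mean.

    baseline_runs: list of dicts mapping metric name to value (earlier runs)
    new_run: dict mapping metric name to value (the run under test)
    threshold: relative change treated as a regression (assumed 5%)
    higher_is_better: metric names where a drop is the bad direction
    Returns a dict of metric name to relative change for regressed metrics.
    """
    regressions = {}
    for metric, new_value in new_run.items():
        history = [run[metric] for run in baseline_runs if metric in run]
        if not history:
            continue  # no baseline for this metric yet
        base = mean(history)
        if base == 0:
            continue  # relative change is undefined
        change = (new_value - base) / base
        # Normalize so that a positive value always means "got worse".
        worse = -change if metric in higher_is_better else change
        if worse > threshold:
            regressions[metric] = change
    return regressions
```

For example, with a baseline of 100 tokens/s and a new run at 80 tokens/s, calling the function with `higher_is_better=("tokens_per_s",)` reports a relative change of about -0.2 for that metric, while a 2% rise in p95 latency stays under the assumed threshold and is not flagged.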

      Dependencies (internal and external)

      1. Availability of builds with the stack from the IBM and OpenShift AI engineering teams

      Previous Work (Optional):

      1. Ansible Lightspeed performance

      Open questions:

      1. What are the performance requirements?
      2. Which platforms do we need to test: ROSA, ROKS, on-prem?
      3. Which CPUs/GPUs need to be tested?

              dagray@redhat.com David Gray
              akamra8979 Ashish Kamra
              Kevin Pouget