Type: Epic
Priority: Major
Resolution: Unresolved
Epic Name: RHOAI Model Serving CPT Q4 2024
Components: Inference, RHOAI
Status breakdown: 90% To Do, 10% In Progress, 0% Done
Epic Goal
Performance testing for RHOAI model serving is an ongoing effort that includes running the CPT, analyzing the results, expanding our test coverage, and iterating on the tools involved. This epic is meant to capture the model serving performance work that doesn't require a full epic of its own for tracking. For illustration, a sketch of a single measurement pass is shown below.
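As a rough illustration only, the sketch below shows what one latency-measurement pass against an OpenAI-compatible model serving endpoint could look like. The endpoint URL, model name, and payload are placeholders, not the actual CPT tooling, which is normally driven by dedicated load-generation and analysis tools.

```python
"""Hypothetical sketch of a single CPT-style latency pass.

The endpoint, model name, and payload below are assumed placeholders.
"""
import statistics
import time

import requests

ENDPOINT = "http://example-model-route/v1/completions"  # placeholder route
PAYLOAD = {"model": "example-model", "prompt": "Hello", "max_tokens": 64}


def run_pass(num_requests: int = 20) -> None:
    """Send a batch of requests and report latency percentiles."""
    latencies = []
    for _ in range(num_requests):
        start = time.perf_counter()
        resp = requests.post(ENDPOINT, json=PAYLOAD, timeout=120)
        resp.raise_for_status()
        latencies.append(time.perf_counter() - start)

    # quantiles(n=100) yields 99 cut points; index 94 ~ p95, index 98 ~ p99
    cuts = statistics.quantiles(latencies, n=100)
    print(f"p50={statistics.median(latencies):.3f}s "
          f"p95={cuts[94]:.3f}s p99={cuts[98]:.3f}s")


if __name__ == "__main__":
    run_pass()
```

Comparing percentile results like these across releases is the kind of analysis step the epic refers to; the real runs cover more metrics (throughput, time to first token, token latency) and hardware configurations.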
Why is this important?
- The underlying components are quickly evolving and there is a constant stream of new capabilities and configurations for us to test.
- LLM model serving is currently a top priority for OpenShift AI and the company.
- These workloads are performance-sensitive and require expensive hardware to run effectively. Many customers are interested in leveraging LLMs for their business use cases, but performance and cost efficiency are critical to doing so.
- We need to catch any potential regressions in the LLM model serving stack in RHOAI as early as possible.