Type: Epic
Resolution: Done
Priority: Normal
Summary: LLM load testing enhancements
Components: Inference, RHOAI
Progress: 0% To Do, 0% In Progress, 100% Done
Epic Goal
Improve our load-testing tool, llm-load-test, and related automation to keep pace with best practices and the state of the art:
- Use a dataset representative of a wider set of use cases – input/output lengths ranging from 0 to 4096 tokens, with an option to configure the bounds for each test
- Measure time to first token (TTFT) and time per output token (TPOT)
- Develop a load generator that can be used to test models in various runtimes with different interfaces (gRPC, HTTP)
- Use MLCommons LoadGen to drive load
- Potentially integrate MLCommons LoadGen into our load-testing tool
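The TTFT/TPOT metrics above can be computed from the arrival times of streamed tokens. Below is a minimal illustrative sketch, not llm-load-test's actual implementation; the token-iterator interface is an assumption standing in for whatever streaming client the runtime exposes.

```python
import time


def measure_streaming_latency(token_iter):
    """Compute time to first token (TTFT) and average time per output
    token (TPOT) from an iterator that yields tokens as they arrive."""
    start = time.monotonic()
    first_token_time = None
    token_count = 0
    for _ in token_iter:
        now = time.monotonic()
        if first_token_time is None:
            first_token_time = now
        token_count += 1
    end = time.monotonic()
    ttft = first_token_time - start if first_token_time is not None else None
    # TPOT averages the inter-token time after the first token arrives.
    tpot = (end - first_token_time) / (token_count - 1) if token_count > 1 else None
    return ttft, tpot
```

A load generator would run this per request and aggregate percentiles across concurrent streams; TTFT reflects prefill latency, while TPOT reflects decode throughput.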
Why is this important?
- …
Scenarios
- ...
Acceptance Criteria
- CI - MUST be running successfully with tests automated
- Release Technical Enablement - Provide necessary release enablement details and documents.
- ...
Dependencies (internal and external)
- ...
Previous Work (Optional):
- …
Open questions:
- …
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Downstream documentation merged: <link to meaningful PR>