-
Story
-
Resolution: Done
-
Normal
User Story:
As a performance engineer
I want a general-purpose load testing tool for performance testing large language models and the underlying platform they are deployed on (ModelMesh / KServe / watsonx stack). The tool should be able to query models over either a gRPC or a REST API.
So that I can use it to test many different models with only minor changes to a config file.
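To make the "minor changes to a config file" goal concrete, here is a minimal sketch of what a config-driven load runner could look like. Everything in it is an assumption for illustration: the config field names, the `load_config` and `run_load_test` helpers, and the pluggable `send` callable are hypothetical and do not describe llm-load-test or iter8. The gRPC or REST client is deliberately left out and would be supplied as the `send` function.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from statistics import quantiles
from typing import Callable, Dict


@dataclass
class LoadTestConfig:
    # Hypothetical fields: switching models should only require editing these.
    model_name: str
    protocol: str      # "grpc" or "rest"
    endpoint: str
    concurrency: int   # number of parallel workers
    requests: int      # total number of requests to send
    prompt: str


def load_config(text: str) -> LoadTestConfig:
    """Parse a JSON config string and validate the fields the runner relies on."""
    cfg = LoadTestConfig(**json.loads(text))
    if cfg.protocol not in ("grpc", "rest"):
        raise ValueError(f"unsupported protocol: {cfg.protocol}")
    if cfg.concurrency < 1 or cfg.requests < 2:
        raise ValueError("need concurrency >= 1 and requests >= 2")
    return cfg


def run_load_test(
    cfg: LoadTestConfig, send: Callable[[LoadTestConfig], None]
) -> Dict[str, float]:
    """Call `send` cfg.requests times across cfg.concurrency threads and
    summarize per-request latency in seconds."""

    def timed_call(_: int) -> float:
        start = time.perf_counter()
        send(cfg)  # protocol-specific client goes here (assumed, not shown)
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=cfg.concurrency) as pool:
        latencies = list(pool.map(timed_call, range(cfg.requests)))

    cuts = quantiles(latencies, n=100)  # 99 cut points: cuts[49] is p50
    return {"requests": len(latencies), "p50_s": cuts[49], "p95_s": cuts[94]}


if __name__ == "__main__":
    example = """{
        "model_name": "example-model",
        "protocol": "rest",
        "endpoint": "http://localhost:8080",
        "concurrency": 4,
        "requests": 20,
        "prompt": "hello"
    }"""
    # Stand-in sender that only sleeps; a real one would issue the request.
    print(run_load_test(load_config(example), lambda cfg: time.sleep(0.01)))
```

Keeping the sender pluggable is one way to let the same runner serve both gRPC and REST targets; whether that fits an existing tool would need to be checked against that tool.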
Notes:
- We created github.com/openshift-psap/llm-load-test to test the Ansible Lightspeed model, but it is currently just a short set of scripts that are heavily hardcoded for that model. We should see whether we can leverage an existing tool such as iter8, or adapt llm-load-test to build on top of one.
Acceptance criteria: