-
Task
-
Resolution: Done
-
Major
-
None
-
None
-
None
Story (Required)
The AI samples in https://github.com/redhat-ai-dev/ai-lab-samples should be able to support vLLM as a model server.
Background (Required)
After doing some testing with the samples in https://github.com/redhat-ai-dev/ai-lab-samples in OpenShift AI with vLLM, I've found that neither chatbot or codegen currently support vLLM.
The codegen sample doesn't provide an option to select a model, and defaults to a GPT 3.5 model, which wasn't available on the vLLM deployments we've been testing. The chatbot sample doesn't specify any sample either, and it defaults to nil, which seems to trip up vLLM. In both cases, explicitly providing the model that's being used (e.g. Mistral 7b), allows vLLM to work with the app.
The samples should be updated to allow the model to be configured when using vLLM.
I've opened https://github.com/containers/ai-lab-recipes/pull/519 to modify the samples we pull in, by allowing the model to be specified at deployment via an environment variable.
Out of scope
<Defines what is not included in this story>
Approach (Required)
<Description of the general technical path on how to achieve the goal of the story. Include details like json schema, class definitions>
Dependencies
<Describes what this story depends on. Dependent Stories and EPICs should be linked to the story.>
Acceptance Criteria (Required)
<Describe edge cases to consider when implementing the story and defining tests>
<Provides a required and minimum list of acceptance tests for this story. More is expected as the engineer implements this story>
documentation updates (design docs, release notes etc)
demo needed
SOP required
education module update (Filled by DEVHAS team only)
R&D label required (Filled by DEVHAS team only)
Done Checklist
Code is completed, reviewed, documented and checked in
Unit and integration test automation have been delivered and running cleanly in continuous integration/staging/canary environment
Continuous Delivery pipeline(s) is able to proceed with new code included
Customer facing documentation, API docs, design docs etc. are produced/updated, reviewed and published
Acceptance criteria are met
If the Grafana dashboard is updated, ensure the corresponding SOP is updated as well