Type: Feature
Resolution: Unresolved
Priority: Major
Fix Version: rhelai-1.5
Feature Overview:
This Feature card is part of validating 3rd-party inference models with ilab serve in the InstructLab component for RHEL AI 1.5.
3rd-party model for this card: Mistral Small 3 24B-Instruct
Goals:
- Run Mistral Small 3 24B-Instruct with the ilab model serve command in InstructLab (functional test)
- No errors or warnings arise during serving
- Run for all quantized variants of the model (Base, INT4, INT8, FP8)
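The goals above can be sketched as a loop over the quantized variants. This is a minimal sketch, not a confirmed test script: the registry paths and local model directory names are placeholders, and the `ilab model download` / `list` / `serve` subcommands are assumed from the InstructLab CLI.

```shell
#!/bin/sh
# Sketch: exercise the functional test for each quantized variant.
# VARIANTS mirrors the goal list: Base, INT4, INT8, FP8.
VARIANTS="base int4 int8 fp8"

# Guard so the sketch degrades gracefully where ilab is not installed.
if command -v ilab >/dev/null 2>&1; then
    for variant in $VARIANTS; do
        # Placeholder repository path -- substitute the real registry location.
        repo="registry.example.com/mistral-small-3-24b-instruct-${variant}"
        ilab model download --repository "$repo"
        # Confirm the model shows up locally.
        ilab model list
        # Serve it and watch the logs for errors or warnings.
        ilab model serve \
            --model-path "$HOME/.cache/instructlab/models/mistral-small-3-24b-instruct-${variant}"
    done
else
    echo "ilab not installed; skipping functional checks"
fi
```

In practice each `ilab model serve` run would be started, checked for clean startup, and stopped before moving to the next variant.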
Out of Scope [To be updated post-refinement]:
- Evaluating chat performance
- Evaluating model accuracy
- Confirming that vLLM works with this model (this will be confirmed before this testing happens)
Requirements:
- Functional Requirements:
- Ensure the below components of the flow are functional with the 3rd-party model for the inference use case:
- ilab model download can download the model from Quay or Hugging Face
- ilab model list shows the downloaded model
- ilab model serve can serve the model
Done - Acceptance Criteria:
- Ensure all functional requirements are met
| Model | Quantization Level | Confirmed |
| --- | --- | --- |
| Mistral Small 3 24B-Instruct | Baseline | |
| Mistral Small 3 24B-Instruct INT4 | INT4 | |
| Mistral Small 3 24B-Instruct INT8 | INT8 | |
| Mistral Small 3 24B-Instruct FP8 | FP8 | |
Use Cases - i.e. User Experience & Workflow:
- User downloads the 3rd-party model from the Red Hat registry (Quay) or Hugging Face via the ilab model download command
- User can then view the model with ilab model list
- User can then serve the model for inference on vLLM with ilab model serve
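The three-step workflow above can be sketched end to end for a single model. The repository path and local model directory below are placeholders, not confirmed locations, and the CLI options (`--repository`, `--model-path`) are assumed from the InstructLab CLI.

```shell
#!/bin/sh
# Sketch of the user workflow: download, list, serve.
# Placeholder repository path -- substitute the actual registry location.
MODEL_REPO="registry.example.com/rhelai1/mistral-small-3-24b-instruct"

if command -v ilab >/dev/null 2>&1; then
    # Step 1: download the model from the registry or Hugging Face.
    ilab model download --repository "$MODEL_REPO"
    # Step 2: confirm the model appears in the local list.
    ilab model list
    # Step 3: serve the model for inference (vLLM backend on GPU systems).
    ilab model serve \
        --model-path "$HOME/.cache/instructlab/models/mistral-small-3-24b-instruct"
else
    echo "ilab not installed; skipping"
fi
```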
Documentation Considerations:
- Update relevant documentation to expose the new 3rd-party model to users (i.e. Chapter 3, Downloading Large Language Models)
Questions to answer:
Background & Strategic Fit:
Customers have been asking to leverage the latest and greatest third-party models from Meta, Mistral, Microsoft, Qwen, etc. within Red Hat AI products. As customers continue to adopt and deploy open-source models, the third-party model validation pipeline provides inference performance benchmarking and accuracy evaluations for third-party models, giving customers confidence and predictability when bringing third-party models to InstructLab and vLLM within RHEL AI and RHOAI.
See Red Hat AI Model Validation Strategy Doc
See Red Hat Q1 2025 Third Party Model Validation Presentation
Clones:
- RHELAI-3595 [ilab] Qwen-2.5 7B-Instruct ilab model serve (inference functional testing)
Labels:
- Testing