Type: Feature
Resolution: Unresolved
Priority: Critical
Fix Version: rhelai-1.5
Feature Overview:
This feature card is part of validating 3rd-party inference models with ilab model serve in the InstructLab component for RHEL AI 1.5.
3rd-party model for this card: Qwen/Qwen2.5-7B-Instruct
Goals:
- Run Qwen 2.5 7B with the ilab model serve command in InstructLab (functional test); see the command sketch after this list
- No errors or warnings arise during serving
- Repeat for all quantized variants of the model (Base, INT4, INT8, FP8)
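A minimal sketch of the serve check, assuming the model has already been downloaded to InstructLab's default cache directory; the path and variant names below are illustrative and may differ across RHEL AI releases:

  # Serve the baseline model in the foreground (vLLM backend) and watch the startup log.
  ilab model serve --model-path ~/.cache/instructlab/models/Qwen/Qwen2.5-7B-Instruct
  # Pass: the server reaches its ready state with no errors or warnings in the log.
  # Repeat the same command against the INT4, INT8, and FP8 variants once downloaded.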
Out of Scope [To be updated post-refinement]:
- Evaluating the performance of the chat
- Evaluating accuracy
- Confirming that vLLM works with this model; this will be verified before this testing happens
Requirements:
- Functional Requirements:
- Ensure the components of the flow below are functional with the 3rd-party model for the inference use case (see the command sketch after this list):
- ilab model download can download the model from Quay or Hugging Face
- ilab model list shows the downloaded model
- ilab model serve can serve the model
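A sketch of the three commands under test, assuming the Hugging Face repository id named on this card; exact download flags (e.g. --repository, --hf-token) may vary by InstructLab version:

  # Download the model from Hugging Face (a quay.io path can be used instead).
  ilab model download --repository Qwen/Qwen2.5-7B-Instruct --hf-token "$HF_TOKEN"
  # Confirm the downloaded model appears in the local inventory.
  ilab model list
  # Serve it for inference on the vLLM backend.
  ilab model serve --model-path ~/.cache/instructlab/models/Qwen/Qwen2.5-7B-Instruct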
Done - Acceptance Criteria:
- All functional requirements are met
  Model                            Quantization Level    Confirmed
  Qwen/Qwen2.5-7B-Instruct         Baseline
  Qwen/Qwen2.5-7B-Instruct INT4    INT4
  Qwen/Qwen2.5-7B-Instruct INT8    INT8
  Qwen/Qwen2.5-7B-Instruct FP8     FP8
- All quantized versions of the model are confirmed to meet the requirements when each has an 'X' in its Confirmed box
Use Cases - i.e. User Experience & Workflow:
- User downloads the 3rd-party model from the Red Hat registry (quay.io) or Hugging Face via the ilab model download command
- User can then view the model with ilab model list
- User can then serve the model for inference on vLLM-ent with ilab model serve (see the verification sketch after this list)
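Once the model is being served, one way to verify inference end to end is to query the OpenAI-compatible API exposed by the vLLM backend; the port and payload below assume ilab defaults and are illustrative:

  # From a second terminal, send a single chat completion request.
  curl -s http://127.0.0.1:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"model": "Qwen/Qwen2.5-7B-Instruct",
           "messages": [{"role": "user", "content": "Say hello."}]}'
  # ilab model chat can also be used as an interactive smoke test.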
Documentation Considerations:
- Update relevant documentation to expose the new 3rd-party model to users (i.e., Chapter 3, Downloading Large Language Models)
Questions to answer:
- Are there any code changes needed to serve a model? Note: the model will already be validated in vllm-ent BEFORE this testing occurs, so this is just ensuring the ilab commands work, NOT that vLLM works with this model.
- Answer: Yes, and it is documented in https://issues.redhat.com/browse/RHELAI-3616
- Will this need to be done manually, or will it be kicked off by the larger E2E pipeline run by Liora's team?
Background & Strategic Fit:
Customers have been asking to leverage the latest and greatest third-party models from Meta, Mistral, Microsoft, Qwen, etc. within Red Hat AI products. As customers continue to adopt and deploy open-source models, the third-party model validation pipeline provides inference performance benchmarking and accuracy evaluations for third-party models, giving customers confidence and predictability when bringing third-party models to InstructLab and vLLM within RHEL AI and RHOAI.
See Red Hat AI Model Validation Strategy Doc
See Red Hat Q1 2025 Third Party Model Validation Presentation
Clones:
- RHELAI-3559 [ilab] Running a third-party Llama 3.3 70B Instruct model as teacher model (+ inference functional testing) in ilab tuning flow (In Progress)
Is cloned by:
- RHELAI-3607 [ilab] Llama 3.1 8B-Instruct ilab model serve (inference functional testing) (Testing)
- RHELAI-3596 [ilab] Mistral-Small 3-24B-Instruct ilab model serve (inference functional testing) (Testing)
- RHELAI-3622 Qwen-2.5 7B-Instruct RHELAI vllm inference flow (Closed)