Type: Feature
Resolution: Unresolved
Priority: Major
Fix Version: rhelai-1.5
Feature Overview:
This feature card is part of validating third-party inference models served with ilab serve through the InstructLab component for RHEL AI 1.5.
Third-party model for this card: Granite 3.1 8B Instruct
Goals:
- Run Granite 3.1 8B Instruct with the ilab serve command in InstructLab (functional test)
- Chat with the served model to confirm it functions
- Confirm no errors or warnings arise
- Run for all quantized variants of the model (Base, INT4, INT8, FP8); a command sketch follows this list
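A minimal sketch of this test loop, assuming the standard ilab CLI flags (--model-path for serve, --model for chat). The local paths under ~/.cache/instructlab/models/ and the variant directory names are illustrative, not the published names of the quantized builds:

    # Smoke-test serve + chat for each quantized variant of Granite 3.1 8B Instruct.
    # Variant directory names below are illustrative placeholders.
    MODELS=~/.cache/instructlab/models
    for variant in granite-3.1-8b-instruct \
                   granite-3.1-8b-instruct-int4 \
                   granite-3.1-8b-instruct-int8 \
                   granite-3.1-8b-instruct-fp8; do
        ilab model serve --model-path "$MODELS/$variant" &   # serve in the background
        serve_pid=$!
        sleep 120                              # allow vLLM to finish loading; watch the serve log for errors/warnings
        ilab model chat --model "$MODELS/$variant"           # interactive smoke test of the served model
        kill "$serve_pid"                      # stop the server before testing the next variant
    done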
Out of Scope [To be updated post-refinement]:
- Evaluating the performance of the chat
- Evaluating Accuracy
- Confirming that vLLM works with this model; this will be verified before this testing happens
Requirements:
- Functional Requirements:
- Ensure the components of the flow below are functional with the third-party model for the inference use case (see the command sketch after this list):
- ilab model download can pull the model from Quay or Hugging Face
- ilab model list shows the downloaded model
- ilab model serve can serve the model
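A hedged sketch of these steps. The repository paths below are illustrative placeholders, not confirmed locations of the validated Granite 3.1 8B Instruct builds:

    # Download from Hugging Face (repository path illustrative)
    ilab model download --repository RedHatAI/granite-3.1-8b-instruct
    # ...or from the Red Hat registry (OCI path illustrative)
    ilab model download --repository docker://registry.redhat.io/rhelai1/granite-3.1-8b-instruct --release latest
    # Confirm the model appears in the local model store
    ilab model list
    # Serve it for inference
    ilab model serve --model-path ~/.cache/instructlab/models/granite-3.1-8b-instruct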
Done - Acceptance Criteria:
- QE ensures all functional requirements are met
Model                          Quantization Level   Confirmed
Granite 3.1 8B Instruct        Baseline
Granite 3.1 8B Instruct INT4   INT4
Granite 3.1 8B Instruct INT8   INT8
Granite 3.1 8B Instruct FP8    FP8
Use Cases - i.e. User Experience & Workflow:
- User downloads the third-party model from the Red Hat registry/Quay or Hugging Face via the ilab model download command
- User can then view the model with ilab model list
- User can then serve the model for inference on vLLM-ent with ilab model serve
Documentation Considerations:
- Update relevant documentation to expose the new third-party model to users (i.e., Chapter 3. Downloading Large Language models)
Questions to answer:
- Are there any code changes needed to serve a model? Note: the model will already be validated on vllm-ent BEFORE this testing occurs, so this card only verifies that the ilab commands work, NOT that vLLM works with this model.
Background & Strategic Fit:
Customers have been asking to leverage the latest and greatest third-party models from Meta, Mistral, Microsoft, Qwen, etc. within Red Hat AI products. As they continue to adopt and deploy open-source models, the third-party model validation pipeline provides inference performance benchmarking and accuracy evaluations for third-party models, giving customers confidence and predictability when bringing third-party models to InstructLab and vLLM within RHEL AI and RHOAI.
See Red Hat AI Model Validation Strategy Doc
See Red Hat Q1 2025 Third Party Model Validation Presentation
Clones:
- RHELAI-3598 [ilab] Phi-4-14B ilab model serve (inference functional testing)
Component: Testing