Type: Epic
Resolution: Unresolved
Priority: Major
Fix Version: rhelai-1.5
Summary: [ilab] Phi-4-14B ilab model serve (inference functional testing)
Feature Overview:
This feature card is part of validating third-party inference models with ilab model serve in the InstructLab component for RHEL AI 1.5.
Third-party model for this card: Phi-4 14B
Goals:
- Run Phi-4 14B with the ilab model serve command in InstructLab - functional test
- No errors/warnings arise
- Run for all quantized variants of the model (Base, INT4, INT8, FP8); see the sketch after this list
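A hedged sketch of the variant pass, assuming the artifacts are already downloaded locally; the model directory names are illustrative placeholders, not confirmed artifact names:

    # Serve each quantized variant in turn; stop the server (Ctrl+C) between runs
    # and watch the startup log for errors/warnings. Paths below are assumptions.
    ilab model serve --model-path ~/.cache/instructlab/models/phi-4-14b        # Base
    ilab model serve --model-path ~/.cache/instructlab/models/phi-4-14b-int4   # INT4
    ilab model serve --model-path ~/.cache/instructlab/models/phi-4-14b-int8   # INT8
    ilab model serve --model-path ~/.cache/instructlab/models/phi-4-14b-fp8    # FP8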
Out of Scope [To be updated post-refinement]:
- Evaluating the performance of the chat
- Evaluating accuracy
- Confirming that vLLM works with this model - this will be confirmed before this testing happens
Requirements:
- Functional Requirements:
- Ensure the below components of the flow are functional with the third-party model for the inference use case:
- ilab model download is able to download the model from Quay or Hugging Face
- ilab model list can list the downloaded model
- ilab model serve can serve the model
Done - Acceptance Criteria:
- Ensure all functional requirements are met
Model | Quantization Level | Confirmed
Phi-4 14B | Baseline |
Phi-4 14B INT4 | INT4 |
Phi-4 14B INT8 | INT8 |
Phi-4 14B FP8 | FP8 |
Use Cases - i.e. User Experience & Workflow:
- User downloads the third-party model from the Red Hat registry (Quay) or Hugging Face via the ilab model download command
- User can then view the model with ilab model list
- User can then serve the model for inference on vLLM-ent with ilab model serve; a sketch of the end-to-end flow follows this list
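A minimal sketch of this end-to-end flow, assuming the model is published to the Red Hat registry; the repository path and local model directory are illustrative assumptions, not confirmed locations:

    # Download the model from the Red Hat registry (repository path assumed)
    ilab model download --repository docker://registry.redhat.io/rhelai1/phi-4-14b --release latest

    # Confirm the downloaded model appears in the local model list
    ilab model list

    # Serve the model for inference on the vLLM backend (local path assumed)
    ilab model serve --model-path ~/.cache/instructlab/models/phi-4-14b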
Documentation Considerations:
- Update relevant documentation to expose the new third-party model to users (i.e. Chapter 3. Downloading Large Language Models)
Questions to answer:
- Are there any code changes needed to serve a model? Note: the model will already be validated in vLLM-ent BEFORE this testing occurs, so this is just ensuring the ilab commands work, NOT that vLLM works with this model.
Background & Strategic Fit:
Customers have been asking to leverage the latest and greatest third-party models from Meta, Mistral, Microsoft, Qwen, etc. within Red Hat AI products. As customers continue to adopt and deploy open-source models, the third-party model validation pipeline provides inference performance benchmarking and accuracy evaluations for third-party models, giving customers confidence and predictability when bringing third-party models to InstructLab and vLLM within RHEL AI and RHOAI.
See Red Hat AI Model Validation Strategy Doc
See Red Hat Q1 2025 Third-Party Model Validation Presentation
is blocked by:
- RHELAI-3616 Third-party model(s) support - for the end-to-end workflow and inference (In Progress)
is cloned by:
- RHELAI-3674 [ilab] Granite 3.1 8B Instruct ilab model serve (inference functional testing) (Testing)