Type: Feature
Resolution: Unresolved
Priority: Critical
Fix Version: rhelai-1.5
Feature Overview:
This feature card is part of validating 3rd-party inference models with ilab model serve in the InstructLab component for RHEL AI 1.5.
3rd-party model for this card: Qwen/Qwen2.5-7B-Instruct
Goals:
- Run Qwen 2.5 7B with the ilab model serve command in InstructLab (functional test); see the command sketch after this list
- No errors or warnings arise during serving
- Repeat for all quantized variants of the model (Base, INT4, INT8, FP8)
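A minimal sketch of the serve check, assuming the model has already been downloaded to InstructLab's default cache directory; the path and variant names below are illustrative and may differ across RHEL AI releases:

  # Serve the baseline model in the foreground (vLLM backend) and watch the startup log.
  ilab model serve --model-path ~/.cache/instructlab/models/Qwen/Qwen2.5-7B-Instruct
  # Pass: the server reaches its ready state with no errors or warnings in the log.
  # Repeat the same command against the INT4, INT8, and FP8 variants once downloaded.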
Out of Scope [To be updated post-refinement]:
- Evaluating the performance of the chat
- Evaluating accuracy
- Confirming that vLLM works with this model; this will be verified before this testing happens
Requirements:
- Functional Requirements:
- Ensure the components of the flow below are functional with the 3rd-party model for the inference use case (see the command sketch after this list):
- ilab model download can download the model from Quay or Hugging Face
- ilab model list shows the downloaded model
- ilab model serve can serve the model
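A sketch of the three commands under test, assuming the Hugging Face repository id named on this card; exact download flags (e.g. --repository, --hf-token) may vary by InstructLab version:

  # Download the model from Hugging Face (a quay.io path can be used instead).
  ilab model download --repository Qwen/Qwen2.5-7B-Instruct --hf-token "$HF_TOKEN"
  # Confirm the downloaded model appears in the local inventory.
  ilab model list
  # Serve it for inference on the vLLM backend.
  ilab model serve --model-path ~/.cache/instructlab/models/Qwen/Qwen2.5-7B-Instruct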
Done - Acceptance Criteria:
- All functional requirements are met
  Model                            Quantization Level    Confirmed
  Qwen/Qwen2.5-7B-Instruct         Baseline
  Qwen/Qwen2.5-7B-Instruct INT4    INT4
  Qwen/Qwen2.5-7B-Instruct INT8    INT8
  Qwen/Qwen2.5-7B-Instruct FP8     FP8
- All quantized versions of the model are confirmed to meet the requirements when each has an 'X' in its Confirmed box
Use Cases - i.e. User Experience & Workflow:
- User downloads the 3rd-party model from the Red Hat registry (quay.io) or Hugging Face via the ilab model download command
- User can then view the model with ilab model list
- User can then serve the model for inference on vLLM-ent with ilab model serve (see the verification sketch after this list)
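Once the model is being served, one way to verify inference end to end is to query the OpenAI-compatible API exposed by the vLLM backend; the port and payload below assume ilab defaults and are illustrative:

  # From a second terminal, send a single chat completion request.
  curl -s http://127.0.0.1:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"model": "Qwen/Qwen2.5-7B-Instruct",
           "messages": [{"role": "user", "content": "Say hello."}]}'
  # ilab model chat can also be used as an interactive smoke test.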
Documentation Considerations:
- Update relevant documentation to expose the new 3rd-party model to users (i.e., Chapter 3, Downloading Large Language Models)
Questions to answer:
- Are there any code changes needed to serve a model? Note: the model will already be validated in vllm-ent BEFORE this testing occurs, so this is just ensuring the ilab commands work, NOT that vLLM works with this model.
- Answer: Yes, and it is documented in https://issues.redhat.com/browse/RHELAI-3616
- Will this need to be done manually, or will it be kicked off by the larger E2E pipeline run by Liora's team?
Background & Strategic Fit:
Customers have been asking to leverage the latest and greatest third-party models from Meta, Mistral, Microsoft, Qwen, etc. within Red Hat AI products. As customers continue to adopt and deploy open-source models, the third-party model validation pipeline provides inference performance benchmarking and accuracy evaluations for third-party models, giving customers confidence and predictability when bringing third-party models to InstructLab and vLLM within RHEL AI and RHOAI.
See Red Hat AI Model Validation Strategy Doc
See Red Hat Q1 2025 Third Party Model Validation Presentation
Clones:
- RHELAI-3559 [ilab] Running a third-party Llama 3.3 70B Instruct model as teacher model (+ inference functional testing) in ilab tuning flow (In Progress)
Is cloned by:
- RHELAI-3607 [ilab] Llama 3.1 8B-Instruct ilab model serve (inference functional testing) (Testing)
- RHELAI-3596 [ilab] Mistral-Small 3-24B-Instruct ilab model serve (inference functional testing) (Testing)
- RHELAI-3622 Qwen-2.5 7B-Instruct RHELAI vllm inference flow (Closed)