RHELAI-2669: (phase 1) Productize the 128k context window Granite v3.1


      Targeting at least TP for RHEL AI 1.4 is conditional on the availability of the 128k base models.


      Feature Overview:

      This is phase 1 of supporting the Granite 3.1 model with a 128k context window.

      Goal:

      • Fine-tune the 128k model using the default 8b 4k dataset (this might affect the effective context window size)
      • Identify the effective context window to be used as the supported context window (see the probe sketch after this list)
        • The expectation is a context window of at least 32k, to match the context window of the teacher model.
        • Ideally, it would be at least 64k.
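
      One way to measure the effective context window is a needle-in-a-haystack probe: bury a known fact at the midpoint of progressively longer filler contexts and check whether the model still retrieves it. The sketch below is a minimal, hedged example of that idea using Hugging Face transformers; the model id, needle, and filler text are illustrative assumptions, and the id should point at the fine-tuned candidate checkpoint.

```python
# Hedged sketch: needle-in-a-haystack probe to estimate the effective context
# window. Model id, needle, and filler are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ibm-granite/granite-3.1-8b-base"  # assumed id; swap in the fine-tuned candidate

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

NEEDLE = "The secret code word is PERIWINKLE. "
QUESTION = "\nWhat is the secret code word? Answer:"
FILLER = "The sky was clear and nothing of note happened that day. "
FILLER_TOKENS = len(tokenizer(FILLER)["input_ids"])

def recalls_needle(context_tokens: int) -> bool:
    """Bury the needle at the midpoint of ~context_tokens of filler and check recall."""
    n = context_tokens // FILLER_TOKENS
    prompt = FILLER * (n // 2) + NEEDLE + FILLER * (n - n // 2) + QUESTION
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=8, do_sample=False)
    answer = tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return "PERIWINKLE" in answer

for size in (4_096, 8_192, 16_384, 32_768, 65_536, 131_072):
    print(f"{size:>7} tokens: {'recalled' if recalls_needle(size) else 'missed'}")
```

      A production evaluation would also vary the needle depth and average over multiple needles; this sketch only checks the midpoint.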

       

      Requirements:

      • Fine-tune the Granite 3.1 128k context model (Dec 2024) with the 8b 4k dataset
      • Validate and document the effective context window
      • Identify any deviation in the performance of the final model
        • To move to GA, any deviation should be within the margin of error
      • Identify the optimal batch size for training (see the batch-size sketch after this list)
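
      Because the 128k checkpoint changes memory behavior even when training on 4k-token samples, the largest batch size that fits may differ from the 4k model's. A simple way to find a starting point is to back off on CUDA out-of-memory errors, as in the hedged sketch below; the model id, sequence length, and starting batch size are assumptions, and real training would layer distributed settings on top of this.

```python
# Hedged sketch: find the largest per-device batch size at a 4k sequence length
# by halving on CUDA OOM. Model id, sequence length, and start size are assumptions.
import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "ibm-granite/granite-3.1-8b-base"  # assumed; use the 128k checkpoint under test
SEQ_LEN = 4096  # the 8b 4k dataset caps samples at 4k tokens

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16).cuda()
model.gradient_checkpointing_enable()
# The optimizer step is part of the probe: AdamW allocates its state lazily on
# the first step, and that state is a large share of training memory.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def fits(batch_size: int) -> bool:
    """Run one full training step on dummy tokens; report False on CUDA OOM."""
    try:
        ids = torch.randint(0, model.config.vocab_size, (batch_size, SEQ_LEN), device="cuda")
        model(input_ids=ids, labels=ids).loss.backward()
        optimizer.step()
        return True
    except torch.cuda.OutOfMemoryError:
        return False
    finally:
        optimizer.zero_grad(set_to_none=True)
        torch.cuda.empty_cache()

batch_size = 64
while batch_size > 0 and not fits(batch_size):
    batch_size //= 2
print(f"Largest per-device batch size at {SEQ_LEN} tokens: {batch_size}")
```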

       
       
      Done - Acceptance Criteria:

       

      • [ ] InstructLab can fine-tune the 128k model using the 8b 4k dataset
      • [ ] Document the effective context window of the resulting model
      • [ ] Evaluate and compare the performance of the final model to that of a 4k fine-tuned model (see the comparison sketch after this list)
      • [ ] Document and use the optimal batch size during training (if required)
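
      For the performance comparison, running both checkpoints through the same harness keeps the numbers apples-to-apples. The sketch below assumes lm-evaluation-harness; the checkpoint paths are placeholders, and the benchmark choice (MMLU here) is an assumption, since the ticket does not name one.

```python
# Hedged sketch: score both fine-tunes with lm-evaluation-harness so the
# comparison is apples-to-apples. Paths and benchmark are assumptions.
import lm_eval

for name, path in {
    "4k-finetune": "/models/granite-3.1-8b-4k-ft",      # placeholder checkpoint path
    "128k-finetune": "/models/granite-3.1-8b-128k-ft",  # placeholder checkpoint path
}.items():
    out = lm_eval.simple_evaluate(
        model="hf",
        model_args=f"pretrained={path},dtype=bfloat16",
        tasks=["mmlu"],  # assumed benchmark; swap in whatever the GA comparison uses
        batch_size=8,
    )
    print(name, out["results"])
```

      The GA gate from the requirements then reduces to checking that the two scores differ by less than the benchmark's margin of error.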

       

      Use Cases:

      Enterprise use cases that would benefit from a large context window include the following:

      • RAG
      • Summarization
      • Code generation
      • Tool use
      • Advanced reasoning

      Out of Scope:

      For phase 1, the creation of the 8b 128k dataset and SDG optimizations for large context windows are out of scope.

       

      Documentation Considerations:

      Document the supported large context window limit based on the measured effective context window size.

       

      Questions to Answer:

      • Can the fine-tuning of the 128k model be done with the existing batch size for phase 1 and then updated for phase 2, or does it require modification in phase 1?
      • Can we default to the 128k model, or are there circumstances in which we should maintain the 4k model?

       

      Background and Strategic Fit:

      To support the enterprise use cases required by customers, we need at least a 32k effective context window.

       

      Customer Considerations:

      • These changes should be transparent to the user-facing CLI flow
