-
Feature
-
Resolution: Unresolved
-
Major
100% To Do, 0% In Progress, 0% Done
-
Feature Overview:
This is phase 1 of supporting the Granite 3.1 model with a 128k context window.
Goal:
- Fine-tune the 128k model using the default 8b 4k dataset (this might affect the effective context window size)
- Identify the effective context window to be used as the supported context window
- The expectation is at least a 32k context window, to match the context window of the teacher model.
- The ideal range would be at least 64k.
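One possible way to turn "identify the effective context window" into a concrete check is sketched below. It assumes a long-context evaluation (not specified in this ticket) has produced one score per tested context length, and it treats the effective window as the largest tested length whose score, together with every shorter length's score, stays above a chosen fraction of the shortest-length baseline. The function name, the `min_ratio` threshold, and the rule itself are assumptions for illustration, not an agreed definition.

```python
from typing import Dict, Optional


def effective_context_window(
    scores: Dict[int, float], min_ratio: float = 0.9
) -> Optional[int]:
    """Return the largest tested context length that still holds up.

    `scores` maps a tested context length (in tokens) to an evaluation
    score at that length. The shortest tested length is used as the
    baseline. Both the 0.9 default and this rule are assumptions.
    """
    if not scores:
        return None
    lengths = sorted(scores)
    baseline = scores[lengths[0]]
    effective = None
    for length in lengths:
        # Stop at the first length that drops below the threshold, so a
        # later recovery does not hide a gap in the supported range.
        if scores[length] < min_ratio * baseline:
            break
        effective = length
    return effective


# Placeholder numbers for illustration only; these are not measurements.
example_scores = {4096: 0.95, 32768: 0.93, 65536: 0.90, 131072: 0.60}
print(effective_context_window(example_scores))
```

With these placeholder scores the result would meet the 64k ideal stated above. A stricter or looser `min_ratio` changes the answer, so the threshold would need to be agreed before the number is documented.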
Requirements:
- Fine-tune the Granite 3.1 128k context model (Dec 2024) with the 8b 4k dataset
- Validate and document effective context window
- Identify any deviation in the performance of the final model
- To move to GA, any deviation should be within the margin of error
- Identify optimal batch size for training
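The ticket does not define "margin of error". As one possible reading, the sketch below compares the accuracy of two models on a benchmark and checks whether the difference falls inside an approximate 95% interval for the difference of two proportions. The function name, the choice of statistic, and the `z` value are assumptions. The actual benchmarks and criterion would need to be agreed separately.

```python
import math


def within_margin_of_error(
    acc_a: float, n_a: int, acc_b: float, n_b: int, z: float = 1.96
) -> bool:
    """Check whether two accuracies differ by no more than z standard errors.

    acc_a, acc_b: accuracies in [0, 1] for the two models being compared.
    n_a, n_b: number of benchmark items each accuracy was measured on.
    z: normal critical value (1.96 is roughly a 95% interval); an assumption.
    """
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(acc_a * (1 - acc_a) / n_a + acc_b * (1 - acc_b) / n_b)
    return abs(acc_a - acc_b) <= z * se


# Placeholder values for illustration only; these are not measured results.
print(within_margin_of_error(0.70, 1000, 0.69, 1000))
print(within_margin_of_error(0.70, 1000, 0.60, 1000))
```

This treats the two evaluations as independent samples. If both models are scored on the same items, a paired test would likely be a better fit.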
Done - Acceptance Criteria:
- [ ] InstructLab can fine-tune the 128k model using the 8b 4k dataset
- [ ] Document the effective context window of the resulting model
- [ ] Evaluate and compare the performance of the final model to a 4k fine-tuned model
- [ ] Document and use optimal batch size during training (if required)
Use Cases:
Enterprise use cases that would benefit from a large context window include the following:
- RAG
- Summarization
- Code generation
- Tool use
- Advanced reasoning
Out of Scope:
For phase 1, the creation of the 8b 128k dataset and SDG optimizations for large context windows are out of scope.
Documentation Considerations:
Document the support of large context window limits based on effective context window size.
Questions to Answer:
- Can the fine-tuning of the 128k model be done with the existing batch size for phase 1 and then updated for phase 2, or does it require modification for phase 1?
- Can we default to the 128k model, or are there circumstances in which we should maintain the 4k model?
Background and Strategic Fit:
To support the enterprise use cases required by customers, we need at least a 32k effective context window.
Customer Considerations:
- These changes should be transparent to the user-facing CLI flow
- is cloned by: RHELAI-2670 (phase 2) Productize the 128k context window Granite v3.1 - New