Feature
Resolution: Unresolved
Feature Overview
This work aims to extend the RAG evaluation framework to work with a local LLM.
The local LLM can be an existing RHEL AI model, such as Mistral-7b, Mixtral, or Granite-3.0. Alternatively, we could fine-tune a model specifically for evaluating RAG metrics.
This improvement will enable RAG evaluation in air-gapped environments and remove the dependency on external API services.
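One way to decouple the framework from external API services is to target the OpenAI-compatible chat-completions API shape, which local runtimes such as vLLM and the llama.cpp server also expose; switching from a hosted judge to a local one then becomes a configuration change. The sketch below is illustrative only: the config field names, the localhost URL, and the model identifier are assumptions, not the framework's actual API.

```python
from dataclasses import dataclass
from typing import Callable

# A "judge" is any callable mapping a prompt to the model's text reply.
# This indirection lets the same evaluation code run against a hosted API
# or a local server (all names below are illustrative).
JudgeFn = Callable[[str], str]

@dataclass
class JudgeConfig:
    base_url: str          # e.g. "http://localhost:8000/v1" for a local vLLM server
    model: str             # e.g. a Granite or Mistral model identifier
    api_key: str = "none"  # local servers typically ignore the key

def openai_compatible_judge(cfg: JudgeConfig) -> JudgeFn:
    """Build a judge against any OpenAI-compatible /chat/completions endpoint.

    Hosted services and local runtimes (vLLM, llama.cpp server) both expose
    this API shape, so hosted -> local is a config change only.
    """
    import json
    import urllib.request

    def judge(prompt: str) -> str:
        payload = json.dumps({
            "model": cfg.model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode()
        req = urllib.request.Request(
            f"{cfg.base_url}/chat/completions",
            data=payload,
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {cfg.api_key}",
            },
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        return body["choices"][0]["message"]["content"]

    return judge
```

Because the judge is just a callable, unit tests can substitute a stub without any server running.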
Goals
- Improve the usability of RAG evaluations by using a local LLM or SLM.
- Reduce the cost and improve the efficiency of RAG metric evaluations by utilizing a local model.
- Ensure compliance with Apache 2.0 licensing for any fine-tuned models.
Requirements
- Successful integration of a local LLM or SLM into the RAG evaluation framework.
- Compliance with Apache 2.0 licensing for any fine-tuned models.
- Improved or maintained equivalent accuracy and efficiency in evaluating RAG metrics compared to external API services.
Background
The current evaluation framework relies on external APIs for language model evaluations, which can be costly and may not always provide the desired level of accuracy for specialized evaluations. By integrating a local LLM or SLM, we can improve the accuracy and efficiency of our RAG evaluations while also reducing costs for our users.
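To make the evaluation flow concrete, the sketch below shows a minimal LLM-judged faithfulness metric: the judge model is asked whether the answer is supported by the retrieved contexts. The prompt wording and the binary YES/NO scoring are simplifying assumptions for illustration; frameworks such as RAGAS use more elaborate claim-level prompting.

```python
from typing import Callable, Sequence

def judged_faithfulness(
    question: str,
    answer: str,
    contexts: Sequence[str],
    judge: Callable[[str], str],
) -> float:
    """Ask a judge model whether the answer is grounded in the contexts.

    Returns 1.0 for a YES verdict, 0.0 otherwise (illustrative binary
    scoring; production metrics usually score individual claims).
    """
    prompt = (
        "Context:\n" + "\n".join(contexts)
        + f"\n\nQuestion: {question}\nAnswer: {answer}\n"
        + "Reply YES if the answer is fully supported by the context, else NO."
    )
    verdict = judge(prompt).strip().upper()
    return 1.0 if verdict.startswith("YES") else 0.0
```

The `judge` parameter accepts any prompt-to-text callable, so the same metric runs unchanged against a hosted API or a local model.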
Done
- [ ] Local LLM or SLM integration into the evaluation framework.
- [ ] Compliance with Apache 2.0 licensing for any fine-tuned models.
- [ ] Equivalent or improved accuracy and efficiency in evaluating RAG metrics compared to using a hosted API.
Questions to Answer
- Can existing RHEL AI models be used for RAG evaluations?
- What are the legal considerations (e.g., generating vs obtaining a training dataset) for fine-tuning an SLM for local model evaluations?
- How will we ensure that the RAG evaluation framework remains compatible with both hosted and local LLMs or SLMs?
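One empirical way to answer the hosted-vs-local equivalence question is to run the same metric with both judges over a shared case set and measure how often their scores agree. The helper below is a hypothetical sketch; the function and parameter names are not part of any existing framework.

```python
from typing import Callable, Sequence

def judge_agreement(
    cases: Sequence[dict],
    metric: Callable[[dict, Callable[[str], str]], float],
    hosted_judge: Callable[[str], str],
    local_judge: Callable[[str], str],
    tolerance: float = 0.0,
) -> float:
    """Fraction of cases where hosted and local judges agree within tolerance.

    `metric` takes one evaluation case plus a judge and returns a score;
    the same metric is run twice per case, once with each judge.
    """
    if not cases:
        return 0.0
    agree = sum(
        abs(metric(c, hosted_judge) - metric(c, local_judge)) <= tolerance
        for c in cases
    )
    return agree / len(cases)
```

A high agreement rate on a representative case set would support the "equivalent accuracy" requirement above.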
Out of Scope
- The integration of external APIs for RAG model evaluations.
- The development of a new language model from scratch.
Customer Considerations
- Ensure that the chosen LLM or SLM meets the desired evaluation accuracy and efficiency requirements.
- Consider the cost vs accuracy implications of using a local LLM or SLM compared to external APIs.
- Ensure compliance with relevant data privacy and security regulations when fine-tuning an SLM.
Depends on
- RHELAI-2309 [eval] RAG Evaluation Framework and Metrics (New)
- RHELAI-2374 [eval] Extending RAGAS Evaluation Framework with Additional Metrics (New)