Feature
Resolution: Unresolved
Feature Overview
This work aims to extend the RAG evaluation framework to work with a local LLM.
The local LLM can be an existing RHEL AI model, such as Mistral-7b, Mixtral, or Granite-3.0. Alternatively, we could fine-tune a model specifically for evaluating RAG metrics.
This improvement will enable RAG evaluation in air-gapped environments and remove the dependency on external API services.
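One way to decouple the framework from external API services is to target the OpenAI-compatible chat-completions API shape, which local runtimes such as vLLM and the llama.cpp server also expose; switching from a hosted judge to a local one then becomes a configuration change. The sketch below is illustrative only: the config field names, the localhost URL, and the model identifier are assumptions, not the framework's actual API.

```python
from dataclasses import dataclass
from typing import Callable

# A "judge" is any callable mapping a prompt to the model's text reply.
# This indirection lets the same evaluation code run against a hosted API
# or a local server (all names below are illustrative).
JudgeFn = Callable[[str], str]

@dataclass
class JudgeConfig:
    base_url: str          # e.g. "http://localhost:8000/v1" for a local vLLM server
    model: str             # e.g. a Granite or Mistral model identifier
    api_key: str = "none"  # local servers typically ignore the key

def openai_compatible_judge(cfg: JudgeConfig) -> JudgeFn:
    """Build a judge against any OpenAI-compatible /chat/completions endpoint.

    Hosted services and local runtimes (vLLM, llama.cpp server) both expose
    this API shape, so hosted -> local is a config change only.
    """
    import json
    import urllib.request

    def judge(prompt: str) -> str:
        payload = json.dumps({
            "model": cfg.model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode()
        req = urllib.request.Request(
            f"{cfg.base_url}/chat/completions",
            data=payload,
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {cfg.api_key}",
            },
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        return body["choices"][0]["message"]["content"]

    return judge
```

Because the judge is just a callable, unit tests can substitute a stub without any server running.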
Goals
- Improve the usability of RAG evaluations by using a local LLM or SLM.
- Reduce the cost and improve the efficiency of RAG metric evaluations by utilizing a local model.
- Ensure compliance with Apache 2.0 licensing for any fine-tuned models.
Requirements
- Successful integration of a local LLM or SLM into the RAG evaluation framework.
- Compliance with Apache 2.0 licensing for any fine-tuned models.
- Improved or maintained equivalent accuracy and efficiency in evaluating RAG metrics compared to external API services.
Background
The current evaluation framework relies on external APIs for language model evaluations, which can be costly and may not always provide the desired level of accuracy for specialized evaluations. By integrating a local LLM or SLM, we can improve the accuracy and efficiency of our RAG evaluations while also reducing costs for our users.
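To make the evaluation flow concrete, the sketch below shows a minimal LLM-judged faithfulness metric: the judge model is asked whether the answer is supported by the retrieved contexts. The prompt wording and the binary YES/NO scoring are simplifying assumptions for illustration; frameworks such as RAGAS use more elaborate claim-level prompting.

```python
from typing import Callable, Sequence

def judged_faithfulness(
    question: str,
    answer: str,
    contexts: Sequence[str],
    judge: Callable[[str], str],
) -> float:
    """Ask a judge model whether the answer is grounded in the contexts.

    Returns 1.0 for a YES verdict, 0.0 otherwise (illustrative binary
    scoring; production metrics usually score individual claims).
    """
    prompt = (
        "Context:\n" + "\n".join(contexts)
        + f"\n\nQuestion: {question}\nAnswer: {answer}\n"
        + "Reply YES if the answer is fully supported by the context, else NO."
    )
    verdict = judge(prompt).strip().upper()
    return 1.0 if verdict.startswith("YES") else 0.0
```

The `judge` parameter accepts any prompt-to-text callable, so the same metric runs unchanged against a hosted API or a local model.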
Done
- [ ] Local LLM or SLM integration into the evaluation framework.
- [ ] Compliance with Apache 2.0 licensing for any fine-tuned models.
- [ ] Equivalent or improved accuracy and efficiency in evaluating RAG metrics compared to using a hosted API.
Questions to Answer
- Can existing RHEL AI models be used for RAG evaluations?
- What are the legal considerations (e.g., generating vs obtaining a training dataset) for fine-tuning an SLM for local model evaluations?
- How will we ensure that the RAG evaluation framework remains compatible with both hosted and local LLMs or SLMs?
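One empirical way to answer the hosted-vs-local equivalence question is to run the same metric with both judges over a shared case set and measure how often their scores agree. The helper below is a hypothetical sketch; the function and parameter names are not part of any existing framework.

```python
from typing import Callable, Sequence

def judge_agreement(
    cases: Sequence[dict],
    metric: Callable[[dict, Callable[[str], str]], float],
    hosted_judge: Callable[[str], str],
    local_judge: Callable[[str], str],
    tolerance: float = 0.0,
) -> float:
    """Fraction of cases where hosted and local judges agree within tolerance.

    `metric` takes one evaluation case plus a judge and returns a score;
    the same metric is run twice per case, once with each judge.
    """
    if not cases:
        return 0.0
    agree = sum(
        abs(metric(c, hosted_judge) - metric(c, local_judge)) <= tolerance
        for c in cases
    )
    return agree / len(cases)
```

A high agreement rate on a representative case set would support the "equivalent accuracy" requirement above.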
Out of Scope
- The integration of external APIs for RAG model evaluations.
- The development of a new language model from scratch.
Customer Considerations
- Ensure that the chosen LLM or SLM meets the desired evaluation accuracy and efficiency requirements.
- Consider the cost vs accuracy implications of using a local LLM or SLM compared to external APIs.
- Ensure compliance with relevant data privacy and security regulations when fine-tuning an SLM.
Depends on
- RHELAI-2309 [eval] RAG Evaluation Framework and Metrics (New)
- RHELAI-2374 [eval] Extending RAGAS Evaluation Framework with Additional Metrics (New)