Type: Feature
Resolution: Unresolved
Priority: Normal
Feature Overview
This functionality aims to create a RAG (Retrieval-Augmented Generation) evaluation framework and metrics. The framework will be capable of working with one of the existing RHEL AI models, such as Mistral-7b, Mixtral, or Granite-3.0, or with another Apache 2.0 model that can be cleared with legal.
Goals
- Identify and adapt RAG evaluation metrics to work with one of the existing models or another Apache 2.0 model.
- Ensure the evaluation framework can run evaluations with a local LLM.
- Research existing RAG evaluation frameworks that can be adapted for this functionality, such as RAGAS, LlamaIndex RAG evaluators, TruLens Eval, RAGEval, and the Massive Text Embedding Benchmark (MTEB); see the sketch after this list for how such a framework can be pointed at a local model.
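As a concrete illustration of the intended setup, the sketch below wires RAGAS metrics to a locally served judge model through an OpenAI-compatible endpoint. The endpoint URL, model name, embedding path, and dataset column names are placeholders, and exact RAGAS/LangChain import paths vary by version; this is a sketch of the pattern, not a committed implementation.

```python
# Sketch: running RAGAS metrics against a locally served model.
# Assumes an OpenAI-compatible server (e.g., vLLM or llama.cpp serving
# Mistral-7B) at http://localhost:8000/v1 -- URL and names are placeholders.
from datasets import Dataset
from langchain_openai import ChatOpenAI
from langchain_huggingface import HuggingFaceEmbeddings
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

# Judge LLM: any local OpenAI-compatible endpoint works here.
judge = ChatOpenAI(
    base_url="http://localhost:8000/v1",  # local endpoint (placeholder)
    api_key="unused",                     # local servers ignore the key
    model="mistral-7b-instruct",          # placeholder model name
    temperature=0.0,
)
# Embeddings loaded from a local path, so no network access is needed.
embeddings = HuggingFaceEmbeddings(model_name="/models/all-MiniLM-L6-v2")

# One toy sample in the column layout RAGAS 0.1.x expects.
data = Dataset.from_dict({
    "question": ["What license must the framework use?"],
    "answer": ["Apache 2.0, MIT, or another unrestricted license."],
    "contexts": [["The evaluation framework must be Apache 2.0, MIT, "
                  "or unrestricted Open Source license."]],
})

result = evaluate(data, metrics=[faithfulness, answer_relevancy],
                  llm=judge, embeddings=embeddings)
print(result)
```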
Requirements
- The evaluation framework must be released under the Apache 2.0, MIT, or another unrestricted open source license.
- The framework must be capable of running evaluations with a local LLM.
- RAG evaluation frameworks must be adapted to work in local/air-gapped environments; the sketch below shows one way to smoke-test offline operation.
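One way to satisfy the air-gapped requirement is to make any hidden network dependency fail fast before an evaluation run. A minimal smoke test, assuming model artifacts are mirrored to a local path (the path below is a placeholder):

```python
# Sketch: forcing fully offline operation so any hidden download
# dependency raises immediately instead of silently calling home.
import os

# Hugging Face libraries honour these variables and refuse to download;
# they must be set before the libraries are imported.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from sentence_transformers import SentenceTransformer

# Loading from a local directory must succeed with networking disabled;
# if this raises, the artifact was not fully mirrored into the air gap.
embedder = SentenceTransformer("/models/all-MiniLM-L6-v2")
print(embedder.encode(["smoke test"]).shape)
```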
Background
The RAG evaluation framework will be used to assess the performance of RAG and to evaluate the performance of a fine-tuned model with and without RAG, as sketched below.
The framework will provide insights into the relevance and accuracy of the RAG pipeline.
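A with/without-RAG comparison can be as simple as scoring the same question set twice. In the sketch below, `generate`, `retrieve`, and `score` are hypothetical stand-ins for the fine-tuned model, the retriever, and whichever framework metric is ultimately adopted:

```python
# Sketch: comparing a fine-tuned model with and without RAG on the same
# questions. All three callables are hypothetical stand-ins.
def compare_with_without_rag(questions, ground_truths,
                             generate, retrieve, score):
    deltas = []
    for q, truth in zip(questions, ground_truths):
        plain = generate(q)  # fine-tuned model alone
        contexts = retrieve(q)
        augmented = generate(
            "Answer using the context.\n"
            f"Context: {' '.join(contexts)}\nQuestion: {q}"
        )
        # Positive delta means retrieval improved the answer.
        deltas.append(score(augmented, truth) - score(plain, truth))
    return sum(deltas) / len(deltas)
```

Reporting the per-question deltas alongside the mean would also surface cases where retrieval hurts, which is useful when tuning the retriever.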
Done
- [ ] The evaluation framework has been adapted to work with one of the existing models or another Apache 2.0 model.
- [ ] The evaluation framework can run evaluations with a local LLM.
- [ ] Evaluation metrics are reported.
Questions to Answer
Out of Scope
- The development of a new LLM for RAG evals is out of scope for this Feature.
- The integration of the evaluation framework with other systems or tools is out of scope for this Feature.
Customer Considerations
- The evaluation framework must be capable of providing accurate and reliable results relevant to RAG.
- The framework must describe its metrics in a way that is easy for end users to use and understand.
Issue Links
- is depended on by: RHELAI-2375 [eval] Local LLM for RAG Evaluation Framework (New)
- split to: RHELAI-2374 [eval] Extending RAGAS Evaluation Framework with Additional Metrics (New)