RH Developer Hub Planning / RHDHPLAN-930

Lightspeed Evaluation Data Update and Public Consumption


    • Progress: 67% To Do, 33% In Progress, 0% Done
    • Size: M

      Feature Overview (aka. Goal Summary)

      The evaluation of the Lightspeed plugin is essential to understanding the accuracy of the plugin's responses and to measuring the effectiveness of the models being used. With the evaluation framework delivered in 1.9 (https://issues.redhat.com/browse/RHDHPLAN-261), we want users to be able to reproduce the evaluation using our provided data, as well as view the evaluation data that we produce for each release. This information will help users select the model of their choice when running RHDH Lightspeed.

      Goals (aka. expected user outcomes)

      Reproduce the evaluation: Provide documentation that explains the evaluation framework and how to run the evaluation.

      Access to the evaluation Q&A data set: Provide access to a downloadable set of Q&A data (for RAG content evaluation only) that we use to produce the evaluation results we publish. There will be an official Q&A data set produced for each release, and the download must clearly indicate that the Q&A pairs are AI-generated from the product documentation.
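      For illustration only, a minimal sketch of what a single entry in such a Q&A data set could look like; the field names below are assumptions, not a published schema:

          # Illustrative sketch only; the real published schema may differ.
          # Each Q&A pair carries provenance so consumers can see it is AI-generated.
          qa_entry = {
              "question": "How do I enable the Lightspeed plugin in RHDH?",
              "answer": "Enable the plugin in the dynamic plugins configuration ...",
              "source": "RHDH 1.9 product documentation",  # material the pair was generated from
              "provenance": "AI-generated; no human verification",
              "release": "1.9",
          }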

      Access to the evaluation result: Provide access to the evaluation result report that we create for each release. The list of models selected for the reports is decided and updated in each release.
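      As a rough illustration, a per-release report entry might carry the release, the model, and per-category scores; the category names and values below (borrowed from common RAG evaluation metrics) are assumptions, not the framework's published result categories:

          # Hypothetical shape of one row in a per-release evaluation report.
          # Category names and scores are illustrative, not official results.
          report_row = {
              "release": "1.9",
              "model": "example-model",  # one of the models selected for this release
              "categories": {"answer_relevancy": 0.82, "faithfulness": 0.78},
          }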

      Requirements (aka. Acceptance Criteria):

      Documentation: Provide documentation that helps users understand:
      1. the evaluation framework
      2. how to download, set up, and run the RHDH Lightspeed evaluation tools (a minimal sketch of such a run follows this list)
      3. how to download the Q&A data set for each release
      4. where to find the evaluation result report for each RHDH release
      5. how to interpret the results, including the evaluation result categories
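      To make item 2 concrete, here is a minimal sketch of an evaluation run, assuming a JSON Q&A data set and a caller-supplied model client; the string-similarity scoring is a deliberately simple stand-in for whatever the framework actually uses:

          import json
          from difflib import SequenceMatcher

          def ask_model(question: str) -> str:
              """Hypothetical stand-in for a call to the model under evaluation."""
              raise NotImplementedError("wire this up to your model server of choice")

          def evaluate(dataset_path: str, threshold: float = 0.7) -> dict:
              """Score each model answer against the reference answer in the Q&A set."""
              with open(dataset_path) as f:
                  qa_pairs = json.load(f)
              passed = 0
              for pair in qa_pairs:
                  answer = ask_model(pair["question"])
                  # Simple lexical similarity; the real framework may use
                  # embedding- or LLM-based scoring instead.
                  if SequenceMatcher(None, answer, pair["answer"]).ratio() >= threshold:
                      passed += 1
              return {"total": len(qa_pairs), "passed": passed,
                      "accuracy": passed / len(qa_pairs) if qa_pairs else 0.0}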

      Q&A data set: The Q&A data set is AI-generated from the product documentation for the corresponding RHDH release. No human verification will be provided.

      Allow off-release-cycle data publishing: The documentation should link to the centralized location where users can find the Q&A data set and evaluation result report for each release. This allows the data to be published after a release, once the RAG content has been updated.
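      A sketch of what consuming that centralized location could look like; the base URL and per-release path layout are placeholder assumptions, since the actual location has not been published:

          import urllib.request

          # Placeholder base URL; the real centralized location is TBD.
          BASE = "https://example.com/rhdh-lightspeed-eval"

          def fetch_dataset(release: str, dest: str) -> None:
              """Download the Q&A data set for a given RHDH release.
              Per-release paths let data be re-published off-cycle (e.g. after
              a RAG content update) without changing the documentation links."""
              urllib.request.urlretrieve(f"{BASE}/{release}/qa-dataset.json", dest)

          fetch_dataset("1.9", "qa-1.9.json")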

      Out of Scope (Optional)

      This feature covers only the publishing of the Q&A data set and evaluation results for the 1.8 and 1.9 releases. We'll create a dedicated off-release-cycle feature for publishing the Q&A data set and evaluation results for each future release.

      Customer Considerations (Optional)

      Customers should be able to pick their own model and run the evaluation with our provided Q&A data set. This allows users to select the model server of their choice.
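      For example, pointing the evaluation at a customer-chosen model server might amount to a configuration like the following; the keys and values are illustrative assumptions, not the tool's actual configuration format:

          # Hypothetical evaluation configuration; keys are illustrative only.
          eval_config = {
              "model_server": "https://models.internal.example.com/v1",  # customer's endpoint
              "model": "any-model-the-server-hosts",
              "dataset": "qa-1.9.json",  # the provided per-release Q&A data set
          }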

      Documentation Considerations

      See the details in the documentation section under Goals.

              yangcao Stephanie Cao
              eyuen@redhat.com Elson Yuen
              RHDH AI