Type: Story
Resolution: Unresolved
Priority: Major
Story
Run Lightspeed eval tool against Developer Lightspeed 1.9
The evaluation will run 2-3 large/medium models and 2 small models to compare against each other.
The evaluation result for each model will follow a standard format; the results will be collected and internally published as part of this issue.
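For illustration only, a driver for this comparison could loop over the candidate models and collect one result set per model. The model identifiers, dataset path, and run_eval() helper below are assumptions standing in for the real Lightspeed eval tool, not its actual API:
{code:python}
# Hypothetical sketch only: run_eval() stands in for the real Lightspeed
# eval tool, and the dataset path is a placeholder for the RHIDP-11530 data.
import json

CANDIDATE_MODELS = [
    # large/medium models
    "gemini-2.5-pro",
    "gpt-oss:120b",
    "llama4:scout",
    # small models
    "llama3:8b",
    "gemini-2.5-flash-lite",
]


def run_eval(model: str, dataset_path: str) -> dict:
    # Stand-in for invoking the eval tool against one model; returns a
    # dummy record so the sketch runs end to end.
    return {"model": model, "dataset": dataset_path, "scores": {}}


def main() -> None:
    results = {model: run_eval(model, "rhidp-11530-dataset.jsonl")
               for model in CANDIDATE_MODELS}
    # Collect all per-model results into a single artifact for publishing.
    with open("eval-results.json", "w") as fh:
        json.dump(results, fh, indent=2)


if __name__ == "__main__":
    main()
{code}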
Background
With the evaluation framework available in 1.9 (https://issues.redhat.com/browse/RHDHPLAN-261), we also want to run the evaluation against the Developer Lightspeed 1.9 release.
Dependencies and Blockers
https://issues.redhat.com/browse/RHIDP-11530 needs to be done first, so that we have the dataset for running the evaluation.
Acceptance Criteria
Select the models for testing, then run 2-3 medium/large models and 2-3 small models for evaluation:
3 large/medium models:
gemini-2.5-pro
gpt-oss:120b
llama4:scout
2 small models:
llama3:8b
gemini-2.5-flash-lite
A standard format for reports should be created as part of this work (one possible shape is sketched below).
Reports should be generated in this standard format and internally published.
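As a starting point for that report standard, one possible per-model record is sketched below; every field name here is an assumption, to be replaced by whatever the agreed standard defines:
{code:python}
# Hypothetical per-model report record; field names are illustrative only.
from dataclasses import dataclass, field


@dataclass
class ModelEvalReport:
    model: str                  # e.g. "gemini-2.5-pro"
    size_class: str             # "large/medium" or "small"
    dataset: str                # dataset from RHIDP-11530
    lightspeed_version: str     # e.g. "1.9"
    scores: dict = field(default_factory=dict)  # metric name -> value
    notes: str = ""             # free-form observations
{code}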