Red Hat Internal Developer Platform / RHIDP-9982

Developer Lightspeed Standard Evaluation Dataset Creation

    • Type: Epic
    • Resolution: Unresolved
    • Priority: Undefined
    • 1.9.0
    • None
    • lightspeed
    • None
    • Developer Lightspeed Standard Evaluation Dataset Creation
    • False
    • False
    • RHDHPLAN-261 [Lightspeed] Evaluations - testing accuracy and efficacy across models
    • To Do
    • RHDHPLAN-261 - [Lightspeed] Evaluations - testing accuracy and efficacy across models
    • 100% To Do, 0% In Progress, 0% Done

      Epic Goal

      Develop a comprehensive, standardized Q&A dataset covering the Developer Lightspeed plugin's knowledge domain, for example:

      • Backstage
      • Red Hat Developer Hub (RHDH)
      • Kubernetes
      • OpenShift
      • CI/CD
      • GitOps
      • Pipelines
      • Developer Portals
      • Deployments
      • Software Catalogs
      • Software Templates
      • TechDocs

       

      Lightspeed-core dataset: https://gitlab.cee.redhat.com/lightspeed-core/evaluation-data

       

      • Why is this important?

      Scenarios

      • Define Dataset Scope: Identify key topics, user personas, and question categories to be covered (e.g., RAG-specific documentation, common RHDH tasks, troubleshooting).
      • Source & Write Q&A Pairs: Collaborate with Subject Matter Experts (SMEs), documentation teams, and product managers to generate a robust list of questions and their "golden" or expected answers.
      • Format Dataset: Convert the Q&A pairs into the eval_data.yaml format required by the Lightspeed Core evaluation tool.
      • Documentation
      • Stretch: Provide instructions for users to customize the dataset so they can evaluate their own models in future BYOK and BYO MCP scenarios.
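
      As a rough illustration of the "Format Dataset" step, the sketch below turns a list of Q&A pairs into YAML text. The key names (conversation_group_id, turns, query, expected_response), the QAPair class, and the sample entry are assumptions made for this example; the actual eval_data.yaml schema should be taken from the Lightspeed Core evaluation tool. It uses only the Python standard library and relies on the fact that a JSON string is also a valid YAML double-quoted scalar.

```python
import json
from dataclasses import dataclass


@dataclass
class QAPair:
    """One question with its SME-approved 'golden' answer (hypothetical shape)."""
    topic: str
    question: str
    expected_answer: str


def to_eval_yaml(pairs):
    """Render Q&A pairs as YAML text.

    The keys emitted here are assumptions for illustration, not the
    confirmed eval_data.yaml schema.
    """
    lines = []
    for index, pair in enumerate(pairs, start=1):
        group_id = f"{pair.topic}-{index:03d}"
        # json.dumps produces a double-quoted, escaped string that YAML accepts.
        lines.append(f"- conversation_group_id: {json.dumps(group_id)}")
        lines.append("  turns:")
        lines.append(f"    - query: {json.dumps(pair.question)}")
        lines.append(f"      expected_response: {json.dumps(pair.expected_answer)}")
    return "\n".join(lines) + "\n"


# Placeholder entry; real pairs would come from SMEs and documentation teams.
sample = [
    QAPair(
        topic="techdocs",
        question="What is TechDocs?",
        expected_answer="A docs-like-code solution built into Backstage.",
    )
]
print(to_eval_yaml(sample))
```

      Keeping the source pairs in a small typed structure and generating the YAML from it would let the same pairs be re-emitted if the required format changes.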

      Acceptance Criteria (Mandatory)

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • ...
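
      One way the automated-tests criterion could apply to the dataset itself is a sanity check that runs in CI before any model evaluation. The sketch below is a minimal example under stated assumptions: the entry fields (topic, question, expected_answer), the find_problems helper, and the specific rules are hypothetical and are not defined by this epic.

```python
def find_problems(pairs, allowed_topics):
    """Return a list of human-readable problems; an empty list means the check passes.

    `pairs` is a list of dicts with hypothetical keys: topic, question, expected_answer.
    """
    problems = []
    seen_questions = set()
    for index, pair in enumerate(pairs):
        question = pair.get("question", "").strip()
        answer = pair.get("expected_answer", "").strip()
        topic = pair.get("topic", "")

        if not question:
            problems.append(f"entry {index}: empty question")
        if not answer:
            problems.append(f"entry {index}: empty expected answer")
        if topic not in allowed_topics:
            problems.append(f"entry {index}: topic {topic!r} is out of scope")

        # Treat questions that differ only by case as duplicates.
        key = question.lower()
        if question and key in seen_questions:
            problems.append(f"entry {index}: duplicate question")
        seen_questions.add(key)
    return problems


# Placeholder data; the in-scope topics would come from the agreed dataset scope.
topics = {"Backstage", "Kubernetes", "OpenShift"}
dataset = [
    {
        "topic": "Kubernetes",
        "question": "What is a Pod?",
        "expected_answer": "The smallest deployable unit in Kubernetes.",
    }
]
for problem in find_problems(dataset, topics):
    print(problem)
```

      A CI job could fail the build whenever the returned list is non-empty, which keeps malformed or out-of-scope entries from reaching the evaluation run.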

      Dependencies (internal and external)

      1. ...

      Previous Work (Optional):

      Open questions:


      Done Checklist

      • Acceptance criteria are met
      • Non-functional properties of the Feature have been validated (such as performance, resource, UX, security or privacy aspects)
      • User Journey automation is delivered
      • Support and SRE teams have the skills needed to support the feature in a production environment

              Unassigned
              yangcao Stephanie Cao
              RHIDP - AI