• Epic
    • Resolution: Unresolved
    • Undefined
    • 1.9.0
    • None
    • model-catalog
    • None
    • Model Benchmarking & Baseline Establishment
    • False
    • False
    • In Progress
    • RHDHPLAN-261 - [Lightspeed] Evaluations - testing accuracy and efficacy across models
    • 33% To Do, 33% In Progress, 33% Done

      Epic Goal

      Run the evaluation suite against a range of models to establish baseline accuracy numbers and recommend models based on the results.

      Note: Because only a limited number of models are accessible for comparison, and the large Q&A set was generated by AI without full manual review, the evaluation results and accuracy numbers at this stage are for internal reference only.

      Why is this important?

      Scenarios

      • Identify Candidate Models: Select the models for testing, ensuring at least one medium/large model (for cluster) and one small model (for local).
      • Analyze Results: Collect and analyze the accuracy reports from all model tests.
      • Publish Recommendations: Document and publish the baseline accuracy numbers internally, along with the recommended models for both cluster and local use.
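
      The analysis and recommendation steps above could be sketched roughly as follows. This is a minimal illustration, not the evaluation suite's actual code: the `EvalResult` record, the `"cluster"`/`"local"` tier labels, and the model names in the usage note are all hypothetical, and "highest accuracy per tier" is an assumed selection rule.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """One judged Q&A item for one model (hypothetical record shape)."""
    model: str     # hypothetical model identifier
    tier: str      # assumed labels: "cluster" (medium/large) or "local" (small)
    correct: bool  # whether the model's answer was judged correct


def baseline_accuracy(results):
    """Return {(tier, model): accuracy} computed over all judged items."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r in results:
        key = (r.tier, r.model)
        totals[key] += 1
        hits[key] += int(r.correct)
    return {key: hits[key] / totals[key] for key in totals}


def recommend(accuracy):
    """Return {tier: (model, accuracy)} for the most accurate model per tier.

    Ties keep the first model encountered; a real selection would likely
    weigh other factors too (latency, cost, resource footprint).
    """
    best = {}
    for (tier, model), acc in accuracy.items():
        if tier not in best or acc > best[tier][1]:
            best[tier] = (model, acc)
    return best
```

      For example, `recommend(baseline_accuracy(results))` returns one entry per tier, which could feed the internally published recommendation. Because the Q&A set is AI-generated and not fully reviewed, numbers produced this way should be read as indicative only.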

      Acceptance Criteria (Mandatory)

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • ...

      Dependencies (internal and external)

      1. ...

      Previous Work (Optional):

      Open questions:

      •  

      Done Checklist

      • Acceptance criteria are met
      • Non-functional properties of the Feature have been validated (such as performance, resource, UX, security or privacy aspects)
      • User Journey automation is delivered
      • Support and SRE teams are provided with enough skills to support the feature in production environment

              yangcao Stephanie Cao
              yangcao Stephanie Cao
              RHDH AI

                Created:
                Updated: