  1. AI Platform Core Components
  2. AIPCC-3167

Support Benchmarking Across Diverse llm-d Configurations

    • Initiative
    • Resolution: Duplicate
    • Model Validation

      Extend the JBenchmark system to support structured benchmarking across multiple llm-d configuration scenarios. This task aims to evaluate how different llm-d runtime settings affect model performance, stability, and resource efficiency when integrated with GuideLLM.

       

      This includes the ability to:

      • Benchmark under different router configurations (e.g., batching strategy, latency targets)
      • Evaluate various Placement/Dispatch (P/D) strategies (e.g., static vs. dynamic node selection, GPU/resource awareness)
      • Run over different GuideLLM datasets to simulate a variety of enterprise use cases
      • Capture rich metadata for every benchmark run to enable reproducible comparisons
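
One way to make runs reproducible and comparable is to store each result together with the exact configuration that produced it, plus a stable identifier derived from that configuration. The following is a minimal sketch only: the function names, record fields, and dataset label are hypothetical and do not come from JBenchmark, GuideLLM, or llm-d.

```python
import hashlib
import json
from datetime import datetime, timezone


def config_fingerprint(config: dict) -> str:
    """Return a short, stable hash of a configuration dictionary."""
    # sort_keys makes the serialization independent of key insertion order,
    # so two identical configurations always map to the same identifier.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def build_run_record(config: dict, dataset: str, metrics: dict) -> dict:
    """Bundle one benchmark run with the metadata needed to reproduce it.

    All field names here are illustrative assumptions.
    """
    return {
        "config": config,
        "config_id": config_fingerprint(config),
        "dataset": dataset,
        "metrics": metrics,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
```

Runs that share a `config_id` used the same settings, so they can be grouped or compared without diffing the full configuration.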

       

       

      Examples of Configurable Parameters:

      • Router strategies (e.g., round-robin, latency-aware)
      • Batching configurations and max concurrency
      • P/D policies and node-affinity rules
      • Dataset variability (prompt types, lengths, formats)
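
The parameters above could be expressed as a grid that expands into one scenario per combination. This is a hedged sketch, not the JBenchmark schema: the `Scenario` fields, the `expand_scenarios` helper, and the dataset labels are assumptions made for illustration, and the strategy values are taken from the examples in this ticket.

```python
from dataclasses import asdict, dataclass
from itertools import product


@dataclass(frozen=True)
class Scenario:
    """One point in the benchmark grid (hypothetical field names)."""
    router_strategy: str
    max_concurrency: int
    pd_policy: str
    dataset: str


def expand_scenarios(grid: dict) -> list:
    """Expand a dict of parameter lists into every combination."""
    keys = list(grid)
    combos = product(*(grid[key] for key in keys))
    return [Scenario(**dict(zip(keys, values))) for values in combos]


# Illustrative values only; real option names depend on the llm-d deployment.
GRID = {
    "router_strategy": ["round-robin", "latency-aware"],
    "max_concurrency": [8, 32],
    "pd_policy": ["static", "dynamic"],
    "dataset": ["short-prompts", "long-prompts"],
}

scenarios = expand_scenarios(GRID)
first = asdict(scenarios[0])  # plain dict, convenient for tagging a run
```

A full cross product grows quickly, so a real sweep would likely prune combinations that are not meaningful together.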

       

      Acceptance Criteria:

      • Benchmark runs can be parameterized with different llm-d settings
      • All runs are tagged with full config metadata
      • Results are stored, queryable, and comparable via the benchmarking dashboard
      • Significant configuration impacts are highlighted in reports for GuideLLM/llm-d stakeholders
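
As a rough illustration of the last criterion, a report could flag metrics whose relative change from a baseline configuration exceeds a threshold. The sketch below assumes hypothetical metric names and an arbitrary 10% cutoff; it is a simple relative-change filter, not a statistical significance test, and a real report would likely also account for run-to-run variance.

```python
def flag_significant(baseline: dict, candidate: dict, threshold: float = 0.10) -> dict:
    """Return metrics whose relative change from baseline exceeds threshold.

    Values are signed relative changes, e.g. 0.3 means 30% higher than baseline.
    """
    flagged = {}
    for name, base_value in baseline.items():
        if name not in candidate or base_value == 0:
            continue  # a relative change cannot be computed for this metric
        change = (candidate[name] - base_value) / base_value
        if abs(change) > threshold:
            flagged[name] = round(change, 4)
    return flagged
```

For example, comparing a baseline with a p95 latency of 200 ms against a candidate at 260 ms would flag that metric with a change of 0.3, while a 5% throughput difference would be left out.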

              rh-ee-abadli Aviran Badli (Inactive)