Initiative
Resolution: Duplicate
Extend the JBenchmark system to support structured benchmarking across multiple llm-d configuration scenarios. This task aims to evaluate how different llm-d runtime settings affect model performance, stability, and resource efficiency when integrated with GuideLLM.
This includes the ability to:
- Benchmark under different router configurations (e.g., batching strategy, latency targets)
- Evaluate various Placement/Dispatch (P/D) strategies (e.g., static vs. dynamic node selection, GPU/resource awareness)
- Run benchmarks against different GuideLLM datasets to simulate a variety of enterprise use cases
- Capture rich metadata for every benchmark run to enable reproducible comparisons (a minimal config sketch follows this list)
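As a point of reference, here is a minimal sketch of how one benchmark scenario could be described so that its full configuration travels with the run. The names (`ScenarioConfig`, `RouterConfig`, `PDPolicy`) and field choices are illustrative assumptions, not existing JBenchmark, llm-d, or GuideLLM APIs.

```python
# Hypothetical scenario description; all class and field names are assumptions.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json
import uuid


@dataclass
class RouterConfig:
    strategy: str = "round-robin"        # e.g. "round-robin", "latency-aware"
    batching: str = "dynamic"            # batching strategy applied by the router
    max_concurrency: int = 32
    latency_target_ms: int | None = None


@dataclass
class PDPolicy:
    node_selection: str = "static"       # "static" or "dynamic"
    gpu_aware: bool = True
    node_affinity: dict = field(default_factory=dict)


@dataclass
class ScenarioConfig:
    name: str
    router: RouterConfig
    pd_policy: PDPolicy
    dataset: str                         # GuideLLM dataset identifier
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def metadata(self) -> dict:
        """Full config metadata attached to every benchmark run."""
        return asdict(self)


if __name__ == "__main__":
    scenario = ScenarioConfig(
        name="latency-aware-chat",
        router=RouterConfig(strategy="latency-aware", latency_target_ms=200),
        pd_policy=PDPolicy(node_selection="dynamic"),
        dataset="guidellm-chat-prompts",
    )
    print(json.dumps(scenario.metadata(), indent=2))
```

Serializing the whole scenario (rather than only the parameters under test) is what makes later runs reproducible and comparable.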
Examples of Configurable Parameters:
- Router strategies (e.g., round-robin, latency-aware)
- Batching configurations and max concurrency
- P/D policies and node-affinity rules
- Dataset variability (prompt types, lengths, formats); a parameter-sweep sketch follows this list
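A sweep over these parameters could be expanded into a grid of concrete scenarios, as in the sketch below. The parameter names and values are placeholders for illustration, not actual llm-d or GuideLLM settings.

```python
# Hypothetical parameter sweep; values are illustrative only.
from itertools import product

ROUTER_STRATEGIES = ["round-robin", "latency-aware"]
MAX_CONCURRENCY = [8, 32, 128]
PD_POLICIES = ["static", "dynamic"]
DATASETS = ["short-chat", "long-context", "code-generation"]


def build_scenarios() -> list[dict]:
    """Cartesian product of the configurable parameters, one dict per run."""
    scenarios = []
    for strategy, concurrency, pd_policy, dataset in product(
        ROUTER_STRATEGIES, MAX_CONCURRENCY, PD_POLICIES, DATASETS
    ):
        scenarios.append(
            {
                "router_strategy": strategy,
                "max_concurrency": concurrency,
                "pd_policy": pd_policy,
                "dataset": dataset,
            }
        )
    return scenarios


if __name__ == "__main__":
    grid = build_scenarios()
    print(f"{len(grid)} benchmark scenarios")  # 2 * 3 * 2 * 3 = 36
    print(grid[0])
```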
Acceptance Criteria:
- Benchmark runs can be parameterized with different llm-d settings
- All runs are tagged with full config metadata
- Results are stored, queryable, and comparable via the benchmarking dashboard (see the recording/comparison sketch after this list)
- Significant configuration impacts are highlighted in reports for GuideLLM/llm-d stakeholders
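To make the tagging and comparison criteria concrete, the sketch below records each run's metrics together with its config metadata and aggregates a metric across one configuration dimension. The JSONL record layout and metric names are assumptions, not an existing JBenchmark schema.

```python
# Hypothetical result store; record layout and metric names are assumptions.
import json
from pathlib import Path

RESULTS_FILE = Path("benchmark_results.jsonl")


def record_run(config: dict, metrics: dict) -> None:
    """Append one benchmark run, tagged with its full config metadata."""
    with RESULTS_FILE.open("a") as f:
        f.write(json.dumps({"config": config, "metrics": metrics}) + "\n")


def compare(param: str, metric: str) -> dict:
    """Group stored runs by one config parameter and average a metric."""
    groups: dict = {}
    if not RESULTS_FILE.exists():
        return groups
    for line in RESULTS_FILE.read_text().splitlines():
        run = json.loads(line)
        key = run["config"].get(param)
        groups.setdefault(key, []).append(run["metrics"][metric])
    return {k: sum(v) / len(v) for k, v in groups.items()}


if __name__ == "__main__":
    record_run(
        {"router_strategy": "latency-aware", "dataset": "short-chat"},
        {"p95_latency_ms": 310.0, "throughput_rps": 42.5},
    )
    record_run(
        {"router_strategy": "round-robin", "dataset": "short-chat"},
        {"p95_latency_ms": 455.0, "throughput_rps": 39.1},
    )
    print(compare("router_strategy", "p95_latency_ms"))
```

A query layer like this (grouping runs by any config field) is the kind of comparison the dashboard would surface when highlighting significant configuration impacts.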