Initiative
Resolution: Done
Description:
Enable performance benchmarking of a single deployed model across multiple versions of the vLLM inference engine. This capability is essential for evaluating engine version regressions, improvements, and compatibility under real-world load.
The system should allow users (e.g., ML engineers) to:
- Define a list of vLLM versions to benchmark (e.g., v0.2.4, v0.3.1, v0.4.0, main)
- Run performance benchmarks against the same model using identical workload settings
Goal: Provide clear, comparable performance metrics across vLLM versions to support upgrade decisions, regression detection, and engine tuning. This task complements multi-config testing and enables fine-grained engine evolution analysis.
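As a rough illustration of the intended workflow (not part of the ticket), the sketch below benchmarks one model under identical workload settings across several vLLM versions, isolating each engine build in its own virtual environment. The model ID, the `run_benchmark.py` driver script, its flags, and the `throughput_tokens_per_s` metric key are hypothetical placeholders; substitute the actual benchmark entry point (e.g. vLLM's own benchmark scripts) in practice.

```python
"""Sketch: benchmark one model across multiple vLLM engine versions."""
import json
import subprocess
import sys
from pathlib import Path

MODEL = "meta-llama/Llama-2-7b-hf"            # placeholder model ID
VERSIONS = ["0.2.4", "0.3.1", "0.4.0"]        # versions listed in the ticket
WORKLOAD = ["--num-prompts", "1000", "--request-rate", "8"]  # identical workload settings


def bench_version(version: str) -> dict:
    """Install one vLLM version in an isolated venv and run the benchmark."""
    env_dir = Path(f".venv-vllm-{version}")
    # One environment per engine version so builds do not conflict.
    subprocess.run([sys.executable, "-m", "venv", str(env_dir)], check=True)
    py = env_dir / "bin" / "python"  # use Scripts/python.exe on Windows
    subprocess.run([str(py), "-m", "pip", "install", f"vllm=={version}"], check=True)

    out = Path(f"results-{version}.json")
    # `run_benchmark.py` is a hypothetical driver that writes JSON metrics.
    subprocess.run(
        [str(py), "run_benchmark.py", "--model", MODEL, "--output", str(out), *WORKLOAD],
        check=True,
    )
    return json.loads(out.read_text())


if __name__ == "__main__":
    results = {v: bench_version(v) for v in VERSIONS}
    for v, metrics in results.items():
        # `throughput_tokens_per_s` is an assumed metric key for comparison.
        print(v, metrics.get("throughput_tokens_per_s"))
```

Pinning each version to its own environment avoids dependency conflicts between engine releases, while holding the workload definition (prompts, request rate) constant across runs is what makes the resulting metrics comparable for regression detection.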