Initiative
Resolution: Duplicate
Critical
This epic focuses on establishing a reliable, automated end-to-end and integration testing framework for the JBenchmark platform. The purpose is to ensure that every part of the benchmarking pipeline — from input to database — functions correctly and consistently, across configurations and environments.
This effort will lay the quality foundation for Red Hat onboarding, providing engineering-grade confidence in deployments, while enabling fast and safe iteration.
The suite will validate:
- Pipeline configuration logic (via Kubernetes/Argo resource inspection; see the sketch after this list)
- Run-time execution and output correctness (via full-flow benchmark tests)
- System behavior across cache, warmup, profiling, and machine configs
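For illustration, a configuration-inspection test could query the rendered Kubernetes resources and assert on their fields. The sketch below assumes the official `kubernetes` Python client and pytest; the namespace, label selector, and expected GPU limit are hypothetical placeholders, not actual JBenchmark values.

```python
# Minimal sketch of a pipeline-configuration test, assuming the official
# `kubernetes` Python client and pytest. Namespace, label selector, and the
# expected GPU limit are hypothetical placeholders.
from kubernetes import client, config


def _benchmark_pods(namespace="jbenchmark", selector="app=vllm-benchmark"):
    """Return the pods rendered by the benchmark pipeline configuration."""
    config.load_kube_config()  # use load_incluster_config() when running inside the cluster
    return client.CoreV1Api().list_namespaced_pod(
        namespace, label_selector=selector
    ).items


def test_vllm_pod_requests_expected_gpu():
    # Each rendered pod should carry the GPU limit the scenario asked for.
    pods = _benchmark_pods()
    assert pods, "no benchmark pods were rendered"
    for pod in pods:
        for container in pod.spec.containers:
            limits = container.resources.limits or {}
            assert limits.get("nvidia.com/gpu") == "1", (
                f"{pod.metadata.name}/{container.name} is missing the expected GPU limit"
            )
```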
Tests are integrated directly into the GitHub CI/CD pipeline and triggered on every pull request. Developers receive fast feedback and can catch regressions before merging, which significantly reduces debugging time and increases trust in the system.
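One way to keep pull-request feedback fast is to mark the long-running happy-flow tests separately from the quick configuration checks. This is only a sketch: the `e2e` marker name and the placeholder assertions are assumptions, not the project's actual conventions.

```python
# Sketch of splitting quick checks from long-running tests with pytest markers.
# The "e2e" marker name and the placeholder values are assumptions.
import pytest


def test_profiler_arguments_are_rendered():
    # Quick configuration check: cheap enough to run on every pull request.
    rendered_args = ["--profile", "--profile-dir", "/tmp/profiles"]  # stand-in values
    assert "--profile" in rendered_args


@pytest.mark.e2e
def test_full_benchmark_happy_flow():
    # Long-running end-to-end run: selected explicitly, e.g. after merge.
    ...


# The pull-request job would run only the quick subset:
#   pytest -m "not e2e"
# while a post-merge or nightly job runs the full suite:
#   pytest -m e2e
```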
Test Categories:
- Cache logic
- Machine/GPU configuration
- Profiler arguments
- Warmup logic
- vLLM pod configuration
- JSON-to-DB integrity (raw result)
- Full End-to-End (Happy Flow) Tests: longer-running tests that validate numerical results against trusted historical snapshots, ensuring both success status and data accuracy.
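As an illustration of the snapshot comparison, the sketch below loads a trusted baseline and the latest run results from JSON and asserts each metric within a tolerance. The file paths, JSON layout, and 5% relative tolerance are assumptions, not JBenchmark's actual format.

```python
# Sketch of a happy-flow snapshot check. File paths, the JSON layout, and the
# 5% relative tolerance are illustrative assumptions.
import json
from pathlib import Path

import pytest


def load_run(path: str) -> dict:
    """Load the JSON produced by a benchmark run (or a trusted snapshot of one)."""
    return json.loads(Path(path).read_text())


def test_happy_flow_matches_trusted_snapshot():
    baseline = load_run("snapshots/baseline_run.json")  # hypothetical snapshot path
    current = load_run("results/latest_run.json")       # hypothetical results path

    # The run must both succeed and reproduce every trusted metric within tolerance.
    assert current["status"] == "success"
    for metric, expected in baseline["metrics"].items():
        assert current["metrics"][metric] == pytest.approx(expected, rel=0.05), (
            f"{metric} drifted from the trusted snapshot"
        )
```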
Definition of Done (DoD):
- All test categories above are implemented and run successfully in CI.
- All benchmark scenarios use trusted parameters and assert correctness across pods, resources, and database entries (a sketch of the database check follows this list).
- Remaining items (e.g., tests for statuses, expanders, multi-scenario configs, saturation) are tracked as TODOs and attached to this epic.
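For the database-entry assertion mentioned above, one possible shape of a JSON-to-DB integrity check is sketched here. sqlite3 stands in for the real database, and the table name, column names, and raw result are hypothetical.

```python
# Sketch of a JSON-to-DB integrity check. sqlite3 stands in for the real
# database; the table/column names and the raw result are hypothetical.
import json
import sqlite3


def test_raw_result_matches_db_row():
    raw = json.loads('{"run_id": "run-42", "throughput": 123.4}')  # stand-in raw result

    # In the real test the pipeline would have written this row; here we seed it
    # ourselves so the sketch stays self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE benchmark_results (run_id TEXT, throughput REAL)")
    conn.execute(
        "INSERT INTO benchmark_results VALUES (?, ?)",
        (raw["run_id"], raw["throughput"]),
    )

    # The stored row must match the raw JSON result field for field.
    row = conn.execute(
        "SELECT run_id, throughput FROM benchmark_results WHERE run_id = ?",
        (raw["run_id"],),
    ).fetchone()
    assert row == (raw["run_id"], raw["throughput"])
```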