- Initiative
- Resolution: Duplicate
Extend JBenchmark to support benchmarking across a diverse set of cloud instance types, so we can evaluate how generative AI models perform in different hardware environments. This enables teams to make data-driven deployment decisions based on performance, cost, and scalability tradeoffs across clouds and GPU SKUs.
This task includes:
- Benchmark orchestration on GCP, AWS, Azure, and on-prem instances
- Support for different GPU models (e.g., A100, H100, L4, AMD MI300, IBM SPU)
- Capturing full instance metadata (cloud vendor, machine type, cost/hour, region, etc.); see the record sketch after this list
- Aggregating results by instance family to enable performance/cost comparisons
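For illustration only, a minimal Python sketch of what a tagged benchmark record could look like; the class names, field names, and serialization below are placeholders for discussion, not JBenchmark's actual schema or API:

```python
# Hypothetical sketch -- names, fields, and values are assumptions, not the final schema.
from dataclasses import dataclass, asdict
import json


@dataclass
class InstanceMetadata:
    """Instance-level tags attached to every benchmark run."""
    cloud_vendor: str        # e.g. "gcp", "aws", "azure", "on-prem"
    machine_type: str        # e.g. "a2-highgpu-1g", "p4d.24xlarge"
    gpu_model: str           # e.g. "A100", "H100", "L4", "MI300"
    gpu_count: int
    region: str
    cost_per_hour_usd: float


@dataclass
class BenchmarkRun:
    """A single benchmark result, tagged with the instance it ran on."""
    model_name: str
    tokens_per_second: float
    latency_p50_ms: float
    instance: InstanceMetadata


def to_record(run: BenchmarkRun) -> str:
    """Serialize a tagged run so it can later be aggregated by instance family."""
    return json.dumps(asdict(run))


if __name__ == "__main__":
    # Illustrative values only.
    run = BenchmarkRun(
        model_name="example-7b",
        tokens_per_second=1234.5,
        latency_p50_ms=42.0,
        instance=InstanceMetadata(
            cloud_vendor="gcp",
            machine_type="a2-highgpu-1g",
            gpu_model="A100",
            gpu_count=1,
            region="us-central1",
            cost_per_hour_usd=3.67,
        ),
    )
    print(to_record(run))
```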
Goals:
- Provide customers and internal teams with comparative metrics across:
  - Different cloud providers
  - GPU types
  - Instance shapes (single-GPU, multi-GPU, CPU fallback, etc.)
- Validate models and inference configurations in environments aligned with real-world deployments
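To make the performance/cost comparison concrete, here is a rough aggregation sketch over record dicts shaped like the tagged run above; the grouping key and the tokens-per-dollar-hour metric are illustrative choices, not the final reporting design:

```python
# Illustrative aggregation sketch -- assumes record dicts shaped like the tagged
# run above; the real pipeline and reported metrics may differ.
from collections import defaultdict
from statistics import mean


def aggregate_by_instance(records: list[dict]) -> dict[str, dict[str, float]]:
    """Group tagged runs by cloud vendor + machine type and compare cost-normalized throughput."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for rec in records:
        inst = rec["instance"]
        groups[f"{inst['cloud_vendor']}/{inst['machine_type']}"].append(rec)

    summary: dict[str, dict[str, float]] = {}
    for key, runs in groups.items():
        tps = mean(r["tokens_per_second"] for r in runs)
        cost = mean(r["instance"]["cost_per_hour_usd"] for r in runs)
        summary[key] = {
            "mean_tokens_per_second": tps,
            "mean_cost_per_hour_usd": cost,
            "tokens_per_dollar_hour": tps / cost if cost else 0.0,
        }
    return summary
```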
Acceptance Criteria:
- Benchmark can be triggered across at least 3 major cloud providers
- Each run is tagged with cloud/instance metadata