- Epic
- Resolution: Unresolved
- Major
- None
Critical for OpenShift AI
- Add a GPU per-hour rate to the price list in OCP cost models.
- Support dedicated GPUs.
- Support NVIDIA GPUs.
- On-premise, itemize the cost of GPUs separately, since they are a scarce resource compared to CPU, memory, or storage.
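A minimal sketch of how a per-hour GPU rate could be applied and itemized per namespace. This is only an illustration of the arithmetic, not the cost model's actual implementation: the `GpuUsage` record, the `gpu_cost` and `itemize` helpers, and the rate value are all hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class GpuUsage:
    """Hypothetical usage record: GPUs held by a namespace for some hours."""
    namespace: str
    gpu_model: str
    gpu_count: int
    hours: Decimal


def gpu_cost(usage: GpuUsage, rates: dict[str, Decimal]) -> Decimal:
    """Cost = per-hour rate for the model * number of GPUs * hours held."""
    return rates[usage.gpu_model] * usage.gpu_count * usage.hours


def itemize(usages: list[GpuUsage], rates: dict[str, Decimal]) -> dict[str, Decimal]:
    """Sum GPU cost per namespace so it appears as its own line item."""
    totals: dict[str, Decimal] = {}
    for u in usages:
        totals[u.namespace] = totals.get(u.namespace, Decimal("0")) + gpu_cost(u, rates)
    return totals


# Illustrative price list entry (made-up rate, assumed per GPU per hour).
rates = {"A100": Decimal("2.50")}
usages = [
    GpuUsage("ml-train", "A100", 2, Decimal("10")),
    GpuUsage("ml-train", "A100", 1, Decimal("4")),
    GpuUsage("inference", "A100", 1, Decimal("1.5")),
]
totals = itemize(usages, rates)
```

Keying the rate by GPU model is an assumption made here so that different hardware can carry different prices; a single flat rate would work the same way with a one-entry price list.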
Notes
- Users would have to install the NVIDIA GPU Operator to get GPU metrics into Prometheus. See https://github.com/NVIDIA/dcgm-exporter.
- https://docs.nvidia.com/datacenter/cloud-native/openshift/23.9.2/time-slicing-gpus-in-openshift.html
- https://docs.openshift.com/container-platform/4.11/monitoring/nvidia-gpu-admin-dashboard.html
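Once the GPU metrics are in Prometheus, they could be read through the standard Prometheus HTTP query endpoint (`/api/v1/query`). The sketch below only builds the request URL and makes no network call. It assumes the dcgm-exporter metric `DCGM_FI_DEV_GPU_UTIL` carries `namespace` and `modelName` labels, which depends on the exporter version and configuration; the base URL and helper names are hypothetical.

```python
from urllib.parse import urlencode


def gpu_count_query(metric: str = "DCGM_FI_DEV_GPU_UTIL") -> str:
    """PromQL counting GPU series per namespace and GPU model.

    Assumption: the exporter attaches `namespace` and `modelName` labels.
    """
    return f"count by (namespace, modelName) ({metric})"


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical Prometheus address, for illustration only.
url = build_query_url("http://prometheus.example:9090/", gpu_count_query())
```

The resulting count per namespace and model is the kind of input a per-hour GPU rate would be multiplied against.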
Epic Design Document
Feature Brainstorming Document
Kruize Research Documents
- blocks: COST-3654 [Case 03473571] RFE: GPU Support in Cost Management (Closed)
- depends on: COST-5511 Release operator next with UBI 9, TSDB format, GPU/Namespace metrics (To Do)
- is blocked by: COST-5527 Add required GPU metrics to the operator reports for both COST and ROS (Backlog)
- is cloned by: COST-5060 Attribute the cost of running AI loads OCP on AWS (GPU) (Backlog)