-
Epic
-
Resolution: Done
-
Comparing GPU vs vGPU Performance in RHEL/OpenShift
PSAP Sprint 221, PSAP Sprint 222, PSAP Sprint 223, PSAP Sprint 224, PSAP Sprint 225
-
0% To Do, 0% In Progress, 100% Done
Epic Goal
- Use ML benchmarks to assess the performance of vGPUs in VMs on RHEL and in OpenShift Virtualization, compared with direct GPU use.
Why is this important?
- How vGPU performance compares with direct GPU use is currently unknown and undocumented, and the results are potentially valuable to customers.
Scenarios
- Running the MLPerf SSD and SSDv2 training benchmarks and the nvidiadl BERT benchmark on bare-metal RHEL 8 with a single GPU
- Starting a single VM on RHEL 8 with a single vGPU (full capacity) and running the same benchmarks
- Running multiple workloads on bare metal with a single GPU, using the same benchmarks
- Starting multiple VMs on RHEL 8 (as many as the workloads in the previous scenario), each with a vGPU, and running the same benchmarks
- Adding Single Node OpenShift (SNO) to bare metal and running the benchmarks in OpenShift with a single GPU
- Adding SNO to the single-VM and multi-VM environments and running the same benchmarks in OpenShift
- Using OpenShift Virtualization to test VM/vGPU performance within OpenShift, with SNO on bare metal
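One possible way to compare the scenarios above is to express each vGPU result as a fraction of the matching direct-GPU result. The sketch below is illustrative only: the `Result` type, the function names, and the choice of throughput as the metric are assumptions, not part of this epic, and no measured values are implied.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Result:
    """One benchmark run (hypothetical structure, not from the epic)."""
    scenario: str      # free-form label, e.g. "baremetal" or "vm-vgpu"
    throughput: float  # assumed metric, e.g. samples per second


def aggregate_throughput(results: Iterable[Result]) -> float:
    """Sum throughput across workloads that share one physical GPU."""
    return sum(r.throughput for r in results)


def relative_performance(baseline: float, candidate: float) -> float:
    """Return candidate throughput as a fraction of baseline throughput."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return candidate / baseline


def overhead_percent(baseline: float, candidate: float) -> float:
    """Return the throughput lost relative to the baseline, in percent."""
    return (1.0 - relative_performance(baseline, candidate)) * 100.0
```

For the multi-workload and multi-VM scenarios, `aggregate_throughput` could be applied to each group of runs before calling `overhead_percent`, so that each pair of scenarios is compared on its total throughput per physical GPU.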
Acceptance Criteria
- A performance report document covering all scenarios
- A potential presentation of the results
- Automation code (if needed) cleaned up, documented, and checked into the repo
Dependencies (internal and external)
- ...
Previous Work (Optional):
- …
Open questions:
- …