Type: Epic
Resolution: Unresolved
Priority: Medium
Status: In Progress
Summary: Nvidia driver with Confidential Support in PodVM
Work category: Product / Portfolio Work
Parent: KATA-2620 - Protect confidentiality and integrity of GPU-supported AI workloads in use
Progress: 55% To Do, 18% In Progress, 27% Done
Nvidia GPU support in Confidential Containers requires a podVM image built with all the relevant bits in place.
Epic Goal
- Have a podVM image that can be used with a Confidential GPU instance, with the GPU hardware usable from within the workload
Why is this important?
- This is required for AI/ML workloads using CoCo (and peer-pods)
Scenarios
- As an OpenShift administrator, I want a ready-to-use podVM image that can be configured with a Confidential GPU instance
- As an OpenShift user, I want to deploy my GPU workload in a confidential instance and use the GPU from within my workload.
- placeholder for remote attestation support
Acceptance Criteria
(The Epic is complete when...)
- A RHEL 10 based podVM works with a Confidential GPU instance (e.g. Standard_NCC40ads_H100_v5 in Azure), and the GPU can be used from within the workload (running nvidia-smi shows it is in confidential mode)
- The creation of the podVM image is embedded in the pipeline.
- placeholder for using in BM also
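The nvidia-smi verification in the first criterion could be sketched as follows. This is a hedged example, not a prescribed test procedure: the `conf-compute` subcommand and its flags are assumed to be available only on CC-capable drivers (570.x or newer on H100-class GPUs), and exact output formats vary by driver release.

```shell
# Confirm the installed driver meets the 570.172.08 minimum
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Query confidential-compute state (assumes a CC-capable driver);
# inside a correctly configured confidential podVM this is expected
# to report CC mode as ON
nvidia-smi conf-compute -f
```

A pipeline check would parse the driver version string and the confidential-compute status line and fail the podVM image build or acceptance run if either does not match the expected values.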
Additional information
- Driver has to be from version 570.172.08 or above
- The base RHEL 10 podVM must satisfy the 6.9+ kernel requirement (as discussed)
- Is depended on by: KATA-4074 NVIDIA GTC demo using VLLM server (Closed)