Type: Feature Request
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: CNV Virt-Cluster, CNV Virtualization
Labels: None
Blocked: False
Blocked Reason: None
Ready: False
Component Fix Version(s): None

1. Problem Statement

Citrix Virtual Apps & Desktops (VDI) requires a consistent and predictable method for configuring GPU resources when creating VMs from a Master VM template.
Today, GPU configuration in OpenShift Virtualization is vendor-specific (NVIDIA vs. AMD vs. Intel), resource-specific (passthrough vs. mediated/vGPU), and spread across multiple YAML fields, which makes it hard for Citrix to support all hardware types.

Citrix needs a single, vendor-neutral GPU abstraction, similar to Kubernetes StorageClass, so their product can provision GPUs without learning each vendor’s details.

2. Why This Is Needed

Citrix must support on-prem, cloud, multi-vendor, and multi-GPU-type deployments.

The current OpenShift Virtualization GPU configuration requires them to parse and override:

spec.domain.devices.gpus

pciHostDevices

mediatedDevices

resourceName (e.g., nvidia.com/mig-2g.10gb, amd.com/gpu)

NodeSelectors (vendor-specific)

This is not maintainable across customer environments. Enterprises expect GPU provisioning to work like storage classes: simple and vendor-agnostic.

3. Proposed Solution (High-Level)

Introduce a GPUClass abstraction that allows users (or Citrix) to specify GPU requirements using a single simple field:

gpuClass: "gpu.vdi.medium"

4. Functional Requirements

A new resource type, GPUClass, must allow defining GPU profiles decoupled from hardware/vendor specifics.
OpenShift Virtualization must map GPUClass → actual hardware configuration (passthrough, vGPU/mdev, MIG slice, resourceName, selectors).
The Citrix Master VM GPU config must be preserved, or overridden via the Machine Profile, using GPUClass.
The GPUClass abstraction must work across NVIDIA, AMD, and Intel hardware.
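
To make the mapping requirement concrete, here is a minimal Python sketch of how a GPUClass might resolve to a vendor-specific configuration. Everything in it is an assumption for illustration, not a proposed API: the `Backend` and `GPUClass` types, the `resolve` and `vm_patch` helpers, the preference-ordered backend list, and the `example.com/gpu-vendor` node label are all hypothetical. Only the `spec.domain.devices.gpus` path and the example resource names come from this RFE.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Backend:
    """One concrete way to satisfy a class (hypothetical type)."""
    vendor: str          # e.g. "nvidia", "amd", "intel"
    mode: str            # e.g. "passthrough", "vgpu", "mig"
    resource_name: str   # extended resource name advertised by the node
    node_selector: dict = field(default_factory=dict)


@dataclass
class GPUClass:
    """Vendor-neutral profile with backends in preference order (hypothetical)."""
    name: str
    backends: list


def resolve(gpu_class: GPUClass, cluster_resources: set) -> Optional[Backend]:
    """Return the first backend whose resource exists in the cluster, else None."""
    for backend in gpu_class.backends:
        if backend.resource_name in cluster_resources:
            return backend
    return None


def vm_patch(backend: Backend) -> dict:
    """Build the VM template fragment that the chosen backend implies."""
    return {
        "spec": {"template": {"spec": {
            "domain": {"devices": {"gpus": [
                {"name": "gpu0", "deviceName": backend.resource_name},
            ]}},
            "nodeSelector": dict(backend.node_selector),
        }}}
    }


# Example class: prefer an NVIDIA MIG slice, otherwise a whole AMD GPU.
medium = GPUClass(
    name="gpu.vdi.medium",
    backends=[
        Backend("nvidia", "mig", "nvidia.com/mig-2g.10gb",
                {"example.com/gpu-vendor": "nvidia"}),
        Backend("amd", "passthrough", "amd.com/gpu",
                {"example.com/gpu-vendor": "amd"}),
    ],
)

chosen = resolve(medium, {"amd.com/gpu"})
patch = vm_patch(chosen) if chosen else None
```

With this shape, the consumer only ever writes the class name; the vendor, mode, resource name, and selectors stay inside the class definition maintained by the cluster administrator.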

5. Acceptance Criteria

AC1: A user can define one or more GPUClass resources.

AC2: A VM using gpuClass: X is scheduled and configured correctly across hardware vendors.

AC3: Citrix Machine Profile can override master VM GPUClass cleanly.

AC4: System correctly handles fallback when hardware does not match GPUClass.

AC5: Documentation includes examples for NVIDIA, AMD, Intel, and cloud GPUs.
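
AC3 and AC4 could be exercised with checks along the following lines. This is a sketch under stated assumptions: the RFE does not define override or fallback semantics, so the rule that a Machine Profile value wins over the master VM, the per-class fallback chain, the helper names, and the `gpu.vdi.large` and `gpu.vdi.small` class names are all hypothetical.

```python
from typing import Optional


def effective_gpu_class(master_vm: dict, machine_profile: Optional[dict]) -> Optional[str]:
    """Assumed AC3 rule: a Machine Profile value overrides the master VM's."""
    if machine_profile and machine_profile.get("gpuClass"):
        return machine_profile["gpuClass"]
    return master_vm.get("gpuClass")


def choose_class(requested: str, available: set, fallbacks: dict) -> str:
    """Assumed AC4 rule: walk the fallback chain until a class is available."""
    seen = set()
    current = requested
    while current is not None and current not in seen:
        if current in available:
            return current
        seen.add(current)                 # guards against cyclic fallback chains
        current = fallbacks.get(current)  # None ends the chain
    raise LookupError(f"no available GPUClass for {requested!r}")


master = {"gpuClass": "gpu.vdi.medium"}
profile = {"gpuClass": "gpu.vdi.large"}

# The Machine Profile overrides the master VM; without one, the master value is kept.
overridden = effective_gpu_class(master, profile)
inherited = effective_gpu_class(master, None)

# "gpu.vdi.large" is not available, so the chain falls back to "gpu.vdi.medium".
selected = choose_class(
    overridden,
    available={"gpu.vdi.medium", "gpu.vdi.small"},
    fallbacks={"gpu.vdi.large": "gpu.vdi.medium"},
)
```

Raising an error when the chain is exhausted is one possible policy; leaving the VM unscheduled with a clear status condition would be another.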

6. Non-goals

This RFE does NOT require redesigning device plugins.

This RFE does NOT replace vendor drivers or MIG tooling.

This RFE does NOT define performance guarantees.

This RFE does NOT introduce a new scheduler, only an abstraction layer.

Assignee: Kedar Bidarkar

Reporter: Sudhakar Molli

QA Contact: Kedar Bidarkar

Votes: 0

Watchers: 4

Created: 2025/11/24 6:08 PM

Updated: 2025/11/24 7:09 PM
