- Epic
- Resolution: Unresolved
- Major
- Deploy dra-example-driver on OpenShift
- In Progress
- 80% To Do, 20% In Progress, 0% Done
Goal
- Deploy the upstream kubernetes-sigs/dra-example-driver on OpenShift as a hardware-independent reference DRA driver for certifying DRA features
- Contribute OpenShift support (SCC, UBI image, deployment docs) upstream
Why is this important?
- Decouples DRA feature certification from third-party drivers and hardware availability
- The dra-example-driver is the upstream sig-node reference implementation. It simulates GPU devices via environment variables, so no real hardware is needed
- Enables repeatable DRA testing in CI without GPU nodes
- We can still test on third-party drivers (e.g., NVIDIA), but this gives a reliable baseline that is not tied to vendor release cycles
Scenarios
- Engineer deploys dra-example-driver via Helm on an OCP 4.21+ cluster and verifies kubelet plugin pods are running on all nodes
- Engineer creates ResourceClaims and DeviceClasses, then validates device allocation and scheduling using mock GPU devices
- CI job deploys the driver and runs DRA e2e tests on every PR and in nightly runs without requiring GPU hardware
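The ResourceClaim scenario above can be sketched as a small Python script that builds the manifests and prints them as JSON. This is a minimal sketch, not the project's test code. It assumes the `resource.k8s.io/v1` API is served by the cluster. The driver name `gpu.example.com`, the object names, and the container image are illustrative assumptions and should be checked against the deployed chart.

```python
import json

DRIVER = "gpu.example.com"  # assumed driver name; verify against the deployed chart


def device_class(name=DRIVER):
    """DeviceClass that selects every device published by the assumed driver."""
    return {
        "apiVersion": "resource.k8s.io/v1",
        "kind": "DeviceClass",
        "metadata": {"name": name},
        "spec": {
            "selectors": [
                {"cel": {"expression": f"device.driver == '{DRIVER}'"}}
            ]
        },
    }


def resource_claim(name, class_name=DRIVER):
    """ResourceClaim requesting exactly one device from the given class."""
    return {
        "apiVersion": "resource.k8s.io/v1",
        "kind": "ResourceClaim",
        "metadata": {"name": name},
        "spec": {
            "devices": {
                "requests": [
                    {"name": "gpu", "exactly": {"deviceClassName": class_name}}
                ]
            }
        },
    }


def pod(name, claim_name, image):
    """Pod whose single container consumes the named ResourceClaim."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "ctr",
                    "image": image,
                    # Print the environment so the injected mock-device
                    # variables can be inspected in the pod logs.
                    "command": ["sh", "-c", "env"],
                    "resources": {"claims": [{"name": "gpu"}]},
                }
            ],
            "resourceClaims": [
                {"name": "gpu", "resourceClaimName": claim_name}
            ],
        },
    }


if __name__ == "__main__":
    manifests = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            device_class(),
            resource_claim("single-gpu"),
            # Hypothetical pod name; the image is an assumption.
            pod("dra-test", "single-gpu",
                "registry.access.redhat.com/ubi9/ubi-minimal"),
        ],
    }
    print(json.dumps(manifests, indent=2))
```

Piping the output to `oc apply -n <namespace> -f -` would create the objects. If the pod reaches Running or Completed, the claim was allocated and the pod was scheduled against a mock device.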
Acceptance Criteria
- CI - MUST be running successfully with tests automated
- Release Technical Enablement - Provide necessary release enablement details and documents
- dra-example-driver deploys and runs on OCP 4.21+ (K8s 1.34) with DRA enabled by default
- Helm chart includes OpenShift SCC for the kubelet plugin DaemonSet
- Container image built on UBI base image
- Core DRA workflows validated: ResourceClaim creation, device allocation, pod scheduling
- Changes contributed upstream to kubernetes-sigs/dra-example-driver
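As a rough illustration of the SCC criterion above, the following Python sketch builds a SecurityContextConstraints manifest bound to the kubelet plugin's ServiceAccount. The SCC name, namespace, ServiceAccount name, and the exact set of permissions are assumptions made for illustration. The real chart should grant only what the plugin DaemonSet is found to need.

```python
import json


def kubelet_plugin_scc(namespace, service_account,
                       name="dra-example-driver-kubeletplugin"):
    """Build an SCC for the kubelet plugin ServiceAccount.

    Assumption: the plugin runs privileged and mounts hostPath volumes
    (kubelet plugin and CDI directories). Tighten once the real needs are known.
    """
    return {
        "apiVersion": "security.openshift.io/v1",
        "kind": "SecurityContextConstraints",
        "metadata": {"name": name},  # hypothetical SCC name
        "allowPrivilegedContainer": True,
        "allowHostDirVolumePlugin": True,
        "allowHostNetwork": False,
        "allowHostPID": False,
        "allowHostIPC": False,
        "allowHostPorts": False,
        "readOnlyRootFilesystem": False,
        "runAsUser": {"type": "RunAsAny"},
        "seLinuxContext": {"type": "RunAsAny"},
        "fsGroup": {"type": "RunAsAny"},
        "supplementalGroups": {"type": "RunAsAny"},
        "volumes": ["configMap", "emptyDir", "hostPath", "projected", "secret"],
        # Bind the SCC to the DaemonSet's ServiceAccount only.
        "users": [f"system:serviceaccount:{namespace}:{service_account}"],
    }


if __name__ == "__main__":
    # Namespace and ServiceAccount names are placeholders.
    scc = kubelet_plugin_scc("dra-example-driver",
                             "dra-example-driver-service-account")
    print(json.dumps(scc, indent=2))
```

In the Helm chart this object would normally be a template gated on a values flag, so that non-OpenShift clusters do not try to create a resource type they do not serve.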
Dependencies (internal and external)
- DRA enabled by default in OCP 4.21 (OCPNODE-3895 — already merged)
- kubernetes-sigs/dra-example-driver upstream repo: https://github.com/kubernetes-sigs/dra-example-driver
- cert-manager (optional, only if webhook validation is needed)
Previous Work
- OCPNODE-4079: Implement partitionable devices support in dra-example-driver
- OCPNODE-4043: e2e tests to validate downstream DRA APIs with NVIDIA GPU
- https://github.com/openshift/api/pull/2498 — DRA feature gate enabled by default
Open questions
- Should we create a new OCPNODE component for dra-example-driver or reuse an existing one?
- Webhook: use cert-manager or OpenShift service-serving certificates?
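For the service-serving certificate option in the webhook question above, the wiring amounts to two annotations: one on the webhook Service, so the service CA operator issues a serving certificate into a Secret, and one on the webhook configuration, so the CA bundle is injected. The Python sketch below adds both to existing manifests. The manifest contents and the Secret name in the usage example are hypothetical.

```python
import copy

# Annotations handled by the OpenShift service CA operator.
SERVING_CERT_ANNOTATION = "service.beta.openshift.io/serving-cert-secret-name"
INJECT_CABUNDLE_ANNOTATION = "service.beta.openshift.io/inject-cabundle"


def annotate(manifest, key, value):
    """Return a copy of the manifest with one annotation added."""
    out = copy.deepcopy(manifest)
    out.setdefault("metadata", {}).setdefault("annotations", {})[key] = value
    return out


def use_service_ca(service, webhook_config, secret_name):
    """Annotate a Service and a webhook configuration for the service CA."""
    return (
        annotate(service, SERVING_CERT_ANNOTATION, secret_name),
        annotate(webhook_config, INJECT_CABUNDLE_ANNOTATION, "true"),
    )


if __name__ == "__main__":
    # Hypothetical, heavily trimmed manifests for illustration only.
    svc = {"kind": "Service", "metadata": {"name": "dra-webhook"}}
    vwc = {"kind": "ValidatingWebhookConfiguration",
           "metadata": {"name": "dra-webhook"}}
    new_svc, new_vwc = use_service_ca(svc, vwc, "dra-webhook-tls")
    print(new_svc["metadata"]["annotations"])
    print(new_vwc["metadata"]["annotations"])
```

This route avoids an extra operator dependency on OpenShift, while cert-manager keeps the chart portable to other distributions. That trade-off is what the open question has to settle.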
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement - <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to PR on kubernetes-sigs/dra-example-driver>
- DEV - Upstream documentation merged: <link>
- QE - Test plans in Polarion: <link>
- QE - Automated tests merged: <link>
- DOC - Downstream documentation merged: <link>