-
Bug
-
Resolution: Not a Bug
-
Normal
-
None
-
Quality / Stability / Reliability
-
5
-
False
-
-
False
-
CLOSED
-
CNV Virtualization Sprint 238, CNV Virtualization Sprint 239
-
Important
-
None
Description of problem:
An empty permittedHostDevices section is needed in the HCO CR when the NVIDIA GPU Operator is used:
spec:
  permittedHostDevices: {}
This is required so that only the host devices/GPUs listed in this section are permitted to be used in the cluster.
When the NVIDIA drivers are configured on the nodes via the legacy approach:
The node Capacity and Allocatable sections are updated with the GPU device only after the "permittedHostDevices" section in the HCO CR is updated.
When the NVIDIA drivers are configured on the nodes via the NVIDIA GPU Operator:
The node Capacity and Allocatable sections are updated with the GPU device even when the "permittedHostDevices" section in the HCO CR has not been updated.
This makes the "permittedHostDevices" section in the HCO CR ineffective.
The "permittedHostDevices" checks are honored only if an empty spec.permittedHostDevices: {} section is added to the HCO CR.
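The mismatch described above can be checked with a small script. This is only a minimal sketch, not part of HCO or the GPU Operator. It assumes the node and HCO objects have already been loaded as JSON dicts (for example from `oc get ... -o json`), that GPU resources are advertised under the `nvidia.com/` prefix, and that permitted devices are listed under `pciHostDevices` / `mediatedDevices` with a `resourceName` field. The helper names and the sample resource name are hypothetical.

```python
# Sketch: list GPU resources that a node advertises but the HCO CR does not permit.
# Assumptions: inputs are plain dicts parsed from JSON; "nvidia.com/" is the
# resource prefix; the helper names below are illustrative only.

def advertised_gpu_resources(node, prefix="nvidia.com/"):
    """Return the sorted resource names under `prefix` in status.allocatable."""
    allocatable = node.get("status", {}).get("allocatable", {})
    return sorted(name for name in allocatable if name.startswith(prefix))


def permitted_resource_names(hco):
    """Collect resourceName values from spec.permittedHostDevices, if present."""
    permitted = hco.get("spec", {}).get("permittedHostDevices") or {}
    names = set()
    for key in ("pciHostDevices", "mediatedDevices"):
        for device in permitted.get(key) or []:
            name = device.get("resourceName")
            if name:
                names.add(name)
    return names


def unlisted_resources(node, hco):
    """Resources visible on the node that are absent from permittedHostDevices."""
    permitted = permitted_resource_names(hco)
    return [r for r in advertised_gpu_resources(node) if r not in permitted]


if __name__ == "__main__":
    # Hypothetical sample data mirroring the GPU Operator case in this report.
    sample_node = {"status": {"allocatable": {"cpu": "8", "nvidia.com/EXAMPLE_GPU": "1"}}}
    sample_hco = {"spec": {}}  # no permittedHostDevices section at all
    print(unlisted_resources(sample_node, sample_hco))
```

A non-empty result corresponds to the situation reported here: the node exposes the GPU resource although the HCO CR does not list it.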
Version-Release number of selected component (if applicable):
4.11.0
How reproducible:
Always
Steps to Reproduce:
1. Configure GPU passthrough (GPU-PT) or vGPU with the NVIDIA GPU Operator.
Actual results:
When the NVIDIA drivers are configured on the nodes via the NVIDIA GPU Operator:
The node Capacity and Allocatable sections are updated with the GPU device even when the "permittedHostDevices" section in the HCO CR has not been updated.
This makes the "permittedHostDevices" section in the HCO CR ineffective.
Expected results:
The HCO CR should contain an empty spec.permittedHostDevices: {} section by default, so that the "permittedHostDevices" checks are honored.
Additional info: