Bug
Resolution: Unresolved
Description of problem:
A VM that uses a vGPU device briefly shows ErrorUnschedulable and then schedules successfully. Scheduling takes 3-5 seconds after the error is shown, and the VM then starts on the node without any modifications being made during that time.
Version-Release number of selected component (if applicable):
4.14
How reproducible:
100%
Steps to Reproduce:
1. Update HCO with the relevant vGPU config
2. Create a VM with the configured vGPU device in its spec
3. Start the VM
Actual results:
VM shows ErrorUnschedulable then continues to start without issues
Expected results:
VM is scheduled without any errors
Additional info:
VM status change:
virt-gpu-vgpu-test-rhel-vm-with-vgpu rhel-vgpu-gpus-spec-vm-1696928322-915947 2s Starting False
virt-gpu-vgpu-test-rhel-vm-with-vgpu rhel-vgpu-gpus-spec-vm-1696928322-915947 2s Starting False
virt-gpu-vgpu-test-rhel-vm-with-vgpu rhel-vgpu-gpus-spec-vm-1696928322-915947 2s ErrorUnschedulable False
virt-gpu-vgpu-test-rhel-vm-with-vgpu rhel-vgpu-gpus-spec-vm-1696928322-915947 9s Starting False
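The status transitions above can be checked mechanically. The following is a minimal sketch, not part of any product tooling: it assumes the five whitespace-separated columns are namespace, name, age, status, and ready, and the helper names `parse_status_line` and `transient_unschedulable` are hypothetical.

```python
from typing import List, NamedTuple


class VmStatus(NamedTuple):
    namespace: str
    name: str
    age: str
    status: str
    ready: bool


def parse_status_line(line: str) -> VmStatus:
    # Assumed column order: namespace, name, age, status, ready.
    namespace, name, age, status, ready = line.split()
    return VmStatus(namespace, name, age, status, ready == "True")


def transient_unschedulable(rows: List[VmStatus]) -> bool:
    """True if ErrorUnschedulable appears and a later row reports another status."""
    statuses = [row.status for row in rows]
    if "ErrorUnschedulable" not in statuses:
        return False
    first = statuses.index("ErrorUnschedulable")
    return any(s != "ErrorUnschedulable" for s in statuses[first + 1:])


# The four lines are copied from the status output in this report.
WATCH_OUTPUT = """\
virt-gpu-vgpu-test-rhel-vm-with-vgpu rhel-vgpu-gpus-spec-vm-1696928322-915947 2s Starting False
virt-gpu-vgpu-test-rhel-vm-with-vgpu rhel-vgpu-gpus-spec-vm-1696928322-915947 2s Starting False
virt-gpu-vgpu-test-rhel-vm-with-vgpu rhel-vgpu-gpus-spec-vm-1696928322-915947 2s ErrorUnschedulable False
virt-gpu-vgpu-test-rhel-vm-with-vgpu rhel-vgpu-gpus-spec-vm-1696928322-915947 9s Starting False
"""

rows = [parse_status_line(line) for line in WATCH_OUTPUT.splitlines() if line.strip()]
print(transient_unschedulable(rows))
```

A check like this could be used in a test to distinguish a VM that stays unschedulable from one that only reports the error transiently, which is the behavior described here.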
virt-launcher pod event:
50s Warning FailedScheduling pod/virt-launcher-rhel-vgpu-gpus-spec-vm-1696928322-915947-h5bcr 0/3 nodes are available: 1 Insufficient nvidia.com/GRID_A2_2Q, 2 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 1 No preemption victims found for incoming pod, 2 Preemption is not helpful for scheduling..
42s Normal Scheduled pod/virt-launcher-rhel-vgpu-gpus-spec-vm-1696928322-915947-h5bcr Successfully assigned virt-gpu-vgpu-test-rhel-vm-with-vgpu/virt-launcher-rhel-vgpu-gpus-spec-vm-1696928322-915947-h5bcr to cnv-qe-infra-03.cnvqe3.lab.eng.rdu2.redhat.com
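As a rough cross-check of the delay, the gap between the two pod events can be computed from their relative ages. This is a sketch under assumptions: the first column is taken to be the event age in whole seconds ("seconds ago"), and `age_seconds` and `scheduling_delay` are hypothetical helper names. Event ages are coarse, so the result is only approximate.

```python
import re
from typing import List, Optional, Tuple


def age_seconds(age: str) -> int:
    # Only plain second values such as "50s" are handled in this sketch.
    match = re.fullmatch(r"(\d+)s", age)
    if match is None:
        raise ValueError(f"unsupported age format: {age!r}")
    return int(match.group(1))


def scheduling_delay(events: List[Tuple[str, str]]) -> Optional[int]:
    """Seconds between the earliest FailedScheduling and the earliest Scheduled event.

    Ages count backwards from now, so a larger age means an earlier event.
    """
    failed = [age_seconds(age) for age, reason in events if reason == "FailedScheduling"]
    scheduled = [age_seconds(age) for age, reason in events if reason == "Scheduled"]
    if not failed or not scheduled:
        return None
    return max(failed) - max(scheduled)


# (age, reason) pairs taken from the two virt-launcher pod events in this report.
events = [("50s", "FailedScheduling"), ("42s", "Scheduled")]
delay = scheduling_delay(events)
print(delay)
```

For the events listed here the computed gap is 8 seconds, somewhat longer than the 3-5 seconds observed in the VM status, which may simply reflect the rounding of event ages.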