Feature Request
Resolution: Unresolved
Consider this scenario:
- 3 worker nodes with 30G of memory each
- 6 VMs, with 8G RAM each, 2 running on each of the 3 workers.
As below:
NAME                             AGE   PHASE      IP            NODENAME                        READY
centos-stream8-empty-whitefish   11m   Running    10.128.2.19   worker-green.shift.toca.local   True
centos7-loud-bear                13m   Running    10.128.2.18   worker-green.shift.toca.local   True
fedora-advanced-goose            12m   Running    10.131.0.23   worker-blue.shift.toca.local    True
fedora-civil-spoonbill           11m   Running    10.131.0.24   worker-blue.shift.toca.local    True
fedora-frozen-stingray           11m   Running    10.129.2.42   worker-red.shift.toca.local     True
centos-stream9-useless-raven     13m   Running    10.129.2.41   worker-red.shift.toca.local     True
All the VMs are as follows:
spec:
  domain:
    resources:
      requests:
        memory: 8Gi
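(For context: if these VMs are defined through VirtualMachine objects, that fragment lives under spec.template.spec. The sketch below is abbreviated, reuses one of the names from the listing above, and omits disks, networks and the rest of the domain definition:)

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: centos7-loud-bear
spec:
  running: true
  template:
    spec:
      domain:
        resources:
          requests:
            memory: 8Gi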
So each of the 3 worker nodes has 16G requested by VMs, plus some more for the other cluster pods, looking like this (one memory line per node, taken from each node's allocated resources):
Resource   Requests        Limits
--------   --------        ------
memory     19858Mi (64%)   0 (0%)
memory     18024Mi (58%)   0 (0%)
memory     20166Mi (64%)   0 (0%)
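(For reference, per-node figures like these can be read from the Allocated resources section of oc describe node; the loop below is only an illustration using this cluster's node names:)

for node in worker-green worker-blue worker-red; do
  oc describe node ${node}.shift.toca.local | grep -A 10 'Allocated resources' | grep -w memory
done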
Each of these worker nodes has enough capacity to run another 8G VM, but not a 16G one.
The admin now wants to start a 16G VM. Combined, the cluster has capacity, but not on a single worker.
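The new VM is defined in the same way, only with a larger request (hypothetical fragment, mirroring the spec above):

spec:
  domain:
    resources:
      requests:
        memory: 16Gi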
The admin requests the 16G VM to start. The result is:
NAME                             AGE   PHASE      IP            NODENAME                        READY
centos-stream8-empty-whitefish   11m   Running    10.128.2.19   worker-green.shift.toca.local   True
centos7-loud-bear                13m   Running    10.128.2.18   worker-green.shift.toca.local   True
fedora-advanced-goose            12m   Running    10.131.0.23   worker-blue.shift.toca.local    True
fedora-civil-spoonbill           11m   Running    10.131.0.24   worker-blue.shift.toca.local    True
fedora-frozen-stingray           11m   Running    10.129.2.42   worker-red.shift.toca.local     True
centos-stream9-useless-raven     13m   Running    10.129.2.41   worker-red.shift.toca.local     True
rhel7-red-llama                  14m   Scheduling                                               False
Because:
0/7 nodes are available: 1 node(s) were unschedulable, 3 Insufficient memory, 3 node(s) had taint, that the pod didn't tolerate.
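(That message is reported as a FailedScheduling event on the VMI's virt-launcher pod, so it can be read from the Events section of, for example:)

- oc get pods | grep virt-launcher-rhel7-red-llama
- oc describe pod virt-launcher-rhel7-red-llama-<suffix>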
However, the admin can work around this: the solution is a simple live migration of any VM to another node, so that one node's usage drops to 1 VM (8G), another one goes up to 3 VMs (24G), and the 16G VM can start.
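For reference, the manual step itself is a single command with virtctl, or equivalently the creation of a VirtualMachineInstanceMigration object (the migration object name below is made up):

- virtctl migrate centos7-loud-bear

or declaratively:

apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: migrate-centos7-loud-bear
spec:
  vmiName: centos7-loud-bear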
For example, live-migrate one VM manually, and the state quickly moves through these steps without further admin intervention:
- oc get vmi
NAME                             AGE   PHASE      IP            NODENAME                        READY
centos-stream8-empty-whitefish   29m   Running    10.128.2.19   worker-green.shift.toca.local   True
centos-stream9-useless-raven     30m   Running    10.129.2.41   worker-red.shift.toca.local     True
fedora-frozen-stingray           28m   Running    10.129.2.42   worker-red.shift.toca.local     True
centos7-loud-bear                30m   Running    10.131.0.25   worker-blue.shift.toca.local    True   <--- migrated from Green
fedora-advanced-goose            29m   Running    10.131.0.23   worker-blue.shift.toca.local    True
fedora-civil-spoonbill           28m   Running    10.131.0.24   worker-blue.shift.toca.local    True
rhel7-red-llama                  26s   Scheduled                worker-green.shift.toca.local   False
Just a few moments later they are all running:
- oc get vmi
NAME                             AGE   PHASE      IP            NODENAME                        READY
centos-stream8-empty-whitefish   29m   Running    10.128.2.19   worker-green.shift.toca.local   True
centos-stream9-useless-raven     30m   Running    10.129.2.41   worker-red.shift.toca.local     True
centos7-loud-bear                30m   Running    10.131.0.25   worker-blue.shift.toca.local    True
fedora-advanced-goose            29m   Running    10.131.0.23   worker-blue.shift.toca.local    True
fedora-civil-spoonbill           28m   Running    10.131.0.24   worker-blue.shift.toca.local    True
fedora-frozen-stingray           28m   Running    10.129.2.42   worker-red.shift.toca.local     True
rhel7-red-llama                  34s   Running    10.128.2.22   worker-green.shift.toca.local   True
Request: the admin would like an option for this to happen automatically, without manual intervention, to solve this problem.
I did some research on PriorityClasses, but it seems they are wiped from the VM spec. And even if they worked, it would create an ever-increasing priority problem, so I cannot find a solution for this with the currently available mechanisms.
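For reference, the kind of object researched was a preempting PriorityClass along these lines (name and value are made up); even if the reference to it survived on the VM, every new "more important" VM would need a higher value than the last one:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: important-vms
value: 1000000
globalDefault: false
preemptionPolicy: PreemptLowerPriority
description: Hypothetical class meant to let a new VM displace lower-priority ones.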
The customer is requesting this for VMs, not Pods, even though a generic solution could work for both.
Finally, the example above is for (re)starting a VM (i.e. a highly available one), but the same logic is requested for other tasks such as draining nodes.
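(For the drain case there is already the KubeVirt eviction strategy that live-migrates VMs off a draining node instead of shutting them down; what is missing is the same kind of automatic rebalancing when the remaining nodes are too full or too fragmented. The fragment below is shown only for context and assumes it is set per VMI:)

spec:
  evictionStrategy: LiveMigrate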
is related to: CNV-44480 Support for IBM Turbonomic