OpenShift Virtualization / CNV-28656

Mechanism to reserve quota for migration

    • Type: Feature Request
    • Resolution: Unresolved
    • Priority: Major
    • Component: CNV Virtualization

      Migrated from https://bugzilla.redhat.com/show_bug.cgi?id=2109368

       
      Description of problem:

      Defined a ResourceQuota for the namespace as below:

      ~~~
      apiVersion: v1
      kind: ResourceQuota
      metadata:
        name: cnv-quota
      spec:
        hard:
          requests.cpu: "400m"
      ~~~

      Started a VM with a CPU request of 300m:

      ~~~
      Resource Quotas
        Name:          cnv-quota
        Resource       Used  Hard
        --------       ----  ----
        requests.cpu   300m  400m
      ~~~
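
      For reference, the 300m CPU request is set directly on the VM; a minimal sketch of such a VM, assuming a containerDisk-based guest (the disk image is illustrative):

      ~~~
      apiVersion: kubevirt.io/v1
      kind: VirtualMachine
      metadata:
        name: rhel8-wild-moth                # VM name taken from the event below
      spec:
        running: true
        template:
          spec:
            domain:
              resources:
                requests:
                  cpu: 300m                  # charged against cnv-quota when the virt-launcher pod is created
              devices:
                disks:
                  - name: rootdisk
                    disk:
                      bus: virtio
            volumes:
              - name: rootdisk
                containerDisk:
                  image: quay.io/kubevirt/cirros-container-disk-demo   # illustrative image
      ~~~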

      Tried to live migrate the VM; the migration failed with the error below because the namespace doesn't have enough quota left to create the destination virt-launcher pod:

      ~~~
      1s Warning FailedCreate virtualmachineinstance/rhel8-wild-moth (combined from similar events): Error creating pod: pods "virt-launcher-rhel8-wild-moth-w65h2" is forbidden: exceeded quota: cnv-quota, requested: requests.cpu=300m, used: requests.cpu=300m, limited: requests.cpu=400m
      ~~~
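
      The quota consumption and the rejection event can be inspected with standard commands; a sketch, assuming the commands are run in the VM's namespace (the namespace name is not given in the report):

      ~~~
      oc describe resourcequota cnv-quota
      oc get events --sort-by=.lastTimestamp | grep FailedCreate
      ~~~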

      Version-Release number of selected component (if applicable):

      OpenShift Virtualization 4.10.2

      How reproducible:

      100%

      Steps to Reproduce:

      Please refer to the description above.

      Actual results:

      VM live migration fails with a resource quota error while creating the destination pod.

      Expected results:

      A user cannot raise the limit just to facilitate migration, and calculating the limits while also accounting for live migration overhead is not easy. If the admin instead reserves some resources for migration, a normal user may accidentally consume this reserve for general workloads, and the admin has no control over this. There should be a mechanism to reserve quota for live migration.
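
      For illustration, working around this today means sizing the quota for the worst case: one in-flight migration of the 300m VM needs 300m for the source virt-launcher pod plus 300m for the destination pod, plus the migration overhead, so the quota would have to look roughly like the sketch below (values illustrative):

      ~~~
      apiVersion: v1
      kind: ResourceQuota
      metadata:
        name: cnv-quota
      spec:
        hard:
          # 2 x 300m for the source and destination virt-launcher pods during migration,
          # plus some headroom for migration overhead; illustrative values
          requests.cpu: "700m"
      ~~~

      This kind of permanent over-provisioning is exactly what this request tries to avoid, because nothing stops other workloads from consuming the reserve.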
       
      Jan Chaloupka  2022-08-11 09:36:05 UTC
      > A user cannot raise the limit just to facilitate migration, and calculating the limits while also accounting for live migration overhead is not easy. If the admin instead reserves some resources for migration, a normal user may accidentally consume this reserve for general workloads, and the admin has no control over this.

      Kubernetes/OCP has no concept of live migration. If the configured quota is not sufficient for the migration, it needs to be temporarily increased to accommodate the increased demand. Given that the destination pod cannot be created at all, the kube-scheduler cannot preempt the general workload to reduce resource consumption, which makes priority classes and preemption unusable here; nor can the scheduler make any other decision, since the scheduling phase only runs after a pod has been created.
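
      Such a temporary increase would look roughly like this; a sketch, assuming the admin adjusts the quota around the migration window:

      ~~~
      # raise the quota before triggering the live migration ...
      oc patch resourcequota cnv-quota --type=merge -p '{"spec":{"hard":{"requests.cpu":"700m"}}}'
      # ... and restore it once the migration has completed
      oc patch resourcequota cnv-quota --type=merge -p '{"spec":{"hard":{"requests.cpu":"400m"}}}'
      ~~~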

      What is being requested here is a mechanism that allows creating a new pod in the same namespace without changing the hard resource quota, even though this new pod, together with the already existing pods, exceeds the hard quota. In other words, a specific pod would be allowed to temporarily "escape" the resource quota constraints so it can replace the current pod/VM once the live migration is complete. This is currently impossible and goes against the design principles of resource quotas.

      The closest solution is to use LimitRanges [1] together with label selectors to further restrict which pods can request the available resources [2]. Unfortunately, the ability to use label selectors (or a different mechanism for selecting pods) with LimitRanges has not been implemented yet.

      [1] https://kubernetes.io/docs/concepts/policy/limit-range/
      [2] https://github.com/kubernetes/kubernetes/issues/56799

      ResourceQuota/LimitRanges are enforced during admission handling on the apiserver side.
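
      For completeness, a LimitRange today can only apply per-container/per-pod constraints and defaults across the whole namespace; it has no way to target specific pods, which is what [2] asks for. A minimal sketch of what is possible now (values illustrative):

      ~~~
      apiVersion: v1
      kind: LimitRange
      metadata:
        name: cnv-limits
      spec:
        limits:
          - type: Container
            max:
              cpu: 300m          # maximum CPU limit any single container may set
            defaultRequest:
              cpu: 100m          # request applied when a container specifies none
            default:
              cpu: 100m          # limit applied when a container specifies none
      ~~~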
       

      Assignee: Martin Tessun (rhn-support-mtessun)
      Reporter: Michal Fojtik (mfojtik@redhat.com)
      Contributors: Antonio Cardace, Barak Mordehai, Stuart Gott, Vladik Romanovsky