OpenShift Virtualization / CNV-30333

[2217910] [cnv-4.13] kubevirt should allow runtimeclass to be configured in a pod

    • CNV I/U Operators Sprint 238
    • High
    • None

      +++ This bug was initially created as a clone of Bug #2203291 +++

      Description of problem:

      For DPDK type applications, the vCPU should not be interrupted or throttled
      by the cgroup cpu quota limitations. By default k8s sets cpu quota limitations to positive
      integer values, which throttles the vCPU.

      To disable cpu quota for pods it is necessary to annotate the pod with

      cpu-quota.crio.io: "disable"

      and set runtimeClassName to the performance profile's runtimeClassName (as described
      in the "Disabling CPU CFS quota" section of https://docs.openshift.com/container-platform/4.12/scalability_and_performance/cnf-low-latency-tuning.html).

      However, KubeVirt does not support setting runtimeClassName.
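
      For reference, the two settings on a plain pod (not a VM) can be sketched as below. This is only an illustration, not KubeVirt code: the helper name, pod name, image and runtime class value are hypothetical, and the equal CPU requests and limits assume the pod is meant to run with Guaranteed QoS as in the low latency tuning document.

```python
import json


def build_pod_manifest(name, runtime_class_name, image, cpus=2, memory="1Gi"):
    """Build a pod manifest (as a dict) that asks CRI-O to disable the CPU CFS quota.

    All argument values are placeholders chosen by the caller; nothing here is
    taken from a real cluster.
    """
    resources = {"cpu": str(cpus), "memory": memory}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Annotation quoted in the bug description.
            "annotations": {"cpu-quota.crio.io": "disable"},
        },
        "spec": {
            # Must name the runtime class of the performance profile.
            "runtimeClassName": runtime_class_name,
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Equal requests and limits (assumption: Guaranteed QoS).
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = build_pod_manifest(
        "dpdk-app", "performance-example", "example.com/dpdk-app:latest"
    )
    print(json.dumps(manifest, indent=2))
```

      The printed JSON could be fed to "oc apply -f -". The point of this bug is that the virt-launcher pod created by KubeVirt cannot be given the runtimeClassName field in this way.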

      In a discussion with Vladik, it appears an acceptable way to allow pods to
      set runtimeClassName would be for a scheduling policy to be created for VMs,
      similarly to migration policies.

      Version-Release number of selected component (if applicable):

      4.12

      How reproducible:

      Always

      Steps to Reproduce:
      1. Start a KubeVirt VM with the cpu-quota.crio.io: "disable" annotation and runtimeClassName set (per the CNF low latency tuning document above).

      Actual results:

      The cpu.cfs_quota_us value in the pod cgroup is not -1.

      Expected results:

      The cpu.cfs_quota_us value in the pod cgroup is -1.
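
      The check above could be scripted roughly as follows. This is a sketch that assumes a cgroup v1 hierarchy, where the CFS quota is exposed in a file named cpu.cfs_quota_us; the function names are hypothetical and the pod's cgroup directory has to be supplied by the caller.

```python
from pathlib import Path


def quota_is_disabled(raw_value: str) -> bool:
    """Return True if the text read from cpu.cfs_quota_us means "no quota" (-1)."""
    return int(raw_value.strip()) == -1


def pod_quota_is_disabled(cgroup_dir: str) -> bool:
    """Read cpu.cfs_quota_us from the given cgroup v1 directory and check it.

    cgroup_dir is the pod's CPU cgroup directory; locating it is left to the
    caller because the path depends on the node configuration.
    """
    return quota_is_disabled(Path(cgroup_dir, "cpu.cfs_quota_us").read_text())
```

      On a node using cgroup v2 the quota lives in cpu.max instead, so this sketch would not apply as written.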

      Additional info:

      — Additional comment from Marcelo Tosatti on 2023-05-11 21:36:01 CEST —

      Stu,

      I've discussed this with Vladik and he has an idea for a solution.

      The impact is that telco type workloads which must provide
      zero packet loss fail to meet that requirement (including iDirect,
      a customer currently doing a PoC).

      — Additional comment from Marcelo Tosatti on 2023-05-12 14:25:02 CEST —

      — Additional comment from sgott on 2023-05-17 14:43:41 CEST —

      Re-assigning this to the networking component because it's something they're planning to address. Please feel free to re-assign back to virt if that's more appropriate.

      This appears to be a duplicate of this epic:

      https://issues.redhat.com/browse/CNV-24676

      — Additional comment from Marcelo Tosatti on 2023-05-17 15:07:57 CEST —

      (In reply to sgott from comment #3)
      > Re-assigning this to the networking component because it's something they're
      > planning to address. Please feel free to re-assign back to virt if that's
      > more appropriate.
      >
      > This appears to be a duplicate of this epic:
      >
      > https://issues.redhat.com/browse/CNV-24676

      Discussed this with Petr and the issue he is planning to address
      is a separate one (about the problem of the HCO workaround being
      unsupported).

      Reassigning to virt.

      — Additional comment from Petr Horáček on 2023-05-17 15:39:34 CEST —

      Sorry for the misunderstanding. What I meant is that we plan to expose the global defaultRuntimeClass attribute of KubeVirt on HCO first. This short-term solution would allow us to productize DPDK through a supported API.

      After that, with a lower priority, we would still like to implement this per-VM API (tracked on Jira https://issues.redhat.com/browse/CNV-24676), allowing customers to mix their performance and regular workloads.

      — Additional comment from Kedar Bidarkar on 2023-05-31 14:14:05 CEST —

      As per Petr in comment #5, it appears this needs an update in HCO first.

      Simone, could you please take a look?

      — Additional comment from Marcelo Tosatti on 2023-06-01 19:49:46 CEST —

      (In reply to Kedar Bidarkar from comment #6)
      > As per Petr in comment #5, it appears this needs an update in HCO first.
      >
      > Simone, could you please take a look?

      Kedar,

      The HCO support is already there.

      Why did you reassign this bug to Simone?

      — Additional comment from Petr Horáček on 2023-06-02 12:15:36 CEST —

      Although it can be set through HCO, that is not a supported API. Whenever a customer uses a JSON patch on HCO, they break their support (unless they have a support exception). What we need on HCO is a dedicated attribute on the spec (not a generic JSON patch) that sets the default runtime class.

      — Additional comment from Marcelo Tosatti on 2023-06-02 18:00:16 CEST —

      (In reply to Petr Horáček from comment #8)
      > Although it can be set through HCO, that is not a supported API.
      > Whenever a customer uses a JSON patch on HCO, they break their support (unless
      > they have a support exception). What we need on HCO is a dedicated attribute
      > on the spec (not a generic JSON patch) that sets the default runtime class.

      Petr,

      Can we please open separate BZs to track the different issues?

      (dedicated HCO attribute and per-vm runtimeclass).

      — Additional comment from Simone Tiraboschi on 2023-06-05 11:43:27 CEST —

      (In reply to Kedar Bidarkar from comment #6)
      > As per Petr in comment #5, it appears this needs an update in HCO first.
      >
      > Simone, could you please take a look?

      Sure, a few questions (for the sake of inline documenting the new configuration option):
      1. can the value of defaultRuntimeClass be amended as a day-two operation when we have existing VMIs?
      2. if so, what's the impact on existing VMIs?
      3. is it going to affect live migration with the target pod getting configured with the new value for defaultRuntimeClass?
      4. is 4.14 enough or should we backport this down to 4.13?

      — Additional comment from Kedar Bidarkar on 2023-06-06 12:05:25 CEST —

      Petr, I feel you could help answer Simone's questions from comment #10.

      — Additional comment from Marcelo Tosatti on 2023-06-09 17:03:54 CEST —

      (In reply to Marcelo Tosatti from comment #9)
      > (In reply to Petr Horáček from comment #8)
      > > Although it can be set through HCO, that is not a supported API.
      > > Whenever a customer uses a JSON patch on HCO, they break their support (unless
      > > they have a support exception). What we need on HCO is a dedicated attribute
      > > on the spec (not a generic JSON patch) that sets the default runtime class.
      >
      > Petr,
      >
      > Can we please open separate BZs to track the different issues?
      >
      > (dedicated HCO attribute and per-vm runtimeclass).

      Petr,

      It is not clear to me that per-vm runtime class feature is not necessary (or even
      not high priority):

      https://devopslearners.com/different-container-runtimes-and-configurations-in-the-same-kubernetes-cluster-fed228e1853e

      Motivation

      You can set a different RuntimeClass between different Pods to provide a balance of performance versus security.
      For example, if part of your workload deserves a high level of information security assurance, you might choose
      to schedule those Pods so that they run in a container runtime that uses hardware virtualization. You’d then
      benefit from the extra isolation of the alternative runtime, at the expense of some additional overhead.

      You can also use RuntimeClass to run different Pods with the same container runtime but with different settings.

      What I think is appropriate is to:

      1) Rename this BZ to "allow setting global runtime class in HCO" (temporary workaround).

      2) Create a new bz for per-vm runtime class feature request (proper fix).

      — Additional comment from Petr Horáček on 2023-06-12 11:06:38 CEST —

      (In reply to Marcelo Tosatti from comment #12)
      >
      > Petr,
      >
      > It is not clear to me that per-vm runtime class feature is not necessary (or
      > even
      > not high priority):

      I don't doubt that per-VM runtime class would be useful.

      The way I see it, there are three classes of users:
      A) Having global runtime class setting is good enough to run their DPDK workloads without compromises.
      B) Having per-VM would help them improve workflow and resource utilization, but it is not a must.
      C) Not being able to select runtime class per-VM is a show stopper, they cannot run their combined workloads without it.

      Do you concur?

      While I see the value of per-VM runtime class, I want to productize DPDK for group A first, release it, and only then start extending the support to B and C. That's why https://issues.redhat.com/browse/CNV-24770 does not list per-VM runtime class as a must-have requirement.

      >
      > https://devopslearners.com/different-container-runtimes-and-configurations-
      > in-the-same-kubernetes-cluster-fed228e1853e
      >
      >
      > Motivation
      >
      > You can set a different RuntimeClass between different Pods to provide a
      > balance of performance versus security.
      > For example, if part of your workload deserves a high level of information
      > security assurance, you might choose
      > to schedule those Pods so that they run in a container runtime that uses
      > hardware virtualization. You’d then
      > benefit from the extra isolation of the alternative runtime, at the expense
      > of some additional overhead.
      >
      > You can also use RuntimeClass to run different Pods with the same container
      > runtime but with different settings.
      >
      > —
      >
      > What I think is appropriate is to:
      >
      > 1) Rename this BZ to "allow setting global runtime class in HCO" (temporary
      > workaround).

      Works for me.

      >
      > 2) Create a new bz for per-vm runtime class feature request (proper fix).

      Dedicated per-VM runtime class is tracked as a feature via https://issues.redhat.com/browse/CNV-24676. It is far too big of a feature to be tracked through a BZ.

      — Additional comment from Marcelo Tosatti on 2023-06-13 05:49:50 CEST —

      (In reply to Petr Horáček from comment #13)
      > (In reply to Marcelo Tosatti from comment #12)
      > >
      > > Petr,
      > >
      > > It is not clear to me that per-vm runtime class feature is not necessary (or
      > > even
      > > not high priority):
      >
      > I don't doubt that per-VM runtime class would be useful.
      >
      > The way I see it, there are three classes of users:
      > A) Having global runtime class setting is good enough to run their DPDK
      > workloads without compromises.
      > B) Having per-VM would help them improve workflow and resource utilization,
      > but it is not a must.
      > C) Not being able to select runtime class per-VM is a show stopper, they
      > cannot run their combined workloads without it.
      >
      > Do you concur?

      Yes.

      >
      > While I see the value of per-VM runtime class, I want to productize DPDK for
      > group A first, release it, and only then start extending the support to B
      > and C. That's why https://issues.redhat.com/browse/CNV-24770 does not list
      > per-VM runtime class as a must-have requirement.
      >
      > >
      > > https://devopslearners.com/different-container-runtimes-and-configurations-
      > > in-the-same-kubernetes-cluster-fed228e1853e
      > >
      > >
      > > Motivation
      > >
      > > You can set a different RuntimeClass between different Pods to provide a
      > > balance of performance versus security.
      > > For example, if part of your workload deserves a high level of information
      > > security assurance, you might choose
      > > to schedule those Pods so that they run in a container runtime that uses
      > > hardware virtualization. You’d then
      > > benefit from the extra isolation of the alternative runtime, at the expense
      > > of some additional overhead.
      > >
      > > You can also use RuntimeClass to run different Pods with the same container
      > > runtime but with different settings.
      > >
      > > —
      > >
      > > What I think is appropriate is to:
      > >
      > > 1) Rename this BZ to "allow setting global runtime class in HCO" (temporary
      > > workaround).
      >
      >
      > Works for me.
      >
      > >
      > > 2) Create a new bz for per-vm runtime class feature request (proper fix).
      >
      > Dedicated per-VM runtime class is tracked as a feature via
      > https://issues.redhat.com/browse/CNV-24676. It is far too big of a feature
      > to be tracked through a BZ.

      — Additional comment from Petr Horáček on 2023-06-15 14:21:08 CEST —

      — Additional comment from Ivan on 2023-06-20 09:45:57 CEST —

      @stirabos@redhat.com, regarding your 4th question in comment #10:

      4. is 4.14 enough or should we backport this down to 4.13? --> The end partner needs this fix backported to 4.12, as that is the version they will go live with in September.

      Can you please let me know if you need me to file it or you can duplicate this one for 4.12?

      Thanks in advance!

      — Additional comment from Simone Tiraboschi on 2023-06-26 11:58:10 CEST —

      (In reply to Ivan from comment #16)
      > Can you please let me know if you need me to file it or you can duplicate
      > this one for 4.12?

      OK, thanks.
      We will handle the BZ and the backport process on our side.
