OpenShift Virtualization / CNV-59494

CPU resource issues with supplementalPoolThreadCount


    • Sprint: CNV Storage 271, CNV Storage 272
    • Priority: Critical

      Description of problem:

      When defining the new multi-IOThread feature in 4.19.0-100, I see some behavioral issues with the automatic CPU resource definitions:
      
      Example configuration:
            domain:
              cpu:
                cores: 1
                sockets: 80
                threads: 1
              ioThreadsPolicy: supplementalPool
              ioThreads:
                supplementalPoolThreadCount: 16
              devices:
                blockMultiQueue: true
                disks:
      
      The #1 issue is that the thread count sets a CPU limit by default. This can be very detrimental to performance when supplementalPoolThreadCount is significantly lower than the vCPU count.
      
      For example, with "supplementalPoolThreadCount: 16" and 80 vCPUs at the default vmiCPUAllocationRatio=10:
      
      virt-launcher:
            name: compute
            resources:
              limits:
                cpu: "16"
                devices.kubevirt.io/kvm: "1"
                devices.kubevirt.io/tun: "1"
                devices.kubevirt.io/vhost-net: "1"
              requests:
                cpu: "8"
                devices.kubevirt.io/kvm: "1"
                devices.kubevirt.io/tun: "1"
                devices.kubevirt.io/vhost-net: "1"
                ephemeral-storage: 50M
                memory: "54718889985"
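      
      My read of where these two numbers come from (an assumption based on the values above, not confirmed from the code):
      
            resources:
              limits:
                cpu: "16"    # copied from supplementalPoolThreadCount
              requests:
                cpu: "8"     # 80 vCPUs / vmiCPUAllocationRatio (10) = 8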
      
      This also prevents the VM from starting when supplementalPoolThreadCount is lower than the calculated CPU request for the virt-launcher pod (total vCPUs / vmiCPUAllocationRatio). For example, when defining ThreadCount=4 for 80 vCPUs (which the ratio turns into a CPU request of 8), the VM fails to start with this error:
      
      failed to create virtual machine pod: Pod "virt-launcher-rhel9-tomato-canidae-79-ztl4l" is invalid: spec.containers[0].resources.requests: Invalid value: "8": must be less than or equal to cpu limit of 4
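      
      In other words, with ThreadCount=4 the compute container resources come out as sketched below, and Kubernetes rejects any container whose CPU request exceeds its CPU limit:
      
            resources:
              limits:
                cpu: "4"     # supplementalPoolThreadCount
              requests:
                cpu: "8"     # 80 vCPUs / vmiCPUAllocationRatio (10); 8 > 4, so pod validation fails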
      
      
      The #2 issue, whose ideal behavior we may want to debate, is that only the CPU limit seems to be adjusted to the thread count, not the CPU requests. I think the original implementation intent was to set a full CPU request for each IOThread. That behavior trades off density, but the idea was to prevent host thread saturation.
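      
      Under that reading, the 16-thread / 80-vCPU example above would produce something like the sketch below (my interpretation of the intended behavior, not what the current code does):
      
            resources:
              requests:
                cpu: "24"    # 8 (vCPU allocation) + 16 (one full CPU per supplemental IOThread)
              # no cpu limit enforced for the IOThreads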

      Version-Release number of selected component (if applicable):

      4.19

      How reproducible:

      Always

      Steps to Reproduce:

      Create a VM from the default rhel9 template, increase the total vCPU count, and manually add the IOThread tunables to the YAML (see the sketch below).
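      
      For reference, these are the tunables I add under spec.template.spec.domain of the template-generated VirtualMachine (a minimal sketch; all other fields left at template defaults):
      
            spec:
              template:
                spec:
                  domain:
                    cpu:
                      sockets: 80    # increased from the template default
                      cores: 1
                      threads: 1
                    ioThreadsPolicy: supplementalPool
                    ioThreads:
                      supplementalPoolThreadCount: 16
                    devices:
                      blockMultiQueue: true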
      

      Actual results:

      CPU limits are set on the virt-launcher compute container based on supplementalPoolThreadCount.

      Expected results:

      No CPU limits should be enforced for IOThreads.

              Alice Frosi (afrosirh)
              Jenifer Abrams (jhopper@redhat.com)
              Jenia Peimer
