  1. OpenShift Virtualization
  2. CNV-75930

vhost is pinned to same pCPU as qemu-kvm and virt-launcher, causing delays.


    • Icon: Bug Bug
    • Resolution: Unresolved
    • Icon: Major Major
    • CNV v4.23.0
    • CNV v4.20.3
    • CNV Virt-Node
    • None
    • Quality / Stability / Reliability
    • 0.42
    • False
    • False
    • None
    • Moderate
    • None

      Description of problem:

      While troubleshooting a latency issue, I noticed that delivery of network packets to a Guest can be delayed. The Guest is tuned with the highest-performance options, yet the emulator thread (qemu-kvm), virt-launcher and the vhost kernel thread are all pinned to the same pCPU.

      Version-Release number of selected component (if applicable):

      4.20.3

      How reproducible:

      Always

      Steps to Reproduce:

      1. Set up the VM
      
              cpu:
                cores: 1
                dedicatedCpuPlacement: true
                isolateEmulatorThread: true
                numa:
                  guestMappingPassthrough: {}
                realtime: {}
                sockets: 2
                threads: 1
      
      
      2. Start it
      
      3. Check vhost is pinned to the same CPU as virt-launcher and qemu-kvm
      
      107       102374  102364  2 00:23 ?        00:00:00 /usr/bin/virt-launcher --qemu-timeout 346s --name test .....
      107       102477  102364 75 00:23 ?        00:00:00 /usr/libexec/qemu-kvm -name guest=homelab_test, .....
      root      102482       2  0 00:23 ?        00:00:00 [vhost-102477]
      
      $ taskset -cp 102374
      pid 102374's current affinity list: 6
      $ taskset -cp 102477
      pid 102477's current affinity list: 6
      $ taskset -cp 102482
      pid 102482's current affinity list: 6 
      
      Even with realtime enabled, no real-time scheduling policy or priority is set on virt-launcher or vhost:
      
      $ chrt -p 102374
      pid 102374's current scheduling policy: SCHED_OTHER
      pid 102374's current scheduling priority: 0
      $ chrt -p 102482
      pid 102482's current scheduling policy: SCHED_OTHER
      pid 102482's current scheduling priority: 0
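
The affinity check in step 3 can be automated when many PIDs are involved. The following is a minimal sketch, not part of the original report: it parses `taskset -cp` output like that shown above and lists every CPU that more than one PID is allowed to run on. The helper names (`parse_cpu_list`, `shared_cpus`) are hypothetical, and the parser assumes plain lists such as `6` or `0-3,6` (no stride syntax). Note that `taskset -cp <pid>` reports the affinity of that single thread only.

```python
import re
from collections import defaultdict

# Matches lines such as: pid 102374's current affinity list: 6
AFFINITY_RE = re.compile(r"pid (\d+)'s current affinity list: (\S+)")


def parse_cpu_list(text):
    """Expand a CPU list such as '0-3,6' into a set of ints (no stride support)."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def shared_cpus(taskset_output):
    """Return {cpu: [pids]} for every CPU that two or more PIDs may run on."""
    by_cpu = defaultdict(list)
    for match in AFFINITY_RE.finditer(taskset_output):
        pid = int(match.group(1))
        for cpu in parse_cpu_list(match.group(2)):
            by_cpu[cpu].append(pid)
    return {cpu: pids for cpu, pids in by_cpu.items() if len(pids) > 1}


# Sample input copied from the taskset output in this report.
sample = """\
pid 102374's current affinity list: 6
pid 102477's current affinity list: 6
pid 102482's current affinity list: 6
"""

print(shared_cpus(sample))  # {6: [102374, 102477, 102482]}
```

An empty result would mean none of the queried PIDs share a CPU; here all three (virt-launcher, qemu-kvm and vhost) share CPU 6.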

      Actual results:

      virt-launcher, qemu-kvm and vhost can compete for the same CPU.

      Expected results:

      One CPU for each? Or at least preemption/priority?

      Additional info:

      PIDs differ from those above, but note virt-launcher occupying the CPU and delaying vhost-net:
      
         virt-launcher-2221771 [012] d..2. 262238.193703: sched_stat_runtime:   comm=virt-launcher pid=2221771 runtime=10840 [ns]
         virt-launcher-2221771 [012] d..2. 262238.193704: sched_stat_wait:      comm=virt-launcher pid=2221377 delay=37875 [ns]
         virt-launcher-2221771 [012] d..2. 262238.193705: sched_switch:         virt-launcher:2221771 [120] S ==> virt-launcher:2221377 [120]
         virt-launcher-2221377 [012] d..2. 262238.193709: sched_stat_runtime:   comm=virt-launcher pid=2221377 runtime=6451 [ns]
         virt-launcher-2221377 [012] d..2. 262238.193711: sched_stat_wait:      comm=vhost-2221997 pid=2222169 delay=10073994 [ns]
         vhost-2221997-2222169 [012] d..1. 262238.193753: softirq_raise:        vec=3 [action=NET_RX]
      
      That's a 10073994 ns (about 10.07 ms) delay for vhost, spent waiting on the runqueue, which potentially delays the guest's networking operations (it raised NET_RX as soon as it ran).
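
To find such stalls across the full trace rather than by eye, the `sched_stat_wait` events can be extracted and ranked. This is a minimal sketch under stated assumptions: the function names (`wait_events`, `worst_wait`) are hypothetical, the regular expression is written only against the line format shown in the excerpt above, and it assumes `comm` contains no spaces. The sample input is copied from that excerpt; the attached ftrace.txt is not reproduced here.

```python
import re

# Matches the event payload, e.g.:
#   sched_stat_wait:      comm=vhost-2221997 pid=2222169 delay=10073994 [ns]
WAIT_RE = re.compile(r"sched_stat_wait:\s+comm=(\S+) pid=(\d+) delay=(\d+) \[ns\]")


def wait_events(trace_text):
    """Return a list of (comm, pid, delay_ns) for each sched_stat_wait event."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in WAIT_RE.finditer(trace_text)
    ]


def worst_wait(trace_text):
    """Return the event with the longest runqueue wait."""
    return max(wait_events(trace_text), key=lambda event: event[2])


# The two sched_stat_wait lines from the excerpt in this report.
sample = """\
   virt-launcher-2221771 [012] d..2. 262238.193704: sched_stat_wait:      comm=virt-launcher pid=2221377 delay=37875 [ns]
   virt-launcher-2221377 [012] d..2. 262238.193711: sched_stat_wait:      comm=vhost-2221997 pid=2222169 delay=10073994 [ns]
"""

comm, pid, delay_ns = worst_wait(sample)
# Convert nanoseconds to milliseconds: 1 ms = 1,000,000 ns.
print(f"{comm} pid={pid} waited {delay_ns / 1e6:.3f} ms")
# vhost-2221997 pid=2222169 waited 10.074 ms
```

Running the same functions over the whole trace file would show whether the vhost stall is an isolated event or recurs.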

        1. ftrace.txt
          5.50 MB
          Germano Veit Michel

              sgott@redhat.com Stuart Gott
              rhn-support-gveitmic Germano Veit Michel
              Denys Shchedrivyi Denys Shchedrivyi