RHEL-7167

[upstream][kvm] Patch "vhost: use vhost_tasks for worker threads" introduces 30% performance degradation

    • Yes
    • Important
    • rhel-sst-virtualization-networking
    • ssg_virtualization

      Description of problem:

      I found a performance regression of about 30% in KVM on both the TCP_STREAM and TCP_RR tests between 6.3.0-rc6+ and 6.4.0+. Bisecting identified commit 6e890c5d5021ca7e69bbe203fde42447874d9a82 ("vhost: use vhost_tasks for worker threads") as the root cause.

      Detailed results:

      TCP_STREAM: http://kvm-perf.hosts.qa.psi.pek2.redhat.com//results/regression/2023-7-3-network-upstream/bad-2/xl710.bridge_test.1q.*netperf.with_jumbo.host_guest.html

      TCP_RR: http://kvm-perf.hosts.qa.psi.pek2.redhat.com//results/regression/2023-7-3-network-upstream/bad-2/xl710.bridge_test.1q.*netperf.default.host_guest.html
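For reference, the regression figure is the relative drop from the baseline (6.3.0-rc6+) result to the result on the affected kernel (6.4.0+). A minimal sketch of that arithmetic follows; the function name and the numbers in the usage note are hypothetical and are not taken from the linked result pages.

```python
def regression_percent(baseline, current):
    """Relative drop from baseline to current, in percent.

    Positive values mean `current` is worse (lower) than `baseline`.
    Works for any higher-is-better metric, e.g. TCP_STREAM throughput
    or TCP_RR transactions per second.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - current) / baseline
```

For example, a hypothetical baseline of 1000 and a hypothetical current value of 700 give `regression_percent(1000, 700)`, a drop of 30 percent.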

      # git bisect bad
        6e890c5d5021ca7e69bbe203fde42447874d9a82 is the first bad commit
        commit 6e890c5d5021ca7e69bbe203fde42447874d9a82
        Author: Mike Christie <michael.christie@oracle.com>
        Date: Fri Mar 10 16:03:32 2023 -0600

      vhost: use vhost_tasks for worker threads

      For vhost workers we use the kthread API which inherit's its values from
      and checks against the kthreadd thread. This results in the wrong RLIMITs
      being checked, so while tools like libvirt try to control the number of
      threads based on the nproc rlimit setting we can end up creating more
      threads than the user wanted.

      This patch has us use the vhost_task helpers which will inherit its
      values/checks from the thread that owns the device similar to if we did
      a clone in userspace. The vhost threads will now be counted in the nproc
      rlimits. And we get features like cgroups and mm sharing automatically,
      so we can remove those calls.

      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      Signed-off-by: Christian Brauner <brauner@kernel.org>

      drivers/vhost/vhost.c | 60 +++++++++++----------------------------------------
      drivers/vhost/vhost.h | 4 ++--
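Since the commit message says the workers are now created from the thread that owns the device, much like a userspace clone, they should show up as threads of the owning process (typically QEMU) instead of as standalone kernel threads. The sketch below lists them by scanning `/proc/<pid>/task`. The `vhost-` name prefix used as a filter and the helper name `vhost_worker_threads` are assumptions for illustration, not something stated in this report.

```python
import os


def vhost_worker_threads(pid, proc_root="/proc"):
    """Return (tid, comm) pairs for threads of `pid` named 'vhost-*'.

    Assumption: vhost workers carry a comm starting with 'vhost-'.
    `proc_root` is a parameter only so the function can be pointed
    at a fake directory tree for testing.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    workers = []
    for tid in sorted(os.listdir(task_dir), key=int):
        try:
            with open(os.path.join(task_dir, tid, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # thread exited while we were scanning
        if comm.startswith("vhost-"):
            workers.append((int(tid), comm))
    return workers
```

Calling `vhost_worker_threads(qemu_pid)` on the host before and after the commit would be one way to confirm that the worker moved into the QEMU process, which is useful context when comparing CPU placement or cgroup accounting between the good and bad kernels.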

      Version-Release number of selected component (if applicable):

      How reproducible:
      always

      Steps to Reproduce:
      1. Boot a VM with vhost.
      2. Run "netserver" on the guest.
      3. Run the "netperf" client on an external host, for example:
         numactl --cpunodebind=0 --membind=0 `command -v python python3 | head -1 ` /tmp/netperf_agent.py 1 /tmp/netperf-2.7.1/src/netperf -D 1 -H 192.168.58.112 -l 15.0 -C -c -t TCP_STREAM -- -m 64

      Actual results:

      Expected results:

      Additional info:
      # cat /tmp/netperf_agent.py

      #!/usr/bin/python

      import os
      import sys

      if len(sys.argv) < 4:
          print(""" netperf agent usage:
      %s [session_number] [netperf_path] [netperf_parameters_str]

      $session_number: number of client sessions
      $netperf_path: client path
      $netperf_parameters_str: netperf parameters string""" % sys.argv[0])
          sys.exit()

      n = int(sys.argv[1])
      path = sys.argv[2]
      params = " ".join(sys.argv[3:])

      # Start the first n - 1 sessions in the background.
      for i in range(n - 1):
          os.system("%s %s &" % (path, params))
      # Run the last session in the foreground so the agent blocks until it finishes.
      os.system("%s %s" % (path, params))
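The agent above waits only for the last session; the backgrounded ones may still be running when it returns. A minimal sketch of a variant that waits for every session is shown below, assuming the same three inputs (session count, netperf path, netperf parameters). The function name `run_sessions` is hypothetical, and unlike the original it passes an argument list directly instead of going through a shell.

```python
import subprocess


def run_sessions(n, path, params):
    """Start n concurrent copies of `path` with `params`, wait for all.

    `params` is a list of arguments (e.g. sys.argv[3:]), passed
    without a shell. Returns the exit code of each session.
    """
    cmd = [path] + list(params)
    # Launch all sessions first so they run concurrently.
    procs = [subprocess.Popen(cmd) for _ in range(n)]
    # Then block until every one of them has finished.
    return [p.wait() for p in procs]
```

It could be called as `run_sessions(int(sys.argv[1]), sys.argv[2], sys.argv[3:])` from a wrapper script, which keeps the command line in the reproduction steps unchanged.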

              mtsirkin Michael S Tsirkin
              wquan@redhat.com Wenli Quan