Bug
Resolution: Done
Major
rhel-9.3.0
Important
rhel-sst-virtualization-networking
ssg_virtualization
Description of problem:
I found a performance regression of about 30% in KVM on both the TCP_STREAM and TCP_RR tests between 6.3.0-rc6+ and 6.4.0+. Bisecting identified commit 6e890c5d5021ca7e69bbe203fde42447874d9a82 ("vhost: use vhost_tasks for worker threads") as the root cause.
Detailed results:
- git bisect bad
6e890c5d5021ca7e69bbe203fde42447874d9a82 is the first bad commit
commit 6e890c5d5021ca7e69bbe203fde42447874d9a82
Author: Mike Christie <michael.christie@oracle.com>
Date: Fri Mar 10 16:03:32 2023 -0600
vhost: use vhost_tasks for worker threads
For vhost workers we use the kthread API which inherit's its values from
and checks against the kthreadd thread. This results in the wrong RLIMITs
being checked, so while tools like libvirt try to control the number of
threads based on the nproc rlimit setting we can end up creating more
threads than the user wanted.
This patch has us use the vhost_task helpers which will inherit its
values/checks from the thread that owns the device similar to if we did
a clone in userspace. The vhost threads will now be counted in the nproc
rlimits. And we get features like cgroups and mm sharing automatically,
so we can remove those calls.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
drivers/vhost/vhost.c | 60 +++++++++++----------------------------------------
drivers/vhost/vhost.h | 4 ++--
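As a minimal sketch of how to observe what this commit changes: with vhost_task, the worker is created as a thread of the process that owns the device, so it should show up under /proc/<pid>/task of the QEMU process instead of as a standalone kernel thread. The helper below lists the threads of a given PID whose name starts with "vhost-". The function name, the `proc_root` parameter, and the "vhost-" prefix match are assumptions made for illustration, not part of the original report.

```python
import os


def vhost_threads(pid, proc_root="/proc"):
    """Return (tid, comm) pairs for threads of `pid` named 'vhost-*'.

    `proc_root` is a hypothetical parameter so the function can be
    pointed at a fake directory tree for testing.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    found = []
    for tid in sorted(os.listdir(task_dir), key=int):
        try:
            with open(os.path.join(task_dir, tid, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            # The thread may have exited while we were scanning.
            continue
        if comm.startswith("vhost-"):
            found.append((int(tid), comm))
    return found
```

Calling `vhost_threads(qemu_pid)` on a host running the guest would be expected to return a non-empty list on a kernel with this commit and an empty list on a kernel without it, assuming the worker naming is unchanged.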
Version-Release number of selected component (if applicable):
How reproducible:
always
Steps to Reproduce:
1. Boot a VM with vhost.
2. Run "netserver" on the guest.
3. Run the "netperf" client on an external host, for example:
numactl --cpunodebind=0 --membind=0 `command -v python python3 | head -1` /tmp/netperf_agent.py 1 /tmp/netperf-2.7.1/src/netperf -D 1 -H 192.168.58.112 -l 15.0 -C -c -t TCP_STREAM -- -m 64
Actual results:
Expected results:
Additional info:
# cat /tmp/netperf_agent.py
#!/usr/bin/python
import os
import sys

# Require a session count, the netperf path, and at least one netperf parameter.
if len(sys.argv) < 4:
    print(""" netperf agent usage:
    %s [session_number] [netperf_path] [netperf_parameters_str]
    $session_number: number of client sessions
    $netperf_path: client path
    $netperf_parameters_str: netperf parameters string""" % sys.argv[0])
    sys.exit()

n = int(sys.argv[1])
path = sys.argv[2]
params = " ".join(sys.argv[3:])

# Start the first n - 1 sessions in the background.
for i in range(n - 1):
    os.system("%s %s &" % (path, params))
# Run the last session in the foreground so the agent blocks until it finishes.
os.system("%s %s" % (path, params))
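The agent above only waits for the last session. A possible variant, shown here as a hedged sketch rather than the script used for the measurements, starts all sessions with `subprocess` and waits for each of them. The function name `run_sessions` is hypothetical, and `shlex.split` only approximates the shell parsing that `os.system` performs.

```python
import shlex
import subprocess


def run_sessions(n, path, params):
    """Start n concurrent '<path> <params>' processes and wait for all of them.

    Returns the list of exit codes, one per session.
    """
    cmd = [path] + shlex.split(params)
    # Launch every session before waiting, so they run concurrently.
    procs = [subprocess.Popen(cmd) for _ in range(n)]
    return [p.wait() for p in procs]
```

For example, `run_sessions(1, "/tmp/netperf-2.7.1/src/netperf", "-D 1 -H 192.168.58.112 -l 15.0 -C -c -t TCP_STREAM -- -m 64")` would correspond to the single-session command in the reproduction steps.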