Type: Bug
Resolution: Won't Do
Version: rhel-9.3.0
Component: subs-client-tools
Description of problem:
Installation of insights-client on RHEL-9.3 has a negative effect on throughput and CPU utilization on Intel Ice Lake systems with a Mellanox ConnectX-6 NIC.
Version-Release number of selected component (if applicable):
insights-client-3.1.7-12.el9.noarch
How reproducible:
100%
Steps to Reproduce:
1. Install RHEL-9.3 and run 16 parallel iperf3 client instances to saturate the full 200 Gbit/s link bandwidth (a loop equivalent of these 16 invocations follows the list):
iperf3 --json --client 172.16.1.26 --time 30 --port 5201 --affinity 0,0 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5202 --affinity 1,1 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5203 --affinity 2,2 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5204 --affinity 3,3 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5205 --affinity 4,4 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5206 --affinity 5,5 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5207 --affinity 6,6 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5208 --affinity 7,7 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5209 --affinity 8,8 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5210 --affinity 9,9 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5211 --affinity 10,10 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5212 --affinity 11,11 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5213 --affinity 12,12 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5214 --affinity 13,13 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5215 --affinity 14,14 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5216 --affinity 15,15 --parallel 8
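The 16 invocations above can equally be generated with a short loop (a sketch, assuming bash; the server address and CPU pinning are taken from the commands above):

# Sketch: 16 parallel clients, one port and one pinned CPU pair each
for i in $(seq 0 15); do
    iperf3 --json --client 172.16.1.26 --time 30 --port $((5201 + i)) \
           --affinity "$i,$i" --parallel 8 &
done
wait   # block until all 16 clients finish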
total_throughput=126744.93
efficiency:
sender 38495.04
receiver 8083.17
streams:
receiver:mlx5_core_1:inet4:0:5201 throughput=1631.35 retransmits=5202.0
receiver:mlx5_core_1:inet4:0:5202 throughput=14599.66 retransmits=23966.0
receiver:mlx5_core_1:inet4:0:5203 throughput=1808.24 retransmits=6069.0
receiver:mlx5_core_1:inet4:0:5204 throughput=1217.48 retransmits=6514.0
receiver:mlx5_core_1:inet4:0:5205 throughput=2048.64 retransmits=9765.0
receiver:mlx5_core_1:inet4:0:5206 throughput=1728.49 retransmits=4694.0
receiver:mlx5_core_1:inet4:0:5207 throughput=2703.95 retransmits=8629.0
receiver:mlx5_core_1:inet4:0:5208 throughput=7688.01 retransmits=12064.0
receiver:mlx5_core_1:inet4:0:5209 throughput=1671.44 retransmits=4469.0
receiver:mlx5_core_1:inet4:0:5210 throughput=15176.81 retransmits=19171.0
receiver:mlx5_core_1:inet4:0:5211 throughput=17138.93 retransmits=12477.0
receiver:mlx5_core_1:inet4:0:5212 throughput=20369.80 retransmits=10101.0
receiver:mlx5_core_1:inet4:0:5213 throughput=17628.61 retransmits=20087.0
receiver:mlx5_core_1:inet4:0:5214 throughput=1695.05 retransmits=4854.0
receiver:mlx5_core_1:inet4:0:5215 throughput=13487.04 retransmits=13873.0
receiver:mlx5_core_1:inet4:0:5216 throughput=6151.43 retransmits=8465.0
sender:
sender cpu 0 total=13.33 usr= 0.07 sys= 3.80 irq= 1.03 soft= 8.44
sender cpu 1 total=30.32 usr= 0.30 sys=23.21 irq= 0.90 soft= 5.91
sender cpu 2 total=12.54 usr= 0.07 sys= 3.75 irq= 0.93 soft= 7.79
sender cpu 3 total= 8.16 usr= 0.07 sys= 2.94 irq= 0.63 soft= 4.53
sender cpu 4 total= 8.66 usr= 0.07 sys= 3.53 irq= 0.57 soft= 4.50
sender cpu 5 total=11.77 usr= 0.03 sys= 3.48 irq= 0.96 soft= 7.30
sender cpu 6 total=11.34 usr= 0.07 sys= 4.53 irq= 0.77 soft= 5.97
sender cpu 7 total=16.27 usr= 0.13 sys=11.85 irq= 0.57 soft= 3.71
sender cpu 8 total= 8.77 usr= 0.03 sys= 3.12 irq= 0.73 soft= 4.88
sender cpu 9 total=33.26 usr= 0.23 sys=23.45 irq= 1.04 soft= 8.53
sender cpu 10 total=39.45 usr= 0.30 sys=26.60 irq= 1.37 soft=11.18
sender cpu 11 total=37.25 usr= 0.33 sys=31.14 irq= 0.80 soft= 4.98
sender cpu 12 total=37.65 usr= 0.33 sys=27.68 irq= 1.10 soft= 8.54
sender cpu 13 total= 9.59 usr= 0.07 sys= 3.35 irq= 0.76 soft= 5.41
sender cpu 14 total=29.88 usr= 0.30 sys=20.92 irq= 0.94 soft= 7.72
sender cpu 15 total=19.85 usr= 0.13 sys= 9.86 irq= 1.47 soft= 8.39
receiver:
receiver cpu 0 total=99.40 usr= 0.03 sys= 5.77 irq= 0.23 soft=93.37
receiver cpu 1 total=75.41 usr= 0.92 sys=55.28 irq= 1.03 soft=18.18
receiver cpu 2 total=99.43 usr= 0.07 sys= 6.74 irq= 0.23 soft=92.40
receiver cpu 3 total=99.43 usr= 0.00 sys= 5.93 irq= 0.37 soft=93.13
receiver cpu 4 total=99.43 usr= 0.03 sys=10.73 irq= 0.50 soft=88.17
receiver cpu 5 total=99.40 usr= 0.03 sys= 5.96 irq= 0.23 soft=93.17
receiver cpu 6 total=99.40 usr= 0.03 sys=13.60 irq= 0.43 soft=85.34
receiver cpu 7 total=99.40 usr= 0.10 sys=32.87 irq= 0.70 soft=65.73
receiver cpu 8 total=99.43 usr= 0.00 sys= 5.97 irq= 0.20 soft=93.26
receiver cpu 9 total=99.40 usr= 0.27 sys=53.80 irq= 0.97 soft=44.37
receiver cpu 10 total=99.43 usr= 0.30 sys=59.75 irq= 0.97 soft=38.41
receiver cpu 11 total=99.40 usr= 0.33 sys=69.07 irq= 0.97 soft=29.03
receiver cpu 12 total=99.40 usr= 0.33 sys=58.47 irq= 0.87 soft=39.73
receiver cpu 13 total=99.43 usr= 0.03 sys= 6.17 irq= 0.23 soft=93.00
receiver cpu 14 total=99.43 usr= 0.17 sys=48.78 irq= 1.07 soft=49.42
receiver cpu 15 total=99.43 usr= 0.07 sys=27.38 irq= 1.00 soft=70.99
*We reach only ~126.7 Gbit/s of throughput over the 200 Gbit/s network link; the bottleneck is the receiver, which has all of its CPUs utilized at ~100%.*
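For reference, the per-CPU utilization tables in this report come from mpstat samples taken alongside the iperf3 runs ("mpstats" in the harness log below); a minimal equivalent invocation, assuming the sysstat package is installed:

# Per-CPU statistics, 1-second samples over the 30 s test window
mpstat -P ALL 1 30 > mpstat.out &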
2. dnf -y remove insights-client && reboot
3. Run the same test scenario again:
iperf3 --json --client 172.16.1.26 --time 30 --port 5201 --affinity 0,0 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5202 --affinity 1,1 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5203 --affinity 2,2 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5204 --affinity 3,3 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5205 --affinity 4,4 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5206 --affinity 5,5 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5207 --affinity 6,6 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5208 --affinity 7,7 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5209 --affinity 8,8 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5210 --affinity 9,9 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5211 --affinity 10,10 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5212 --affinity 11,11 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5213 --affinity 12,12 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5214 --affinity 13,13 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5215 --affinity 14,14 --parallel 8
iperf3 --json --client 172.16.1.26 --time 30 --port 5216 --affinity 15,15 --parallel 8
mpstats and iperfs started
iperfs results stored
mpstats results stored
total_throughput=169832.03
efficiency:
sender 31424.19
receiver 14189.21
streams:
receiver:mlx5_core_1:inet4:0:5201 throughput=9843.62 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5202 throughput=10758.80 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5203 throughput=8418.24 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5204 throughput=9284.23 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5205 throughput=7561.73 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5206 throughput=7176.14 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5207 throughput=10880.63 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5208 throughput=15293.55 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5209 throughput=13089.26 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5210 throughput=12018.62 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5211 throughput=11918.24 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5212 throughput=12073.78 retransmits=0.0
receiver:mlx5_core_1:inet4:0:5213 throughput=11426.15 retransmits=326.0
receiver:mlx5_core_1:inet4:0:5214 throughput=13600.04 retransmits=127.0
receiver:mlx5_core_1:inet4:0:5215 throughput=11189.18 retransmits=73.0
receiver:mlx5_core_1:inet4:0:5216 throughput=5299.83 retransmits=0.0
sender:
sender cpu 0 total=34.26 usr= 0.20 sys=18.04 irq= 1.81 soft=14.21
sender cpu 1 total=34.52 usr= 0.20 sys=19.46 irq= 1.68 soft=13.18
sender cpu 2 total=27.13 usr= 0.17 sys=15.72 irq= 1.14 soft=10.10
sender cpu 3 total=28.55 usr= 0.20 sys=16.64 irq= 1.27 soft=10.43
sender cpu 4 total=26.56 usr= 0.13 sys=13.33 irq= 1.34 soft=11.75
sender cpu 5 total=27.34 usr= 0.23 sys=13.19 irq= 1.51 soft=12.42
sender cpu 6 total=35.61 usr= 0.23 sys=19.95 irq= 1.94 soft=13.49
sender cpu 7 total=44.25 usr= 0.27 sys=27.46 irq= 1.68 soft=14.85
sender cpu 8 total=41.92 usr= 0.23 sys=23.61 irq= 1.74 soft=16.33
sender cpu 9 total=37.92 usr= 0.27 sys=21.68 irq= 1.91 soft=14.07
sender cpu 10 total=37.65 usr= 0.24 sys=21.11 irq= 1.95 soft=14.35
sender cpu 11 total=36.53 usr= 0.20 sys=21.44 irq= 1.65 soft=13.24
sender cpu 12 total=32.74 usr= 0.20 sys=20.27 irq= 1.37 soft=10.89
sender cpu 13 total=40.21 usr= 0.23 sys=24.66 irq= 1.64 soft=13.67
sender cpu 14 total=32.72 usr= 0.23 sys=20.02 irq= 1.27 soft=11.20
sender cpu 15 total=21.21 usr= 0.17 sys= 9.70 irq= 1.54 soft= 9.80
receiver:
receiver cpu 0 total=92.86 usr= 0.44 sys=36.63 irq= 2.59 soft=53.20
receiver cpu 1 total=83.81 usr= 0.59 sys=36.86 irq= 2.36 soft=44.00
receiver cpu 2 total=62.56 usr= 0.91 sys=31.71 irq= 1.96 soft=27.98
receiver cpu 3 total=58.83 usr= 1.10 sys=34.14 irq= 1.83 soft=21.76
receiver cpu 4 total=65.92 usr= 0.63 sys=28.11 irq= 2.18 soft=35.00
receiver cpu 5 total=66.68 usr= 0.60 sys=27.38 irq= 2.18 soft=36.52
receiver cpu 6 total=70.85 usr= 1.01 sys=37.31 irq= 2.12 soft=30.41
receiver cpu 7 total=92.89 usr= 0.82 sys=49.44 irq= 2.32 soft=40.31
receiver cpu 8 total=79.83 usr= 0.95 sys=44.38 irq= 1.94 soft=32.56
receiver cpu 9 total=86.96 usr= 0.70 sys=41.02 irq= 2.41 soft=42.84
receiver cpu 10 total=76.49 usr= 1.03 sys=41.05 irq= 2.06 soft=32.35
receiver cpu 11 total=67.88 usr= 1.16 sys=39.81 irq= 1.63 soft=25.28
receiver cpu 12 total=87.55 usr= 0.69 sys=40.29 irq= 2.07 soft=44.50
receiver cpu 13 total=85.56 usr= 0.97 sys=45.82 irq= 2.26 soft=36.50
receiver cpu 14 total=77.51 usr= 0.74 sys=37.87 irq= 1.88 soft=37.03
receiver cpu 15 total=37.97 usr= 0.68 sys=21.07 irq= 1.89 soft=14.33
*With insights-client removed we reach ~169.8 Gbit/s of throughput (a ~34% improvement over the 126.7 Gbit/s above), and the receiver still has spare CPU time.*
Actual results:
Network throughput and CPU efficiency of RHEL-9.3 are worse with insights-client installed than without it.
Expected results:
Installing insights-client has no negative impact on CPU utilization or network throughput.
Additional info:
We believe this regression is caused by insights-client enabling the cgroup v2 cpu controller. There is a very similar BZ related to systemd:
https://bugzilla.redhat.com/show_bug.cgi?id=2173996
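A quick way to probe the suspected mechanism (our assumption, not confirmed in this report) is to check whether the cpu controller is enabled in the unified cgroup v2 hierarchy, and whether the insights-client units carry CPU directives that would cause systemd to enable it:

# Controllers enabled for children of the root cgroup; "cpu" appearing here
# means the cpu controller (and its scheduler accounting overhead) is active
cat /sys/fs/cgroup/cgroup.subtree_control

# Hypothetical check: look for CPUAccounting/CPUQuota/CPUWeight in the unit
systemctl cat insights-client.service 2>/dev/null | grep -iE 'CPU(Accounting|Quota|Weight)'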