-
Bug
-
Resolution: Not a Bug
-
Major
-
None
-
4.14
-
Important
-
No
-
8
-
Sprint 241, Sprint 242
-
2
-
Rejected
-
False
-
Description of problem:
The ingress-perf test shows that 4.14 has lower throughput (RPS) than 4.13.z on http and passthrough routes, with an average deviation greater than 15%. Edge and reencrypt also show some degradation, but not as much as http and passthrough.
Version-Release number of selected component (if applicable):
4.14.0-0.nightly-2023-08-11-055332 (HAProxy version 2.6.13) vs. 4.13.9 (HA-Proxy version 2.2.24)
How reproducible:
Every time. I repeated the comparison test 6 times on AWS OVN. I also see this on other cloud providers, such as vsphere-8 on IBM Cloud, so it is not AWS specific.
Steps to Reproduce:
1. Install a cluster: AWS OVN, 3 master nodes of type m5.8xlarge, 24 worker nodes of type m5.2xlarge, and 3 infra nodes of type r5.2xlarge.
2. Run the ingress-perf test (https://github.com/cloud-bulldozer/ingress-perf) with the configuration https://github.com/cloud-bulldozer/e2e-benchmarking/blob/master/workloads/ingress-perf/config/standard.yml. In this test, the ingress pods are moved to the infra nodes.
3. Compare the RPS and latency results.
Actual results:
The ingress-perf test shows that 4.14 has lower throughput (RPS) than 4.13.z on http and passthrough routes, with an average deviation greater than 15%.
Expected results:
4.14 should not show a performance degradation compared to 4.13.9.
Additional info:
I checked the monitoring metrics on the infra nodes: the CPU and memory usage of the router pods is similar on both versions, but the NetworkUtilization metrics are higher on 4.13 than on 4.14.
The comparison results:
http | 4.14 | 4.13.9 | Deviation |
TEST1 RPS (K req/s) | 98.48 | 118.65 | -17.00% |
TEST2 RPS (K req/s) | 98.15 | 122.40 | -19.81% |
TEST3 RPS (K req/s) | 99.87 | 112.78 | -11.45% |
AVG RPS (K req/s) | 98.83 | 117.94 | -16.20% |
TEST1 avg_latency (ms) | 38.07 | 33.36 | 14.12% |
TEST2 avg_latency (ms) | 42.80 | 29.57 | 44.74% |
TEST3 avg_latency (ms) | 40.05 | 32.32 | 23.92% |
AVG avg_latency (ms) | 40.31 | 31.75 | 26.95% |
passthrough | 4.14 | 4.13.9 | Deviation |
TEST1 RPS (K req/s) | 170.45 | 216.54 | -21.28% |
TEST2 RPS (K req/s) | 170.65 | 219.84 | -22.38% |
TEST3 RPS (K req/s) | 172.94 | 205.38 | -15.80% |
AVG RPS (K req/s) | 171.35 | 213.92 | -19.90% |
TEST1 avg_latency (ms) | 21.50 | 18.22 | 18.00% |
TEST2 avg_latency (ms) | 24.56 | 16.34 | 50.31% |
TEST3 avg_latency (ms) | 20.85 | 17.79 | 17.20% |
AVG avg_latency (ms) | 22.30 | 17.45 | 27.81% |
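For anyone re-checking the numbers, the Deviation column appears to be the relative change of 4.14 against the 4.13.9 baseline, (new - baseline) / baseline. The sketch below is not part of ingress-perf; it is a minimal Python illustration that recomputes the average RPS rows from the three runs listed above. The function names and the 15% threshold constant are my own assumptions, taken from the ">15%" wording in this report rather than from any tool setting.

```python
def deviation_pct(new: float, baseline: float) -> float:
    """Relative change of `new` versus `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# RPS in K req/s, copied from the comparison results (TEST1..TEST3).
RPS = {
    "http": {
        "4.14": [98.48, 98.15, 99.87],
        "4.13.9": [118.65, 122.40, 112.78],
    },
    "passthrough": {
        "4.14": [170.45, 170.65, 172.94],
        "4.13.9": [216.54, 219.84, 205.38],
    },
}

# Assumption: treat an average RPS drop of more than 15% as a regression,
# mirroring the ">15%" observation in this report.
THRESHOLD_PCT = 15.0


def summarize(samples):
    """Return (avg 4.14, avg 4.13.9, deviation %) rounded to 2 decimals."""
    new_avg = mean(samples["4.14"])
    base_avg = mean(samples["4.13.9"])
    dev = deviation_pct(new_avg, base_avg)
    return round(new_avg, 2), round(base_avg, 2), round(dev, 2)


if __name__ == "__main__":
    for termination, samples in RPS.items():
        new_avg, base_avg, dev = summarize(samples)
        verdict = "regression" if dev < -THRESHOLD_PCT else "within threshold"
        print(f"{termination}: {new_avg} vs {base_avg} K req/s, {dev:+.2f}% ({verdict})")
```

The averages are computed from the unrounded per-run values and rounded only at the end, which matches the AVG rows in the tables.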
More detailed data is in the ingress-perf test result spreadsheet (openshift perfscale qe, Google Sheets).
I uploaded the Grafana metrics dashboard snapshot to https://drive.google.com/drive/folders/1_JygsI0-CV7IYgSNz3Yb0vyCub-tz3Bq?usp=drive_link