OpenShift Bugs / OCPBUGS-18011

ingress-perf test shows 4.14 has lower tps than 4.13.z on http and passthrough

    • Bug
    • Resolution: Not a Bug
    • Major
    • None
    • 4.14
    • Networking / router
    • Important
    • No
    • 8
    • Sprint 241, Sprint 242
    • 2
    • Rejected
    • False

      Description of problem:

      The ingress-perf test shows that 4.14 has lower throughput (RPS) than 4.13.z on http and passthrough routes. The average deviation is greater than 15%.
      Edge and reencrypt routes also degrade, but not as much as http and passthrough.

      Version-Release number of selected component (if applicable):

      4.14.0-0.nightly-2023-08-11-055332 (HAProxy version 2.6.13)
      vs.
      4.13.9 (HA-Proxy version 2.2.24)
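
The two releases report their HAProxy version with different banner spellings ("HA-Proxy" and "HAProxy"). When collecting versions from several clusters, a small parser that accepts both spellings makes them directly comparable. This is a minimal sketch: the function name parse_haproxy_version is hypothetical, and the input strings are the two banners quoted in this report.

```python
import re

# Accepts both banner spellings quoted in this report:
# "HA-Proxy version 2.2.24" and "HAProxy version 2.6.13".
_BANNER = re.compile(r"HA-?Proxy version (\d+)\.(\d+)\.(\d+)")


def parse_haproxy_version(banner: str) -> tuple[int, int, int]:
    """Return (major, minor, patch) parsed from an HAProxy version banner."""
    match = _BANNER.search(banner)
    if match is None:
        raise ValueError(f"no HAProxy version found in: {banner!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


old = parse_haproxy_version("HA-Proxy version 2.2.24")  # 4.13.9
new = parse_haproxy_version("HAProxy version 2.6.13")   # 4.14 nightly
print(old, new, new > old)
```

Returning a tuple of integers lets the versions be compared with ordinary operators instead of string comparison.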

      How reproducible:

      Every time. The comparison test was repeated 6 times on AWS OVN.
      I also see this on other cloud providers, such as vsphere-8 on IBM Cloud, so it is not AWS specific.

      Steps to Reproduce:

      1. Install a cluster: AWS OVN, 3 master nodes of type m5.8xlarge, 24 worker nodes of type m5.2xlarge, and 3 infra nodes of type r5.2xlarge.
      2. Run the ingress-perf test (https://github.com/cloud-bulldozer/ingress-perf) with the configuration https://github.com/cloud-bulldozer/e2e-benchmarking/blob/master/workloads/ingress-perf/config/standard.yml. In this test, the ingress pods are moved to the infra nodes.
      3. Compare the RPS and latency results.
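
Step 2 notes that the ingress pods run on the infra nodes, but the report does not say how they were moved there. One common way is to set nodePlacement on the IngressController. The sketch below only builds and prints such an oc patch command. It assumes the default IngressController and infra nodes labelled node-role.kubernetes.io/infra; tainted infra nodes would also need matching tolerations, which are not shown.

```python
import json
import shlex

# Merge patch that restricts router pods to nodes carrying the infra role label.
# Assumption: the infra nodes are labelled node-role.kubernetes.io/infra="".
patch = {
    "spec": {
        "nodePlacement": {
            "nodeSelector": {
                "matchLabels": {"node-role.kubernetes.io/infra": ""}
            }
        }
    }
}

# Assumption: the IngressController under test is the default one.
cmd = [
    "oc", "patch", "ingresscontroller/default",
    "-n", "openshift-ingress-operator",
    "--type=merge",
    "-p", json.dumps(patch),
]

# Print a shell-safe command line instead of running it against a cluster.
print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch runnable without cluster access.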
      

      Actual results:

      The ingress-perf test shows that 4.14 has lower throughput (RPS) than 4.13.z on http and passthrough routes.
      The average deviation is greater than 15%.

      Expected results:

      4.14 should not show a performance regression compared to 4.13.9.

      Additional info:

      The monitoring metrics on the infra nodes show that the CPU and memory usage of the router pods is similar on both versions.
      However, the NetworkUtilization metrics are higher on 4.13 than on 4.14.

      The comparison results:

      http
      Metric            Run     4.14     4.13.9   Deviation
      RPS (K req/s)     TEST1   98.48    118.65   -17.00%
      RPS (K req/s)     TEST2   98.15    122.40   -19.81%
      RPS (K req/s)     TEST3   99.87    112.78   -11.45%
      RPS (K req/s)     AVG     98.83    117.94   -16.20%
      avg_latency (ms)  TEST1   38.07    33.36    14.12%
      avg_latency (ms)  TEST2   42.8     29.57    44.74%
      avg_latency (ms)  TEST3   40.05    32.32    23.92%
      avg_latency (ms)  AVG     40.31    31.75    26.95%
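
The report does not state how the Deviation column is computed. The published values are consistent with (4.14 - 4.13.9) / 4.13.9, and the AVG row is consistent with applying that formula to the averaged values rather than averaging the per-run deviations. The sketch below recomputes the http RPS rows under that assumption; the function name deviation_pct is hypothetical.

```python
def deviation_pct(candidate: float, baseline: float) -> float:
    """Relative difference of candidate vs. baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# http RPS (K req/s) per run, copied from the table above: (4.14, 4.13.9)
http_rps = [(98.48, 118.65), (98.15, 122.40), (99.87, 112.78)]

for run, (new, old) in enumerate(http_rps, start=1):
    print(f"TEST{run}: {deviation_pct(new, old):.2f}%")

# Average each side first, then compute the deviation of the averages.
avg_new = sum(new for new, _ in http_rps) / len(http_rps)
avg_old = sum(old for _, old in http_rps) / len(http_rps)
print(f"AVG: {avg_new:.2f} vs {avg_old:.2f} -> {deviation_pct(avg_new, avg_old):.2f}%")
```

The same function applies to the latency rows, where a positive result means 4.14 is slower.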

       

      passthrough
      Metric            Run     4.14     4.13.9   Deviation
      RPS (K req/s)     TEST1   170.45   216.54   -21.28%
      RPS (K req/s)     TEST2   170.65   219.84   -22.38%
      RPS (K req/s)     TEST3   172.94   205.38   -15.80%
      RPS (K req/s)     AVG     171.35   213.92   -19.90%
      avg_latency (ms)  TEST1   21.5     18.22    18.00%
      avg_latency (ms)  TEST2   24.56    16.34    50.31%
      avg_latency (ms)  TEST3   20.85    17.79    17.20%
      avg_latency (ms)  AVG     22.30    17.45    27.81%
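
The ">15%" figure quoted in this report can be turned into an automated check over the AVG RPS rows of the http and passthrough results above. This is a minimal sketch: the names THRESHOLD_PCT, rps_drop_pct, and regressions are hypothetical, and the threshold is taken from the report's wording, not from any ingress-perf setting.

```python
THRESHOLD_PCT = 15.0  # the ">15%" average deviation quoted in this report

# AVG RPS (K req/s) copied from the results above: (4.14, 4.13.9)
avg_rps = {
    "http": (98.83, 117.94),
    "passthrough": (171.35, 213.92),
}


def rps_drop_pct(candidate: float, baseline: float) -> float:
    """Throughput lost by candidate relative to baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0


def regressions(results: dict, threshold_pct: float = THRESHOLD_PCT) -> dict:
    """Return the termination types whose RPS drop exceeds the threshold."""
    flagged = {}
    for name, (candidate, baseline) in results.items():
        drop = rps_drop_pct(candidate, baseline)
        if drop > threshold_pct:
            flagged[name] = round(drop, 2)
    return flagged


print(regressions(avg_rps))
```

With the averages above, both termination types exceed the threshold, which matches the summary in the description.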

      More detailed data is available in the Google Sheets document "ingress-perf test result openshift perfscale qe".

      I uploaded the Grafana dashboard snapshots of the metrics to https://drive.google.com/drive/folders/1_JygsI0-CV7IYgSNz3Yb0vyCub-tz3Bq?usp=drive_link

              Andrew McDermott (amcdermo@redhat.com)
              Qiujie Li (rhn-support-qili)
              Qiujie Li