OpenShift Bugs / OCPBUGS-18936

ingress-perf reencrypt latency is high in haproxy 2.6.13

    • Important
    • No
    • 5
    • Sprint 242, Sprint 243
    • 2
    • Rejected
    • False
    • cause:
      In OCP 4.14, HAProxy is upgraded from 2.2.24 to 2.6.13.

      consequence:
      P99 latency for re-encrypt traffic increases when the volume of ingress traffic puts the HAProxy component of the OpenShift IngressController under considerable load. This latency increase does not affect overall throughput, which remains consistent.

      fix:
      This issue is not fixed, but the following tuning recommendation mitigates it:
      The default OpenShift IngressController is configured with 4 HAProxy threads. If you experience elevated P99 latencies during high ingress traffic, specifically with re-encrypt traffic, increase the number of HAProxy threads to reduce latency.

      result:
      P99 latency for re-encrypt traffic still increases when the volume of ingress traffic puts the HAProxy component of the OpenShift IngressController under considerable load. Increasing the number of HAProxy threads can reduce the latency.
    • Known Issue
    • In Progress
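The thread-count mitigation from the release note can be sketched as a small helper that builds an `oc patch` command. This is a sketch under assumptions, not a procedure taken from this issue: it assumes the thread count is set through the IngressController field `spec.tuningOptions.threadCount`, that the controller is the `default` one in the `openshift-ingress-operator` namespace, and the value 8 is only illustrative. The helper names `thread_count_patch` and `oc_patch_command` are hypothetical.

```python
import json
import shlex
from typing import List


def thread_count_patch(thread_count: int) -> str:
    """Build a JSON merge patch that sets spec.tuningOptions.threadCount."""
    if thread_count < 1:
        raise ValueError("thread_count must be a positive integer")
    return json.dumps({"spec": {"tuningOptions": {"threadCount": thread_count}}})


def oc_patch_command(thread_count: int, name: str = "default") -> List[str]:
    """Return the argv for patching the named IngressController (assumed name)."""
    return [
        "oc", "-n", "openshift-ingress-operator", "patch",
        f"ingresscontroller/{name}", "--type=merge",
        "-p", thread_count_patch(thread_count),
    ]


if __name__ == "__main__":
    # 8 is an illustrative value only; pick a count suited to the router nodes.
    print(shlex.join(oc_patch_command(8)))
```

The script only prints the command so it can be reviewed before it is run. Changing the thread count causes the router pods to be redeployed, so it is worth re-running the latency test afterwards to confirm the effect.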

      Description of problem:

      The ingress-perf test shows that haproxy 2.6.13 has a higher reencrypt latency than haproxy 2.2.24; the latency is about 100% higher.
      haproxy was bumped from 2.2.24 to 2.6.13 in the OCP 4.14 release.
      Edge, http, and passthrough do not have this issue.
      Reverting OCP 4.14 haproxy from 2.6.13 to 2.2.24 removes the issue.
      Keeping OCP 4.14 haproxy at 2.6.13 but reverting the change identified in https://github.com/haproxy/haproxy/issues/1914 also removes the issue.

      Version-Release number of selected component (if applicable):

      4.14, after the haproxy bump to 2.6.13

      How reproducible:

      Every time

      Steps to Reproduce:

      1. Install a cluster on AWS with OVN: 3 master nodes of type m5.8xlarge, 24 worker nodes of type m5.2xlarge, and 3 infra nodes of type r5.2xlarge.
      2. Run the ingress-perf test https://github.com/cloud-bulldozer/ingress-perf with the configuration https://github.com/cloud-bulldozer/e2e-benchmarking/blob/master/workloads/ingress-perf/config/standard.yml. Set requestTimeout to >=5s. In this test, the ingress pods are moved to the infra nodes.
      3. Compare the RPS and latency results.

      Actual results:

      The ingress-perf test shows that haproxy 2.6.13 has a higher reencrypt latency than haproxy 2.2.24.

      Andrew McDermott provided an image that reverts haproxy 2.6.13 to haproxy 2.2.24. With this image the issue is not seen, so this is a regression in haproxy 2.6.13 relative to 2.2.24.

      Andrew McDermott provided an image, quay.io/amcdermo/openshift-router-qiujieli-ocp414-haproxy26-sans-4c48edba4f45bb78f41af7d79d3c176710fe6a90, which reverts the patch identified in https://github.com/haproxy/haproxy/issues/1914. With this image the issue is not seen.
      
      Test results are as below:
      
      0902: 4.14.0-0.nightly-2023-09-02-132842
      08-28: 4.14.0-0.nightly-2023-08-28-154013
      revert1914: which reverts the patch identified in https://github.com/haproxy/haproxy/issues/1914
      ec4: doesn't have ovn-ic
      ec2: default haproxy 2.6.13
      ec1: default haproxy 2.2.24
      edge
      Build                  RPS (K req/s)  % to 4.13.9  avg_latency(ms)  % to 4.13.9
      0902                   75.33          6.49%        50.46            -0.98%
      0902 (revert1914)      69.44          -1.84%       55.63            9.16%
      0828                   74.8           5.74%        56.95            11.75%
      0528 (revert 2.2.24)   73.72          4.21%        53.22            4.43%
      ec4                    72.21          2.08%        53.16            4.32%
      ec4 (revert1914)       65.49          -7.42%       58.16            14.13%
      ec2 2.6.13             75.06          6.11%        49.02            -3.81%
      ec2 (revert 2.2.24)    72.07          1.88%        53.68            5.34%
      ec1 2.2.24             72.67          2.73%        52.73            3.47%
      4.13.9                 70.74          -            50.96            -

      http
      Build                  RPS (K req/s)  % to 4.13.9  avg_latency(ms)  % to 4.13.9
      0902                   105.78         -9.78%       42.47            32.02%
      0902 (revert1914)      106.17         -9.45%       41.98            30.49%
      0828                   103.83         -11.45%      45.39            41.09%
      0528 (revert 2.2.24)   105.33         -10.17%      36.32            12.90%
      ec4                    115.75         -1.28%       31.36            -2.52%
      ec4 (revert1914)       103.45         -11.77%      35.42            10.10%
      ec2 2.6.13             113.51         -3.19%       32.77            1.87%
      ec2 (revert 2.2.24)    117.88         0.54%        41.73            29.72%
      ec1 2.2.24             122.46         4.44%        31.55            -1.93%
      4.13.9                 117.25         -            32.17            -

      passthrough
      Build                  RPS (K req/s)  % to 4.13.9  avg_latency(ms)  % to 4.13.9
      0902                   191.4          -4.78%       19.35            7.14%
      0902 (revert1914)      183.82         -8.55%       19.61            8.58%
      0828                   184.74         -8.09%       20.3             12.40%
      0528 (revert 2.2.24)   188.75         -6.10%       21.01            16.33%
      ec4                    202.8          0.89%        17.98            -0.44%
      ec4 (revert1914)       198.61         -1.19%       19.21            6.37%
      ec2 2.6.13             197.61         -1.69%       18.9             4.65%
      ec2 (revert 2.2.24)    210.72         4.83%        17.36            -3.88%
      ec1 2.2.24             212.61         5.77%        17.62            -2.44%
      4.13.9                 201.01         -            18.06            -

      reencrypt
      Build                  RPS (K req/s)  % to 4.13.9  avg_latency(ms)  % to 4.13.9
      0902                   73.05          0.74%        104.29           107.79%
      0902 (revert1914)      65.97          -9.02%       58.88            17.31%
      0828                   72.19          -0.44%       97.52            94.30%
      0528 (revert 2.2.24)   70.73          -2.45%       58.24            16.04%
      ec4                    67.72          -6.61%       96.52            92.31%
      ec4 (revert1914)       65.3           -9.94%       66.05            31.60%
      ec2 2.6.13             71.44          -1.48%       108.11           115.40%
      ec2 (revert 2.2.24)    68.69          -5.27%       52.67            4.94%
      ec1 2.2.24             70.06          -3.38%       53.69            6.97%
      4.13.9                 72.51          -            50.19            -
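The "% to 4.13.9" figures can be reproduced with a short calculation. The sketch below assumes the column is the relative change of each build against the 4.13.9 value, which matches the reported numbers; the function name `pct_to_baseline` is hypothetical, and the inputs are reencrypt avg_latency(ms) values copied from the results above.

```python
def pct_to_baseline(value: float, baseline: float) -> float:
    """Return the percent change of value relative to baseline, rounded to 2 places."""
    return round((value - baseline) / baseline * 100, 2)


# reencrypt avg_latency(ms) for 4.13.9, used as the baseline
BASELINE_4_13_9 = 50.19

# A subset of the reencrypt avg_latency(ms) results
reencrypt_latency_ms = {
    "0902": 104.29,
    "0902 (revert1914)": 58.88,
    "ec2 2.6.13": 108.11,
    "ec2 (revert 2.2.24)": 52.67,
}

for build, latency in reencrypt_latency_ms.items():
    print(f"{build}: {pct_to_baseline(latency, BASELINE_4_13_9):+.2f}%")
```

For these inputs, the builds on haproxy 2.6.13 come out at roughly double the baseline latency, while both reverted builds stay within about 20% of it.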

       

      Expected results:

      haproxy 2.6.13 should not have a performance regression compared to haproxy 2.2.24.

      Additional info:

       

       

       

              amcdermo@redhat.com Andrew McDermott
              rhn-support-qili Qiujie Li
              Qiujie Li Qiujie Li