Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-32435

[release-4.15] Ingress traffic degradation after upgrade to 4.14

XMLWordPrintable

    • Critical
    • Yes
    • 1
    • Sprint 252
    • 1
    • Rejected
    • False
    • Hide

      None

      Show
      None
    • Hide
      * Previously, a cluster upgraded to {product-title} 4.14 or later experienced router pods unexpectedly closing `keep-alive` connections that caused traffic degradation issues for Apache HTTP clients. This issue was caused by router pods using a version of an HAProxy router that closed idle connections after the HAProxy router was restarted. With this release, the pods use a version of an HAProxy router that includes an `idle-close-on-response` option. The HAProxy router now waits for the last request and response transaction before the idle connection is closed. (link:https://issues.redhat.com/browse/OCPBUGS-32435[*OCPBUGS-32435*])
      Show
      * Previously, a cluster upgraded to {product-title} 4.14 or later experienced router pods unexpectedly closing `keep-alive` connections that caused traffic degradation issues for Apache HTTP clients. This issue was caused by router pods using a version of an HAProxy router that closed idle connections after the HAProxy router was restarted. With this release, the pods use a version of an HAProxy router that includes an `idle-close-on-response` option. The HAProxy router now waits for the last request and response transaction before the idle connection is closed. (link: https://issues.redhat.com/browse/OCPBUGS-32435 [* OCPBUGS-32435 *])
    • Bug Fix
    • Done

      This is a clone of issue OCPBUGS-32044. The following is the description of the original issue:

      Description of problem:

      We have an escalation for a customer case where after upgrading to OCP 4.14 they started to see application traffic degradation that seems to be related to the new version of the HAProxy that changed from 2.2.24 to 2.6.13. 
      
      Was already tested by the customer that if the router pods use the old haproxy-router image from OCP 4.12 the issue disappears.
      
      What was observed is that router pods unexpectedly close HTTP keep-alive connection sending a FIN packet while the client is still sending HTTP requests.
      
      

      Version-Release number of selected component (if applicable):

      4.14.16 (HAProxy 2.6)

      How reproducible:

      Only on customer clusters.

      Steps to Reproduce:

          1.
          2.
          3.
          

      Actual results:

      Router pods are unexpectedly terminating Keep-alive connections.

      Expected results:

      Router pods should not terminate a keep-live connection when requests are still coming.

      Additional info:

      - Was already tried to change the hard-stop to 20m
      - Was already tried to change the reload interval to the maximum (2m)
      - Was already tried to set `no option idle-close-on-response` in the defualt section of the HAProxy configuration
      - Was already verified that content-length headers have a value grater than 0 in the HTTP requests.
      

            alebedev@redhat.com Andrey Lebedev
            openshift-crt-jira-prow OpenShift Prow Bot
            Melvin Joseph Melvin Joseph
            ANDREW MCDERMOTT
            Votes:
            0 Vote for this issue
            Watchers:
            10 Start watching this issue

              Created:
              Updated:
              Resolved: