OpenShift Specialist Platform Team / SPLAT-2533

[Tech Preview] AWS/Router/NLB: Spike support for hairpinning traffic solution on OpenShift private routers

    • Type: Epic
    • Resolution: Unresolved
    • Priority: Critical
    • Product / Portfolio Work
    • OCPSTRAT-2310 Disabling Client IP Preservation for hairpinning issue in CCM, Ingress
    • 100% To Do, 0% In Progress, 0% Done
    • 2025.08.20: Waiting for the PMs' definition of ownership of the CIO changes and next steps: https://issues.redhat.com/browse/SPLAT-2324?focusedId=27829070&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-27829070

      Epic Goal

      • Research the changes required in the router component to fix the hairpinning connection issue through the Service "Traffic Configuration" (target group attributes) of a Service of type LoadBalancer backed by an NLB, when the Service is created through CCM-AWS (cloud-provider-aws).
      • The "Traffic Configuration" attributes are composed of:
        • Preserve client IP addresses
        • Proxy protocol v2
      • The CCM-AWS changes are merged in OCP for 4.21 through SPLAT-2324.
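
For orientation, the two "Traffic Configuration" settings correspond to AWS target group attributes. The following is a minimal sketch of the Key/Value shape that the ELBv2 ModifyTargetGroupAttributes API accepts for them. The attribute keys `preserve_client_ip.enabled` and `proxy_protocol_v2.enabled` are the AWS attribute names; the helper `traffic_configuration` is hypothetical and is not part of CCM-AWS.

```python
from typing import Dict, List


def traffic_configuration(preserve_client_ip: bool, proxy_protocol_v2: bool) -> List[Dict[str, str]]:
    """Build the target group attribute list for the two Traffic Configuration settings.

    Hypothetical helper for illustration only; AWS expects the values as the
    strings "true" or "false".
    """
    def flag(value: bool) -> str:
        return "true" if value else "false"

    return [
        {"Key": "preserve_client_ip.enabled", "Value": flag(preserve_client_ip)},
        {"Key": "proxy_protocol_v2.enabled", "Value": flag(proxy_protocol_v2)},
    ]


# Combination discussed in this Epic: no client IP preservation, PROXY protocol v2 on.
attrs = traffic_configuration(preserve_client_ip=False, proxy_protocol_v2=True)
```

A list in this shape could be passed as the `Attributes` argument of an ELBv2 ModifyTargetGroupAttributes call; how CCM-AWS applies it internally is not shown here.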

      Why is this important?

      With client IP preservation enabled, a connection fails if a pod opens it to the load balancer service and the load balancer sends it to the same node where the pod resides. This makes it mandatory to dedicate nodes to ingress controllers, which is not preferred for ROSA HCP.
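
The failure condition can be illustrated with a deliberately simplified model. This is a sketch under assumptions, not a packet-level simulation: the function name and node names are hypothetical, and the model only encodes the rule stated above (a preserved client address plus a target on the client's own node means the connection fails).

```python
def hairpin_connection_ok(client_node: str, target_node: str, preserve_client_ip: bool) -> bool:
    """Return True if a connection from a pod on client_node through the NLB is expected to work.

    Simplified model (assumption): the NLB may pick any node as target,
    including the node where the client pod runs.
    """
    if not preserve_client_ip:
        # The NLB rewrites the source to one of its own addresses, so the
        # reply always goes back through the NLB.
        return True
    # With preservation the target sees the client's own address as source.
    # If client and target are the same node, the connection fails.
    return client_node != target_node


# Hypothetical node names, used only to exercise the three cases.
same_node_preserved = hairpin_connection_ok("node-a", "node-a", preserve_client_ip=True)
other_node_preserved = hairpin_connection_ok("node-a", "node-b", preserve_client_ip=True)
same_node_not_preserved = hairpin_connection_ok("node-a", "node-a", preserve_client_ip=False)
```

In this model only the first case fails, which is why disabling client IP preservation removes the need to keep workloads off the ingress nodes.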

      The problem was recently exposed by an e2e test [1] in the upstream project, and an issue has been created [2]. More details are in the research Epic SPLAT-2257.

      As aligned with the Network Edge team [3], the CCM changes are a must to allow the Cluster Ingress Controller to control the attributes when it needs to enable Proxy Protocol on the NLB for default router services on clusters with the Internal publishing strategy (a.k.a. private).

      This is a bug in any OpenShift private deployment that uses an NLB (OCPBUGS-58456).

      [1] https://github.com/kubernetes/cloud-provider-aws/pull/1161#issuecomment-3080713501 

      [2] https://github.com/kubernetes/cloud-provider-aws/issues/1160 

      [3] https://redhat-internal.slack.com/archives/CCH60A77E/p1749137394974759?thread_ts=1745435593.239899&cid=CCH60A77E 

      Scenarios

      As a user of an OpenShift Container Platform cluster installed in AWS, I want to be able to:

      • Annotate a LoadBalancer Service that uses an NLB to disable the Target Group attribute "Client IP preservation" when using target type instance, so that the client IP is NATed to the NLB addresses and the SYN-ACK packet issues from the server (the hairpin connection issue) no longer occur.
      • Annotate a LoadBalancer Service that uses an NLB to enable the Target Group attribute "Proxy Protocol v2", so that Kubernetes features and routing that need to know the source IP addresses are not affected.
      • Provide a mechanism for an ingress controller that uses an NLB to configure Service attributes, so that default routers can be safely deployed in private environments, especially in ROSA HCP.
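
The annotation scenarios above could look roughly like the following sketch, which builds a Service manifest as JSON that could be fed to `kubectl apply -f -`. The target group attributes annotation key is an assumption (this Epic does not name the key that CCM-AWS reads), and the Service name, namespace, selector, and port are hypothetical placeholders.

```python
import json

# Assumption: annotation key for target group attributes; not confirmed by this Epic.
TG_ATTRIBUTES_ANNOTATION = "service.beta.kubernetes.io/aws-load-balancer-target-group-attributes"


def build_service(name: str, namespace: str) -> dict:
    """Build an internal NLB Service with client IP preservation off and PROXY protocol v2 on."""
    attributes = {
        "preserve_client_ip.enabled": "false",
        "proxy_protocol_v2.enabled": "true",
    }
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                "service.beta.kubernetes.io/aws-load-balancer-type": "nlb",
                "service.beta.kubernetes.io/aws-load-balancer-internal": "true",
                # Comma-separated key=value pairs (assumed format).
                TG_ATTRIBUTES_ANNOTATION: ",".join(f"{k}={v}" for k, v in attributes.items()),
            },
        },
        "spec": {
            "type": "LoadBalancer",
            "selector": {"app": name},  # hypothetical selector
            "ports": [{"name": "https", "port": 443, "targetPort": 443, "protocol": "TCP"}],
        },
    }


# Hypothetical names; a real router Service is managed by the ingress operator.
service = build_service("example-router", "example-namespace")
manifest = json.dumps(service, indent=2)
```

Note that enabling PROXY protocol v2 on the target group only helps if the backend (here, the router's haproxy) is also configured to accept PROXY headers, which is the dependency raised later in this Epic.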

      Acceptance Criteria

      • Spike of the work required for CIO to consume the CCM-AWS hairpin traffic fixes

      Dependencies (internal and external)

      1. Which Epic will cover the cluster-ingress-operator changes to enable PROXY protocol on haproxy, as well as patching the router Service objects on private clusters with NLB?

      Previous Work (Optional):

      1. https://issues.redhat.com/browse/SPLAT-2257
      2. https://github.com/kubernetes/cloud-provider-aws/pull/1161
      3. https://github.com/kubernetes/cloud-provider-aws/issues/1160

      Open questions:

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>

      Additional References:

              rhn-support-mrbraga Marco Braga
              rhn-support-rvanderp Richard Vanderpool
              Marc Curry, Richard Vanderpool, Subin M
              None
              Milind Yadav Milind Yadav
              None