OpenShift Container Platform (OCP) Strategy / OCPSTRAT-2310

Disabling Client IP Preservation for hairpinning issue in CCM, Ingress


      Feature Overview (aka. Goal Summary)  

      This feature aims to resolve the "hairpinning" issue in OpenShift Container Platform (OCP) clusters deployed on AWS, particularly in Hosted Control Plane (HCP) environments, by allowing Client IP Preservation to be disabled for Network Load Balancer (NLB) ingress controller services. This prevents the intermittent connection timeouts that occur when application pods and ingress controller pods reside on the same node and traffic loops back through the same NLB endpoint. The issue is most prominent where workloads and ingress share nodes, which is common in multi-node HCP clusters that lack dedicated infra nodes.

      Engineering targets OCP 4.21 for the fix in cloud-provider-aws.

      The chosen implementation path is to enhance the Cloud Controller Manager (CCM) rather than migrate to ALBO, which avoids the migration challenges. Enhancing CCM is considered less disruptive in the short term.

      Goals (aka. expected user outcomes)

      The primary goals are:

      • Users can annotate a LoadBalancer Service that uses an NLB to disable Client IP Preservation.

      • The Cluster Ingress Operator (CIO) is updated to decide automatically whether to enable Proxy Protocol for NLBs and to set the relevant configuration (e.g., annotations on the Service of type LoadBalancer and environment variables for the router deployment).
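
      As a rough illustration of the two goals above, the following sketch builds the Service annotations and router environment that such a configuration might produce. The annotation key service.beta.kubernetes.io/aws-load-balancer-target-group-attributes is the one used by the AWS Load Balancer Controller; whether cloud-provider-aws adopts the same key is an assumption here, as are the helper names and the use of ROUTER_USE_PROXY_PROTOCOL for the router. The attribute names preserve_client_ip.enabled and proxy_protocol_v2.enabled are AWS target group attributes.

```python
# Hypothetical helpers; not the CIO or cloud-provider-aws implementation.

# Assumed annotation key (borrowed from the AWS Load Balancer Controller).
TARGET_GROUP_ATTRIBUTES = (
    "service.beta.kubernetes.io/aws-load-balancer-target-group-attributes"
)


def nlb_service_annotations(disable_client_ip_preservation, proxy_protocol):
    """Build annotations for a Service of type LoadBalancer backed by an NLB."""
    annotations = {"service.beta.kubernetes.io/aws-load-balancer-type": "nlb"}
    attrs = []
    if disable_client_ip_preservation:
        attrs.append("preserve_client_ip.enabled=false")
    if proxy_protocol:
        attrs.append("proxy_protocol_v2.enabled=true")
    if attrs:
        annotations[TARGET_GROUP_ATTRIBUTES] = ",".join(attrs)
    return annotations


def router_env(proxy_protocol):
    """Environment variables for the router deployment (assumed variable name)."""
    return {"ROUTER_USE_PROXY_PROTOCOL": "true"} if proxy_protocol else {}


if __name__ == "__main__":
    print(nlb_service_annotations(True, True))
    print(router_env(True))
```

      Both settings are derived from the same two flags because the target group attribute and the router setting must agree: if the NLB sends a Proxy Protocol header that the router does not expect (or the reverse), connections fail.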

       

      Requirements (aka. Acceptance Criteria):

      The cloud-provider-aws component must be changed to support disabling Client IP Preservation and enabling Proxy Protocol for NLBs.

      Deployment Scenarios: This feature is primarily for OpenShift Container Platform clusters installed in AWS, specifically those using Hosted Control Planes (HCP), such as ROSA HCP. This issue is particularly prevalent in HCP clusters due to the common co-location of workloads and ingress controllers on the same worker nodes.

      Use Cases (Optional):

      Include use case diagrams, main success scenarios, alternative flow scenarios.  Initial completion during Refinement status.

       

      Questions to Answer (Optional):

      Include a list of refinement / architectural questions that may need to be answered before coding can begin.  Initial completion during Refinement status.

      Which long-term strategic path will Red Hat commit to: enhancing the existing CCM with the necessary features, or migrating users to the AWS Load Balancer Operator (ALBO)? How should the migration challenges to ALBO be addressed? This can be a long-term plan.

      Out of Scope

      High-level list of items that are out of scope.  Initial completion during Refinement status.

       

      Background

      Provide any additional context needed to frame the feature.  Initial completion during Refinement status.

      AWS Network Load Balancers (NLBs) operate at Layer 4 (Transport Layer) and, by default, preserve the client's original IP address, making the NLB transparent. When Client IP Preservation is enabled, the NLB rewrites the destination IP of inbound packets to the server and then rewrites the source IP of response packets back to the client. However, this transparent behavior leads to the hairpinning issue: if a server calls an NLB and the traffic is routed back to the same server, the operating system sees a packet whose source address is its own, so the packet is dropped and the socket times out. This is a documented issue on AWS re:Post. In OpenShift/ROSA HCP clusters, the issue is exacerbated because ingress controllers (HAProxy pods) and application pods often share the same worker nodes.
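
      The condition described above can be illustrated with a toy model. This is a minimal sketch, not a simulation of actual kernel or NLB behavior; the Packet type, function names, and IP addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str


def deliver_via_nlb(client_ip, target_ip, nlb_ip, preserve_client_ip):
    """Return the packet as the target node sees it after the NLB forwards it."""
    # With preservation on, the source stays the client's address;
    # with it off, the target sees the NLB's own private address instead.
    src = client_ip if preserve_client_ip else nlb_ip
    return Packet(src_ip=src, dst_ip=target_ip)


def is_dropped_as_hairpin(packet):
    """Model: a node discards an inbound packet whose source is its own address."""
    return packet.src_ip == packet.dst_ip


if __name__ == "__main__":
    node, other_node, nlb = "10.0.1.5", "10.0.2.9", "10.0.0.20"
    # Same node is both caller and chosen target, preservation enabled.
    print(is_dropped_as_hairpin(deliver_via_nlb(node, node, nlb, True)))
    # Same node, preservation disabled.
    print(is_dropped_as_hairpin(deliver_via_nlb(node, node, nlb, False)))
    # Different target node, preservation enabled.
    print(is_dropped_as_hairpin(deliver_via_nlb(node, other_node, nlb, True)))
```

      In this model the drop only happens when the NLB selects the calling node as the target, which matches the intermittent nature of the timeouts: requests routed to a different node succeed.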

       

      Customer Considerations

      Provide any additional customer-specific considerations that must be made when designing and delivering the Feature.  Initial completion during Refinement status.

      The current workaround of isolating HAProxy ingress pods on dedicated machine pools is very costly for customers with many clusters and dramatically reduces the value proposition of HCP. This issue is expected to affect every HCP customer eventually.

      Documentation Considerations

      Provide information that needs to be considered and planned so that documentation will meet customer needs.  Initial completion during Refinement status.
      The documentation should clearly outline the implications of disabling Client IP Preservation, especially the need to configure Proxy Protocol v2 on backend services to maintain functionality such as access logs and session persistence.
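
      To show why backends need Proxy Protocol v2 support once Client IP Preservation is off, the sketch below recovers the original client address from a v2 header. It follows the binary layout in the PROXY protocol specification but handles only the TCP-over-IPv4 case and ignores TLV extensions; the function name is hypothetical, and this is not how the OpenShift router (HAProxy) implements it.

```python
import socket
import struct

# 12-byte signature that opens every PROXY protocol v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def parse_proxy_v2_ipv4(data):
    """Return (src_ip, src_port, dst_ip, dst_port, payload) for a TCP/IPv4 header."""
    if len(data) < 16 or data[:12] != PP2_SIGNATURE:
        raise ValueError("not a PROXY protocol v2 header")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd != 0x21:  # high nibble: version 2, low nibble: PROXY command
        raise ValueError("unsupported version or command")
    if fam_proto != 0x11:  # AF_INET (IPv4) + STREAM (TCP)
        raise ValueError("only TCP over IPv4 is handled in this sketch")
    if length < 12 or len(data) < 16 + length:
        raise ValueError("truncated address block")
    src, dst, sport, dport = struct.unpack("!4s4sHH", data[16:28])
    # Anything after the declared header length is application data.
    payload = data[16 + length:]
    return socket.inet_ntoa(src), sport, socket.inet_ntoa(dst), dport, payload


if __name__ == "__main__":
    header = (
        PP2_SIGNATURE
        + bytes([0x21, 0x11])
        + struct.pack("!H", 12)
        + socket.inet_aton("203.0.113.7")
        + socket.inet_aton("10.0.1.5")
        + struct.pack("!HH", 51000, 443)
    )
    print(parse_proxy_v2_ipv4(header + b"GET / HTTP/1.1\r\n"))
```

      Without this header, a backend behind an NLB with Client IP Preservation disabled sees only the load balancer's address as the peer, which is why access logs and source-IP-based session persistence depend on it.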

      Interoperability Considerations

      Which other projects and versions in our portfolio does this feature impact?  What interoperability test scenarios should be factored by the layered products?  Initial completion during Refinement status.

              rh-ee-smodeel Subin M
              rh-ee-smodeel Subin M
              None
              Marco Braga, Michael McCune, Miciah Masters
              Michael McCune
              Zhaohua Sun
              Jeana Routh
              Kyle Walker