OpenShift Bugs / OCPBUGS-9376

Ingress Controller should not add Local Zones subnets to network routers/LBs (Classic/NLB)


    • Because Outposts, Wavelength, and Local Zones are not currently supported by AWS NLB or CLB, it was not possible to create load-balancer-backed Services that include networks from such zones. To mitigate this issue, such networks are excluded when subnets are added to a Service-backing load balancer.
    • Bug Fix

      • *Description of problem:*

      The ingress controller tries to add all subnets in the VPC to the default router when OCP is installed into an existing VPC[1] that has subnets in Local Zones[2] (which support only application load balancers/ALB, not network load balancers[3]).

      The ingress cluster operator reports the following error during cluster installation:

      ~~~
      ingress False True True 92s The "default" ingress controller reports Available=False: IngressControllerUnavailable: One or more status conditions indicate unavailable: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: ValidationError: You cannot have any Local Zone subnets for load balancers of type 'classic'...
      ~~~

      I managed to work around it by tagging the Local Zone subnet with `kubernetes.io/cluster/unmanaged=true` so that the ingress controller ignores that subnet. The key suffix `unmanaged` can be anything other than the `InfraID`; when the tag key suffix is the `InfraID`, the sync still fails regardless of the value.
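      The workaround above can be applied with the AWS CLI; a minimal sketch, assuming a placeholder subnet ID (`subnet-0123456789abcdef0` is hypothetical):

      ~~~
      # Tag the Local Zone subnet so the ingress controller ignores it.
      # The key suffix ("unmanaged") only needs to differ from the cluster InfraID;
      # the subnet ID below is a placeholder.
      aws ec2 create-tags \
        --resources subnet-0123456789abcdef0 \
        --tags Key=kubernetes.io/cluster/unmanaged,Value=true
      ~~~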

      There is work in progress to create official support (product documentation + QE in progress) for installing into existing VPCs with Local Zone subnets[4], and then to implement full support in the installer[5][6]. The current issue seems to be a blocker for the full implementation, as the installer tags the subnets with the cluster tag `kubernetes.io/cluster/<infraID>=.*`.

      • *OpenShift release version:*

      All versions (tested in 4.10.18, 4.11.0-fc.0, 4.11.0-rc.1)

      • *How reproducible:*

      Always

      • *Steps to Reproduce (in detail):*

      1. Create the VPC
      2. Create the Local Zone subnet, setting the tag `kubernetes.io/cluster/unmanaged=true`
      3. Create the installer configuration, setting the subnets in the parent availability zones
      4. Create the manifests
      5. Create the Machine Sets for the machines located on the Local Zone subnet
      6. Create the cluster

      • *Actual results:*

      The installation fails because the ingress operator (and dependent operators) reports Degraded (cluster operator message above)

      • *Expected results:*

      The ingress controller should do one (or more) of the following:

      • Not auto-discover all subnets in the VPC when the subnets have been set in install-config.yaml;
      • Not auto-discover all subnets in the VPC when `kubernetes.io/role/elb=1` has been added to the public subnets;
      • Not try to add unsupported subnets (Local Zones, Wavelength) to the load balancer type (CLB/NLB) used by the ingress[7];
      • Have auto-discovery honor the tag `kubernetes.io/role/elb=0` when it is set on a public subnet, so we can specify which subnets should not be added to or used by the load balancer;
      • *Impact of the problem:*
      • Installations do not finish when using existing VPCs with subnets in Local Zones (without the workaround)
      • Blocks full support for Local Zones/Wavelength in the installer, since the cluster tag must be set on the subnet: `kubernetes.io/cluster/<infraID>=.*`
      • *Additional info:*

      [1] Install in existing VPC:
      https://docs.openshift.com/container-platform/4.10/installing/installing_aws/installing-aws-vpc.html

      [2] Local Zone documentation:
      https://aws.amazon.com/about-aws/global-infrastructure/localzones/

      [3] Local Zones limitations (LB):
      https://aws.amazon.com/about-aws/global-infrastructure/localzones/features/

      [4] Research and Day-0 support documentation to install OCP in existing VPC with Local Zones subnets:
      https://issues.redhat.com/browse/SPLAT-635

      [5] Epic to create Machine pools in Local Zones in existing VPC with Local Zones subnets:
      https://issues.redhat.com/browse/SPLAT-636

      [6] Epic to implement full support on installer to create subnets in Local Zones:
      https://issues.redhat.com/browse/SPLAT-657

      [7] The SDK provides a field indicating the zone type of each subnet. Since network load balancers (CLB/NLB) do not support Local Zones or Wavelength Zones, the controller should look at the `ZoneType` field and add only subnets in zones of type `availability-zone`, ignoring zones of type `wavelength-zone` and `local-zone`.
      ~~~
      $ aws ec2 describe-availability-zones --filters Name=region-name,Values=us-east-1 --all-availability-zones | jq -r '.AvailabilityZones[] | ( .ZoneName, .ZoneType)'
      us-east-1a
      availability-zone
      (...)
      us-east-1-bos-1a
      local-zone
      (...)
      us-east-1-wl1-atl-wlz-1
      wavelength-zone
      (...)
      ~~~
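      A `ZoneType` check of the kind described above could be sketched as follows. This is an illustrative sketch, not the actual service-controller code; the `filter_lb_subnets` helper and the input shape are assumptions for the example:

```python
# Illustrative sketch: keep only subnets whose zone type is supported by
# classic/network load balancers (CLB/NLB). Not the actual controller code.
SUPPORTED_ZONE_TYPES = {"availability-zone"}  # excludes local-zone, wavelength-zone

def filter_lb_subnets(subnets):
    """Return the IDs of subnets whose zone type CLB/NLB can use.

    subnets: list of dicts with "SubnetId" and "ZoneType" keys
    (ZoneType as returned by `aws ec2 describe-availability-zones`).
    """
    return [s["SubnetId"] for s in subnets
            if s.get("ZoneType") in SUPPORTED_ZONE_TYPES]

subnets = [
    {"SubnetId": "subnet-az", "ZoneType": "availability-zone"},
    {"SubnetId": "subnet-lz", "ZoneType": "local-zone"},
    {"SubnetId": "subnet-wlz", "ZoneType": "wavelength-zone"},
]
print(filter_lb_subnets(subnets))  # only the availability-zone subnet remains
```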

              dmoiseev Denis Moiseev (Inactive)
              rhn-support-mrbraga Marco Braga
              Zhaohua Sun Zhaohua Sun