OCPBUGS-43571

Internal LB in Azure IPI is not zone-redundant

      Cause: Internal load balancers on Azure located the IP address in only a single availability zone.

      Consequence: The load balancer was not zone-redundant. If that availability zone failed, the load balancer could become unavailable.

      Fix: Make the load balancer zone-redundant by locating the IP address in multiple zones.

      Result: The load balancer can withstand failures in one or more zones.
    • Bug Fix
    • In Progress

      Description of problem:

      A local Microsoft team is performing a resiliency analysis for a customer. They found that the internal API LB created by the Azure IPI installation is not zone-redundant.

      After a joint investigation with them, we found that the internal-lb-ip-v4 Terraform frontend_ip_configuration (https://github.com/openshift/installer/blob/release-4.16/data/data/azure/vnet/internal-lb.tf#L12-L32) does not include the zones list (https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/lb#zones).

      This issue does not affect the external API LB, because it is linked to an IP that is regional (zone-redundant). The internal LB does not attach its IP this way, so it is affected.

      If the internal LB is not zone-redundant and a zone has an outage, communication between internal components through this LB may become unresponsive.

      Version-Release number of selected component (if applicable):

          4.14 (checked also in 4.16)

      How reproducible:

          Always

      Steps to Reproduce:

          1. Perform an Azure IPI installation (IPv4 tested)

      Actual results:

      Zonal api-int LB

      Expected results:

      Zone-redundant api-int LB

      Additional info:

      Extracted from the Microsoft data on the LB:
      
          
      {
          "name": "internal-lb-ip-v4",
          "id": "[concat(resourceId('Microsoft.Network/loadBalancers', parameters('loadBalancers_clusterid_internal_name')), '/frontendIPConfigurations/internal-lb-ip-v4')]",
          "properties": {
              "privateIPAddress": "100.118.0.84",
              "privateIPAllocationMethod": "Dynamic",
              "subnet": {
                  "id": "[concat(parameters('virtualNetworks_sharedsvcs_vnt_xxx_externalid'), '/subnets/master')]"
              },
              "privateIPAddressVersion": "IPv4"
          }
      },
      {
          "name": "ab5369a91a7814ab8b5f757b3f9e877b",
          "id": "[concat(resourceId('Microsoft.Network/loadBalancers', parameters('loadBalancers_clusterid_internal_name')), '/frontendIPConfigurations/ab5369a91a7814ab8b5f757b3f9e877b')]",
          "properties": {
              "privateIPAddress": "100.118.2.7",
              "privateIPAllocationMethod": "Dynamic",
              "subnet": {
                  "id": "[concat(parameters('virtualNetworks_sharedsvcs_vnt_xxx_externalid'), '/subnets/worker')]"
              },
              "privateIPAddressVersion": "IPv4"
          },
          "zones": [
              "1",
              "2",
              "3"
          ]
      }
      
      As we can see, the internal LB frontend created at installation time does not have the "zones" parameter, but the other one, created by the cluster, does.
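
The same comparison can be automated when auditing more than one load balancer. The following is a minimal sketch, not part of the installer or any Azure tooling: the helper name `frontends_missing_zones` is hypothetical, the input is assumed to be the list of frontend IP configurations from an exported template like the one above (trimmed here), and the expected zones "1", "2", "3" are an assumption that only holds in regions offering three availability zones.

```python
import json


def frontends_missing_zones(frontend_ip_configs, expected_zones=("1", "2", "3")):
    """Return the names of frontend IP configurations that do not cover
    every expected availability zone (hypothetical helper for illustration)."""
    required = set(expected_zones)
    missing = []
    for config in frontend_ip_configs:
        # A frontend exported without a "zones" key is treated as covering no zones.
        zones = set(config.get("zones", []))
        if not required <= zones:
            missing.append(config["name"])
    return missing


# Trimmed copy of the two frontend IP configurations from the export above.
exported = json.loads("""
[
  {
    "name": "internal-lb-ip-v4",
    "properties": {"privateIPAddress": "100.118.0.84"}
  },
  {
    "name": "ab5369a91a7814ab8b5f757b3f9e877b",
    "properties": {"privateIPAddress": "100.118.2.7"},
    "zones": ["1", "2", "3"]
  }
]
""")

print(frontends_missing_zones(exported))  # ['internal-lb-ip-v4']
```

Only the installer-created frontend is reported, which matches the manual reading of the export.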

              jhixson_redhat John Hixson
              rgordill1@redhat.com Ramon Gordillo Gutierrez
              Jinyun Ma Jinyun Ma