OCPBUGS-37442

[CAPI Azure] Failed to create second cluster in shared vnet


    • Important
    • None
    • Installer (PB) Sprint 258, Installer (PB) Sprint 259
    • 2
    • Approved
    • False
    • When users install a second cluster into an existing vnet, the cluster install fails.

      When no front-end IP address is specified, CAPI pins the API server load balancer's front-end IP address to 10.0.0.100. Creating a second load balancer in the same subnet therefore fails, because that IP address is already taken by the first cluster.

      Added a dynamic IP check that tests whether the default value is available and picks the next available IP if it is in use.

      The second cluster now installs successfully with a different load balancer IP.
    • Bug Fix
    • In Progress

      Description of problem:

      Creating a second cluster in a shared vnet fails. The error below is thrown while the network infrastructure for the 2nd cluster is being created; the installer then times out and exits.
      ==============
      07-23 14:09:27.315  level=info msg=Waiting up to 15m0s (until 6:24AM UTC) for network infrastructure to become ready...
      ...
      07-23 14:16:14.900  level=debug msg=	failed to reconcile cluster services: failed to reconcile AzureCluster service loadbalancers: failed to create or update resource jima0723b-1-x6vpp-rg/jima0723b-1-x6vpp-internal (service: loadbalancers): PUT https://management.azure.com/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-1-x6vpp-rg/providers/Microsoft.Network/loadBalancers/jima0723b-1-x6vpp-internal
      07-23 14:16:14.900  level=debug msg=	--------------------------------------------------------------------------------
      07-23 14:16:14.901  level=debug msg=	RESPONSE 400: 400 Bad Request
      07-23 14:16:14.901  level=debug msg=	ERROR CODE: PrivateIPAddressIsAllocated
      07-23 14:16:14.901  level=debug msg=	--------------------------------------------------------------------------------
      07-23 14:16:14.901  level=debug msg=	{
      07-23 14:16:14.901  level=debug msg=	  "error": {
      07-23 14:16:14.901  level=debug msg=	    "code": "PrivateIPAddressIsAllocated",
      07-23 14:16:14.901  level=debug msg=	    "message": "IP configuration /subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-1-x6vpp-rg/providers/Microsoft.Network/loadBalancers/jima0723b-1-x6vpp-internal/frontendIPConfigurations/jima0723b-1-x6vpp-internal-frontEnd is using the private IP address 10.0.0.100 which is already allocated to resource /subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-49hnw-rg/providers/Microsoft.Network/loadBalancers/jima0723b-49hnw-internal/frontendIPConfigurations/jima0723b-49hnw-internal-frontEnd.",
      07-23 14:16:14.902  level=debug msg=	    "details": []
      07-23 14:16:14.902  level=debug msg=	  }
      07-23 14:16:14.902  level=debug msg=	}
      07-23 14:16:14.902  level=debug msg=	--------------------------------------------------------------------------------
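
The conflicting address and the resource that already holds it can be pulled out of the error message mechanically. The following is a minimal sketch, not part of the installer: `parse_allocated_error` and its regular expression are hypothetical and were derived only from the single message shown above, so other Azure error wordings may not match.

```python
import re

# Hypothetical pattern, based only on the one message quoted in this report.
ALLOCATED_RE = re.compile(
    r"is using the private IP address (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"which is already allocated to resource (?P<holder>\S+?)\.?$"
)


def parse_allocated_error(message):
    """Return (ip, resource group of the current holder), or None if no match."""
    m = ALLOCATED_RE.search(message)
    if m is None:
        return None
    # The holder is a full Azure resource ID; extract its resource group.
    rg = re.search(r"/resourceGroups/([^/]+)/", m.group("holder"))
    return m.group("ip"), rg.group(1) if rg else None


# Message text copied from the "message" field of the log above.
message = (
    "IP configuration /subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/"
    "jima0723b-1-x6vpp-rg/providers/Microsoft.Network/loadBalancers/jima0723b-1-x6vpp-internal/"
    "frontendIPConfigurations/jima0723b-1-x6vpp-internal-frontEnd is using the private IP address "
    "10.0.0.100 which is already allocated to resource /subscriptions/"
    "53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-49hnw-rg/providers/"
    "Microsoft.Network/loadBalancers/jima0723b-49hnw-internal/frontendIPConfigurations/"
    "jima0723b-49hnw-internal-frontEnd."
)

print(parse_allocated_error(message))
```

For this message the holder's resource group is the first cluster's (`jima0723b-49hnw-rg`), which matches the analysis further down in this report.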
      
      Install-config for 1st cluster:
      =========
      metadata:
        name: jima0723b
      platform:
        azure:
          region: eastus
          baseDomainResourceGroupName: os4-common
          networkResourceGroupName: jima0723b-rg
          virtualNetwork: jima0723b-vnet
          controlPlaneSubnet: jima0723b-master-subnet
          computeSubnet: jima0723b-worker-subnet
      publish: External
      
      Install-config for 2nd cluster:
      ========
      metadata:
        name: jima0723b-1
      platform:
        azure:
          region: eastus
          baseDomainResourceGroupName: os4-common
          networkResourceGroupName: jima0723b-rg
          virtualNetwork: jima0723b-vnet
          controlPlaneSubnet: jima0723b-master-subnet
          computeSubnet: jima0723b-worker-subnet
      publish: External
      
      shared master subnet/worker subnet:
      $ az network vnet subnet list -g jima0723b-rg --vnet-name jima0723b-vnet -otable
      AddressPrefix    Name                     PrivateEndpointNetworkPolicies    PrivateLinkServiceNetworkPolicies    ProvisioningState    ResourceGroup
      ---------------  -----------------------  --------------------------------  -----------------------------------  -------------------  ---------------
      10.0.0.0/24      jima0723b-master-subnet  Disabled                          Enabled                              Succeeded            jima0723b-rg
      10.0.1.0/24      jima0723b-worker-subnet  Disabled                          Enabled                              Succeeded            jima0723b-rg
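
As a quick sanity check, the default front-end address from the error (10.0.0.100) does fall inside the shared master subnet listed above. The sketch below uses only Python's standard `ipaddress` module; `azure_usable_hosts` is a hypothetical helper, and it assumes Azure's documented behaviour of reserving the first four addresses and the last address of every subnet.

```python
import ipaddress

# Values taken from the subnet listing and the error message in this report.
master_subnet = ipaddress.ip_network("10.0.0.0/24")
default_lb_ip = ipaddress.ip_address("10.0.0.100")


def azure_usable_hosts(subnet):
    """Addresses left after dropping Azure's reserved ones (assumption:
    first four and last address of the subnet are reserved)."""
    addrs = list(subnet)  # every address, including network and broadcast
    return addrs[4:-1]


usable = azure_usable_hosts(master_subnet)

# Both clusters place their internal load balancer in this subnet,
# so both compete for the same fixed address.
print(default_lb_ip in master_subnet)
print(usable[0], usable[-1], len(usable))
```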
      
      internal lb frontendIPConfigurations on 1st cluster:
      $ az network lb show -n jima0723b-49hnw-internal -g jima0723b-49hnw-rg --query 'frontendIPConfigurations'
      [
        {
          "etag": "W/\"7a7531ca-fb02-48d0-b9a6-d3fb49e1a416\"",
          "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-49hnw-rg/providers/Microsoft.Network/loadBalancers/jima0723b-49hnw-internal/frontendIPConfigurations/jima0723b-49hnw-internal-frontEnd",
          "inboundNatRules": [
            {
              "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-49hnw-rg/providers/Microsoft.Network/loadBalancers/jima0723b-49hnw-internal/inboundNatRules/jima0723b-49hnw-master-0",
              "resourceGroup": "jima0723b-49hnw-rg"
            },
            {
              "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-49hnw-rg/providers/Microsoft.Network/loadBalancers/jima0723b-49hnw-internal/inboundNatRules/jima0723b-49hnw-master-1",
              "resourceGroup": "jima0723b-49hnw-rg"
            },
            {
              "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-49hnw-rg/providers/Microsoft.Network/loadBalancers/jima0723b-49hnw-internal/inboundNatRules/jima0723b-49hnw-master-2",
              "resourceGroup": "jima0723b-49hnw-rg"
            }
          ],
          "loadBalancingRules": [
            {
              "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-49hnw-rg/providers/Microsoft.Network/loadBalancers/jima0723b-49hnw-internal/loadBalancingRules/LBRuleHTTPS",
              "resourceGroup": "jima0723b-49hnw-rg"
            },
            {
              "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-49hnw-rg/providers/Microsoft.Network/loadBalancers/jima0723b-49hnw-internal/loadBalancingRules/sint-v4",
              "resourceGroup": "jima0723b-49hnw-rg"
            }
          ],
          "name": "jima0723b-49hnw-internal-frontEnd",
          "privateIPAddress": "10.0.0.100",
          "privateIPAddressVersion": "IPv4",
          "privateIPAllocationMethod": "Static",
          "provisioningState": "Succeeded",
          "resourceGroup": "jima0723b-49hnw-rg",
          "subnet": {
            "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima0723b-rg/providers/Microsoft.Network/virtualNetworks/jima0723b-vnet/subnets/jima0723b-master-subnet",
            "resourceGroup": "jima0723b-rg"
          },
          "type": "Microsoft.Network/loadBalancers/frontendIPConfigurations"
        }
      ]
      
      From the above output, privateIPAllocationMethod is Static and privateIPAddress is always set to 10.0.0.100, which would explain why the 2nd cluster installation fails.
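
One way to avoid the collision is the approach named in this issue's release note: try the default address first and fall back to the next free one. The sketch below only illustrates that idea and is not the installer's code. `pick_frontend_ip` is a hypothetical function, the `in_use` list stands in for whatever availability lookup the real fix performs, and "next available" is assumed here to mean scanning upward from the default.

```python
import ipaddress

DEFAULT_LB_IP = "10.0.0.100"  # default front-end IP reported in this bug


def pick_frontend_ip(subnet_cidr, in_use, default=DEFAULT_LB_IP):
    """Return the default IP if it is free, else the next free address above it.

    in_use is an iterable of address strings already allocated in the subnet
    (hypothetical input; the real check would query Azure).
    """
    subnet = ipaddress.ip_network(subnet_cidr)
    taken = {ipaddress.ip_address(ip) for ip in in_use}
    candidate = ipaddress.ip_address(default)
    if candidate not in subnet:
        raise ValueError(f"{default} is not inside {subnet_cidr}")

    last_usable = subnet.broadcast_address - 1  # skip the broadcast address
    while candidate <= last_usable:
        if candidate not in taken:
            return str(candidate)
        candidate += 1
    raise RuntimeError(f"no free address at or above {default} in {subnet_cidr}")


# First cluster: nothing allocated yet, so the default is used.
print(pick_frontend_ip("10.0.0.0/24", []))
# Second cluster: the first cluster already holds the default.
print(pick_frontend_ip("10.0.0.0/24", ["10.0.0.100"]))
```

With the shared master subnet from this report, the first call keeps 10.0.0.100 and the second moves on to 10.0.0.101, so the two internal load balancers no longer compete for one address.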
      
      Checked the same on a cluster created with Terraform: there, privateIPAllocationMethod is Dynamic.
      ===============
      $ az network lb show -n wxjaz723-pm99k-internal -g wxjaz723-pm99k-rg --query 'frontendIPConfigurations'
      [
        {
          "etag": "W/\"e6bec037-843a-47ba-a725-3f322564be58\"",
          "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/wxjaz723-pm99k-rg/providers/Microsoft.Network/loadBalancers/wxjaz723-pm99k-internal/frontendIPConfigurations/internal-lb-ip-v4",
          "loadBalancingRules": [
            {
              "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/wxjaz723-pm99k-rg/providers/Microsoft.Network/loadBalancers/wxjaz723-pm99k-internal/loadBalancingRules/api-internal-v4",
              "resourceGroup": "wxjaz723-pm99k-rg"
            },
            {
              "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/wxjaz723-pm99k-rg/providers/Microsoft.Network/loadBalancers/wxjaz723-pm99k-internal/loadBalancingRules/sint-v4",
              "resourceGroup": "wxjaz723-pm99k-rg"
            }
          ],
          "name": "internal-lb-ip-v4",
          "privateIPAddress": "10.0.0.4",
          "privateIPAddressVersion": "IPv4",
          "privateIPAllocationMethod": "Dynamic",
          "provisioningState": "Succeeded",
          "resourceGroup": "wxjaz723-pm99k-rg",
          "subnet": {
            "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/wxjaz723-rg/providers/Microsoft.Network/virtualNetworks/wxjaz723-vnet/subnets/wxjaz723-master-subnet",
            "resourceGroup": "wxjaz723-rg"
          },
          "type": "Microsoft.Network/loadBalancers/frontendIPConfigurations"
        },
      ...
      ]
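
The comparison between the two outputs can be automated by scanning the `frontendIPConfigurations` JSON for statically allocated front ends. This is a minimal sketch: `static_frontends` is a hypothetical helper, and the sample input is trimmed to the three fields that matter, with values copied from the two `az network lb show` outputs above.

```python
import json


def static_frontends(az_json):
    """Return (name, privateIPAddress) for every front end with Static allocation."""
    configs = json.loads(az_json)
    return [
        (c["name"], c.get("privateIPAddress"))
        for c in configs
        if c.get("privateIPAllocationMethod") == "Static"
    ]


# Trimmed sample combining one entry from each output in this report.
sample = """
[
  {
    "name": "jima0723b-49hnw-internal-frontEnd",
    "privateIPAddress": "10.0.0.100",
    "privateIPAllocationMethod": "Static"
  },
  {
    "name": "internal-lb-ip-v4",
    "privateIPAddress": "10.0.0.4",
    "privateIPAllocationMethod": "Dynamic"
  }
]
"""

print(static_frontends(sample))
```

Only the CAPI-created front end is reported; the Terraform-created one uses Dynamic allocation and is skipped.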

      Version-Release number of selected component (if applicable):

        4.17 nightly build

      How reproducible:

        Always

      Steps to Reproduce:

          1. Create shared vnet / master subnet / worker subnet
          2. Create 1st cluster in shared vnet
          3. Create 2nd cluster in shared vnet
          

      Actual results:

          2nd cluster installation failed

      Expected results:

          Both clusters are installed successfully.

      Additional info:

          

       

              rna-afk Aditya Narayanaswamy
              jinyunma Jinyun Ma
              Jinyun Ma Jinyun Ma