
[Azure] Gate NAT gateway feature behind TechPreview


      Description of problem:

      NAT gateway is not yet a supported feature and the current implementation is a partial non-zonal solution.

      Version-Release number of selected component (if applicable):

      4.14

      How reproducible:

      always

      Steps to Reproduce:

      1. Set OutboundType = NatGateway
      2. Deploy cluster
      

      Actual results:

      Install successful

      Expected results:

      Install requires TechPreviewNoUpgrade before proceeding
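The expected gating can be pictured as a validation step. This is a minimal Python sketch, not the installer's actual Go code: the function name `validate_outbound_type`, its parameters, the "Unsupported value" message and the order of the checks are assumptions for illustration. The valid values and the two "Invalid value" messages are taken from the verification output quoted in the comments on this issue.

```python
from typing import List

TECH_PREVIEW = "TechPreviewNoUpgrade"
# Valid values as listed by `openshift-install explain` in the comments.
VALID_OUTBOUND_TYPES = {"", "Loadbalancer", "NatGateway", "UserDefinedRouting"}
FIELD = "platform.azure.outboundType"


def validate_outbound_type(
    outbound_type: str, feature_set: str, is_azure_stack: bool = False
) -> List[str]:
    """Return validation errors for the outbound type; empty means valid.

    Hypothetical helper: the real installer validation may be structured
    differently and may check these conditions in another order.
    """
    if outbound_type not in VALID_OUTBOUND_TYPES:
        # Message wording here is an assumption, not quoted from the installer.
        return [f'{FIELD}: Unsupported value: "{outbound_type}"']

    errors: List[str] = []
    if outbound_type == "NatGateway":
        if is_azure_stack:
            errors.append(
                f'{FIELD}: Invalid value: "NatGateway": '
                "Azure Stack does not support NAT routing currently"
            )
        elif feature_set != TECH_PREVIEW:
            errors.append(
                f'{FIELD}: Invalid value: "NatGateway": '
                "not supported in this feature set"
            )
    return errors
```

Under these assumptions, `validate_outbound_type("NatGateway", "")` returns one error, while passing `"TechPreviewNoUpgrade"` as the feature set returns an empty list.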

      Additional info:

       

            [OCPBUGS-17869] [Azure] Gate NAT gateway feature behind TechPreview

            OpenShift Jira Automation Bot added a comment - Per the announcement sent regarding the removal of "Blocker" as an option in the Priority field, this issue (which was already closed at the time of the bulk update) had Priority = "Blocker." It is being updated to Priority = Critical. No additional fields were changed.

            Errata Tool added a comment -

            Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

            For information on the advisory (Important: OpenShift Container Platform 4.14.0 bug fix and security update), and where to find the updated files, follow the link below.

            If the solution does not work for you, open a new bug report.
            https://access.redhat.com/errata/RHSA-2023:5006


            Jinyun Ma added a comment -

            Verified on 4.14.0-0.nightly-2023-09-02-132842: installation with the NatGateway outbound type succeeds on both Azure Public Cloud and Azure MAG. Moving the bug to VERIFIED.

            The NAT gateway (including its associated public IP) is created and attached to the master and worker subnets.

            # az network nat gateway list -g jimaaz01-lx5b4-rg
            [
              {
                "etag": "W/\"59890baa-f26e-47d7-b9ea-0fe5fd829d8a\"",
                "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jimaaz01-lx5b4-rg/providers/Microsoft.Network/natGateways/jimaaz01-lx5b4-natgw",
                "idleTimeoutInMinutes": 10,
                "location": "eastus",
                "name": "jimaaz01-lx5b4-natgw",
                "provisioningState": "Succeeded",
                "publicIpAddresses": [
                  {
                    "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jimaaz01-lx5b4-rg/providers/Microsoft.Network/publicIPAddresses/jimaaz01-lx5b4-natgw-pip-v4",
                    "resourceGroup": "jimaaz01-lx5b4-rg"
                  }
                ],
                "resourceGroup": "jimaaz01-lx5b4-rg",
                "resourceGuid": "98f56f1a-ed95-4023-b99c-df0d2f50b149",
                "sku": {
                  "name": "Standard"
                },
                "subnets": [
                  {
                    "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jimaaz01-lx5b4-rg/providers/Microsoft.Network/virtualNetworks/jimaaz01-lx5b4-vnet/subnets/jimaaz01-lx5b4-master-subnet",
                    "resourceGroup": "jimaaz01-lx5b4-rg"
                  },
                  {
                    "id": "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jimaaz01-lx5b4-rg/providers/Microsoft.Network/virtualNetworks/jimaaz01-lx5b4-vnet/subnets/jimaaz01-lx5b4-worker-subnet",
                    "resourceGroup": "jimaaz01-lx5b4-rg"
                  }
                ],
                "tags": {
                  "kubernetes.io_cluster.jimaaz01-lx5b4": "owned"
                },
                "type": "Microsoft.Network/natGateways"
              }
            ]
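The manual check above could be automated along these lines. This is a rough Python sketch that inspects JSON saved from `az network nat gateway list`; the helper name `check_nat_gateway` is hypothetical, and the `<infra-id>-natgw` and `<infra-id>-<role>-subnet` naming patterns are inferred from this single cluster's output, so they are assumptions rather than a documented contract. The sample data is a trimmed copy of the output above with the resource IDs shortened.

```python
import json
from typing import Any, Dict, List

# Trimmed sample; resource IDs are shortened to their final path segments.
SAMPLE = """
[
  {
    "name": "jimaaz01-lx5b4-natgw",
    "provisioningState": "Succeeded",
    "publicIpAddresses": [
      {"id": "publicIPAddresses/jimaaz01-lx5b4-natgw-pip-v4"}
    ],
    "subnets": [
      {"id": "subnets/jimaaz01-lx5b4-master-subnet"},
      {"id": "subnets/jimaaz01-lx5b4-worker-subnet"}
    ]
  }
]
"""


def check_nat_gateway(gateways: List[Dict[str, Any]], infra_id: str) -> List[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    name = f"{infra_id}-natgw"
    matches = [gw for gw in gateways if gw.get("name") == name]
    if not matches:
        return [f"NAT gateway {name} not found"]

    gw = matches[0]
    problems: List[str] = []
    if gw.get("provisioningState") != "Succeeded":
        problems.append(f"provisioningState is {gw.get('provisioningState')!r}")
    if not gw.get("publicIpAddresses"):
        problems.append("no public IP address attached")

    # Subnet IDs end in ".../subnets/<subnet name>", so keep the last segment.
    attached = {s["id"].rsplit("/", 1)[-1] for s in gw.get("subnets") or []}
    for role in ("master", "worker"):
        subnet = f"{infra_id}-{role}-subnet"
        if subnet not in attached:
            problems.append(f"subnet {subnet} is not attached")
    return problems


if __name__ == "__main__":
    print(check_nat_gateway(json.loads(SAMPLE), "jimaaz01-lx5b4"))
```

In practice the JSON would come from a file or a pipe rather than the embedded sample.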
            


            Rafael Fonseca dos Santos added a comment - https://github.com/openshift/installer/pull/7455 should fix the issue. I see the NAT being created in a local install.

            Rafael Fonseca dos Santos added a comment - jinyunma, nice catch! The issue was introduced by https://github.com/openshift/installer/pull/7312. I guess that during a rebase, the `OutboundType` variable that is piped through to Terraform was removed: https://github.com/openshift/installer/pull/7312/files#diff-2674c05dd65332988a049ec9d2e0fbd7c0ab5837902c0695c6f507fc3629c1d9L59
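A toy model of the suspected failure mode may help. This is only a Python illustration of how a dropped variable can silently fall back to a default: the key names `azure_region` and `azure_outbound_type`, both helper functions, and the assumption that the Terraform side defaults to "Loadbalancer" are hypothetical and are not taken from the installer source.

```python
from typing import Dict

# Assumed default, chosen to match the documented default for outboundType.
DEFAULT_OUTBOUND_TYPE = "Loadbalancer"


def build_tfvars(outbound_type: str, pass_outbound_type: bool) -> Dict[str, str]:
    """Build the variables handed to Terraform (hypothetical key names)."""
    tfvars = {"azure_region": "eastus"}
    if pass_outbound_type:
        tfvars["azure_outbound_type"] = outbound_type
    return tfvars


def effective_outbound_type(tfvars: Dict[str, str]) -> str:
    """Mimic a variable with a default: a missing key yields the default."""
    return tfvars.get("azure_outbound_type", DEFAULT_OUTBOUND_TYPE)


if __name__ == "__main__":
    # Variable dropped: the requested NatGateway value is never seen.
    print(effective_outbound_type(build_tfvars("NatGateway", False)))
    # Variable passed through: the requested value takes effect.
    print(effective_outbound_type(build_tfvars("NatGateway", True)))
```

If the real code behaves like this model, the install would still succeed, which matches the observation that no error was raised while no NAT gateway was created.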

            Jinyun Ma added a comment -

            Verified on 4.14.0-0.nightly-2023-08-28-154013

            1. installer explain doc

            $ ./openshift-install explain installconfig.platform.azure.outboundType
            KIND:     InstallConfig
            VERSION:  v1

            RESOURCE: <string>
              Default: "Loadbalancer"
              Valid Values: "","Loadbalancer","NatGateway","UserDefinedRouting"
              OutboundType is a strategy for how egress from cluster is achieved. When not specified default is "Loadbalancer". "NatGateway" is only available in TechPreview. 

            2. set outboundType to NatGateway

            $ ./openshift-install create manifests --dir ipi-public/
            ERROR failed to fetch Master Machines: failed to load asset "Install Config": failed to create install config: invalid "install-config.yaml" file: platform.azure.outboundType: Invalid value: "NatGateway": not supported in this feature set  

            3. create manifests with ASH install-config file and set outboundType to NatGateway

            $ ./openshift-install create manifests --dir ipi-wwt/
            ERROR failed to fetch Master Machines: failed to load asset "Install Config": failed to create install config: invalid "install-config.yaml" file: platform.azure.outboundType: Invalid value: "NatGateway": Azure Stack does not support NAT routing currently  

            4. Installed clusters on Azure public cloud and MAG with NatGateway as the outboundType. Both installs succeed, but the NAT gateway resources (including the public IP) are not created. According to .openshift-install.log, the vnet Terraform output reports outbound_type = "Loadbalancer" instead of "NatGateway".

            install-config:
            ---------------------
            platform:
              azure:
                region: eastus
                baseDomainResourceGroupName: os4-common
                outboundType: NatGateway 
            
            .openshift-install.log:
            --------------------------
            ...
            time="2023-08-30T00:57:35Z" level=debug msg="[INFO] running Terraform command: /home/jenkins/ws/workspace/ocp-common/Flexy-install/flexy/workdir/install-dir/terraform/bin/terraform apply -no-color -auto-approve -input=false -var-file=/tmp/openshift-install-vnet-118144905/terraform.tfvars.json -var-file=/tmp/openshift-install-vnet-118144905/terraform.platform.auto.tfvars.json -lock=true -parallelism=10 -refresh=true"
            ...
            time="2023-08-30T01:09:58Z" level=debug msg="Outputs:"
            time="2023-08-30T01:09:58Z" level=debug
            time="2023-08-30T01:09:58Z" level=debug msg="elb_backend_pool_v4_id = \"/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima30az1-d5c67-rg/providers/Microsoft.Network/loadBalancers/jima30az1-d5c67/backendAddressPools/jima30az1-d5c67\""
            time="2023-08-30T01:09:58Z" level=debug msg="identity = \"/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima30az1-d5c67-rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/jima30az1-d5c67-identity\""
            time="2023-08-30T01:09:58Z" level=debug msg="ilb_backend_pool_v4_id = \"/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima30az1-d5c67-rg/providers/Microsoft.Network/loadBalancers/jima30az1-d5c67-internal/backendAddressPools/jima30az1-d5c67\""
            time="2023-08-30T01:09:58Z" level=debug msg="internal_lb_ip_v4_address = \"10.0.0.4\""
            time="2023-08-30T01:09:58Z" level=debug msg="master_subnet_id = \"/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima30az1-d5c67-rg/providers/Microsoft.Network/virtualNetworks/jima30az1-d5c67-vnet/subnets/jima30az1-d5c67-master-subnet\""
            time="2023-08-30T01:09:58Z" level=debug msg="nsg_name = \"jima30az1-d5c67-nsg\""
            time="2023-08-30T01:09:58Z" level=debug msg="outbound_type = \"Loadbalancer\""
            time="2023-08-30T01:09:58Z" level=debug msg="public_lb_pip_v4_fqdn = \"jima30az1-d5c67.eastus.cloudapp.azure.com\""
            time="2023-08-30T01:09:58Z" level=debug msg="resource_group_name = \"jima30az1-d5c67-rg\""
            time="2023-08-30T01:09:58Z" level=debug msg="storage_account_name = \"clusterjthjg\""
            time="2023-08-30T01:09:58Z" level=debug msg="subnet_id = \"/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima30az1-d5c67-rg/providers/Microsoft.Network/virtualNetworks/jima30az1-d5c67-vnet/subnets/jima30az1-d5c67-master-subnet\""
            time="2023-08-30T01:09:58Z" level=debug msg="virtual_network_id = \"/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima30az1-d5c67-rg/providers/Microsoft.Network/virtualNetworks/jima30az1-d5c67-vnet\""
            time="2023-08-30T01:09:58Z" level=debug msg="vm_image = \"/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima30az1-d5c67-rg/providers/Microsoft.Compute/galleries/gallery_jima30az1_d5c67/images/jima30az1-d5c67-gen2/versions/414.92.20230803\""
            time="2023-08-30T01:09:58Z" level=debug msg="worker_subnet_id = \"/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima30az1-d5c67-rg/providers/Microsoft.Network/virtualNetworks/jima30az1-d5c67-vnet/subnets/jima30az1-d5c67-worker-subnet\""
            ...

            rdossant, could you help check the issue in item 4?
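The mismatch described in item 4 can be detected mechanically. This is a small Python sketch that scans `.openshift-install.log` text for the `outbound_type` Terraform output and compares it with the value requested in install-config. The function names are hypothetical, and the regular expression assumes the exact line format shown in the log excerpt above.

```python
import re
from typing import Optional

# Matches lines such as: msg="outbound_type = \"Loadbalancer\""
OUTBOUND_RE = re.compile(r'msg="outbound_type = \\"([^"\\]+)\\""')

# Two lines copied from the log excerpt in this comment.
SAMPLE_LOG = r'''
time="2023-08-30T01:09:58Z" level=debug msg="nsg_name = \"jima30az1-d5c67-nsg\""
time="2023-08-30T01:09:58Z" level=debug msg="outbound_type = \"Loadbalancer\""
'''


def terraform_outbound_type(log_text: str) -> Optional[str]:
    """Return the last outbound_type value found in the log, or None."""
    found = OUTBOUND_RE.findall(log_text)
    return found[-1] if found else None


def outbound_type_mismatch(requested: str, log_text: str) -> bool:
    """True when Terraform reported an outbound type other than the requested one."""
    applied = terraform_outbound_type(log_text)
    return applied is not None and applied != requested


if __name__ == "__main__":
    print(terraform_outbound_type(SAMPLE_LOG))
    print(outbound_type_mismatch("NatGateway", SAMPLE_LOG))
```

The check returns no mismatch when the log contains no `outbound_type` line at all, so a missing output is treated as "unknown" rather than as a failure.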

