  OpenShift Bugs / OCPBUGS-25193

[azure] permissions required on customer vnet when installing private cluster by using workload identity


Details

    • Important
    • No
    • Rejected
    • True
    • None
    • *Cause*: Some required permissions were missing when installing an Azure private cluster by using Managed Identity.
      *Consequence*: Provisioning an Azure private cluster by using Managed Identity failed.
      *Fix*: The missing permissions were added.
      *Result*: Azure private clusters can be installed with Managed Identity.
    • Bug Fix
    • In Progress

    Description

      Description of problem:

      Installing a private cluster by using Azure workload identity failed because no worker machines were provisioned.
      
      install-config:
      ----------------------
      platform:
        azure:
          region: eastus
          networkResourceGroupName: jima971b-12015319-rg
          virtualNetwork: jima971b-vnet
          controlPlaneSubnet: jima971b-master-subnet
          computeSubnet: jima971b-worker-subnet
          resourceGroupName: jima971b-rg
      publish: Internal
      credentialsMode: Manual
      
      A detailed check of the cluster found that the machine-api, ingress, and image-registry operators reported permission errors and had no access to the customer vnet.
      
      $ oc get machine -n openshift-machine-api
      NAME                                  PHASE     TYPE              REGION   ZONE   AGE
      jima971b-qqjb7-master-0               Running   Standard_D8s_v3   eastus   2      5h14m
      jima971b-qqjb7-master-1               Running   Standard_D8s_v3   eastus   3      5h14m
      jima971b-qqjb7-master-2               Running   Standard_D8s_v3   eastus   1      5h15m
      jima971b-qqjb7-worker-eastus1-mtc47   Failed                                      4h52m
      jima971b-qqjb7-worker-eastus2-ph8bk   Failed                                      4h52m
      jima971b-qqjb7-worker-eastus3-hpmvj   Failed                                      4h52m
      
      Errors on worker machine:
      --------------------
        errorMessage: 'failed to reconcile machine "jima971b-qqjb7-worker-eastus1-mtc47":
          network.SubnetsClient#Get: Failure responding to request: StatusCode=403 -- Original
          Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed"
          Message="The client ''705eb743-7c91-4a16-a7cf-97164edc0341'' with object id ''705eb743-7c91-4a16-a7cf-97164edc0341''
          does not have authorization to perform action ''Microsoft.Network/virtualNetworks/subnets/read''
          over scope ''/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima971b-12015319-rg/providers/Microsoft.Network/virtualNetworks/jima971b-vnet/subnets/jima971b-worker-subnet''
          or the scope is invalid. If access was recently granted, please refresh your credentials."'
        errorReason: InvalidConfiguration
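
      To see at a glance which action and scope were denied, the AuthorizationFailed message above can be picked apart with a regular expression. This is a minimal sketch, not part of any OpenShift or Azure tooling: the function name and pattern are illustrative, and it assumes the YAML line folding and doubled single quotes have already been undone.

```python
import re

# errorMessage text from the failed Machine, with YAML folding and
# doubled single quotes already undone (assumption for this sketch).
message = (
    "The client '705eb743-7c91-4a16-a7cf-97164edc0341' with object id "
    "'705eb743-7c91-4a16-a7cf-97164edc0341' does not have authorization to "
    "perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope "
    "'/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/"
    "jima971b-12015319-rg/providers/Microsoft.Network/virtualNetworks/"
    "jima971b-vnet/subnets/jima971b-worker-subnet' or the scope is invalid."
)

# Illustrative pattern: captures the identity, the denied action, and the scope.
AUTH_FAILED = re.compile(
    r"object id '(?P<object_id>[^']+)'.*?"
    r"perform action '(?P<action>[^']+)'.*?"
    r"over scope '(?P<scope>[^']+)'",
    re.DOTALL,
)


def parse_authorization_failed(text):
    """Return object id, denied action, and scope, or None if no match."""
    match = AUTH_FAILED.search(text)
    return match.groupdict() if match else None


details = parse_authorization_failed(message)
resource_group = details["scope"].split("/resourceGroups/")[1].split("/")[0]
print(details["action"])
print(resource_group)
```

      Here the resource group in the denied scope is the networkResourceGroupName from the install-config, not the cluster resource group, which is why the identity has no access by default.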
      
      After manually creating a custom role with the missing permissions for machine-api/ingress/cloud-controller-manager/image-registry, and assigning it to the machine-api/ingress/cloud-controller-manager/image-registry user-assigned identities at the scope of the customer vnet, the cluster recovered and became healthy.
      
      Permissions for machine-api/cloud-controller-manager/ingress on customer vnet:
      "Microsoft.Network/virtualNetworks/subnets/read",
      "Microsoft.Network/virtualNetworks/subnets/join/action"
      
      Permissions for image-registry on customer vnet:
      "Microsoft.Network/virtualNetworks/subnets/read",
      "Microsoft.Network/virtualNetworks/subnets/join/action",
      "Microsoft.Network/virtualNetworks/join/action"
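
      The permission lists above can be turned into custom role definitions for the manual workaround. The sketch below only builds the JSON documents. The role names are hypothetical, the subscription ID is copied from the error message, and using the network resource group as the assignable scope is an assumption; the report itself assigned the role at the scope of the customer vnet, which is computed separately here.

```python
import json

# Values taken from the install-config and the error message in this report.
SUBSCRIPTION_ID = "53b8f551-f0fc-4bea-8cba-6d1fefd54c8a"
NETWORK_RESOURCE_GROUP = "jima971b-12015319-rg"
VIRTUAL_NETWORK = "jima971b-vnet"

# Actions per component, as listed in the report.
SUBNET_ACTIONS = [
    "Microsoft.Network/virtualNetworks/subnets/read",
    "Microsoft.Network/virtualNetworks/subnets/join/action",
]
ACTIONS_BY_COMPONENT = {
    "machine-api": SUBNET_ACTIONS,
    "cloud-controller-manager": SUBNET_ACTIONS,
    "ingress": SUBNET_ACTIONS,
    "image-registry": SUBNET_ACTIONS
    + ["Microsoft.Network/virtualNetworks/join/action"],
}


def resource_group_scope(subscription_id, resource_group):
    """Resource ID of a resource group."""
    return f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"


def vnet_scope(subscription_id, resource_group, vnet):
    """Resource ID of the customer vnet, used as the role assignment scope."""
    return (
        resource_group_scope(subscription_id, resource_group)
        + f"/providers/Microsoft.Network/virtualNetworks/{vnet}"
    )


def role_definition(component, assignable_scope):
    """Build a custom role definition document for one component."""
    return {
        "Name": f"example-{component}-vnet-access",  # hypothetical name
        "IsCustom": True,
        "Description": f"Customer vnet access for {component}",
        "Actions": ACTIONS_BY_COMPONENT[component],
        "NotActions": [],
        "AssignableScopes": [assignable_scope],  # assumption: resource group
    }


scope = vnet_scope(SUBSCRIPTION_ID, NETWORK_RESOURCE_GROUP, VIRTUAL_NETWORK)
definition = role_definition(
    "image-registry",
    resource_group_scope(SUBSCRIPTION_ID, NETWORK_RESOURCE_GROUP),
)
print(scope)
print(json.dumps(definition, indent=2))
```

      Each resulting role would then be assigned to the matching user-assigned identity at the printed vnet scope.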

      Version-Release number of selected component (if applicable):

          4.15 nightly build

      How reproducible:

          always on recent 4.15 payload

      Steps to Reproduce:

          1. Prepare an install-config with a private cluster configuration and credentialsMode: Manual.
          2. Use the ccoctl tool to create the workload identity resources.
          3. Install the cluster.
          

      Actual results:

          Installation failed due to permission issues

      Expected results:

          ccoctl should also assign a custom role to the machine-api/ccm/image-registry user-assigned identities at the scope of the customer vnet when a customer vnet is configured in install-config.
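
          The condition in the expected result can be sketched as a small check on the parsed install-config. This is an illustration only and does not reflect how ccoctl is implemented: the function name is hypothetical, the install-config is assumed to be already loaded into a dict, and the component list is passed in by the caller.

```python
def vnet_role_assignments(install_config, subscription_id, components):
    """Return (component, scope) pairs that need a role on the customer vnet.

    Returns an empty list when no customer vnet is configured, in which
    case no extra assignments would be needed.
    """
    azure = install_config.get("platform", {}).get("azure", {})
    vnet = azure.get("virtualNetwork")
    network_rg = azure.get("networkResourceGroupName")
    if not vnet or not network_rg:
        return []
    scope = (
        f"/subscriptions/{subscription_id}/resourceGroups/{network_rg}"
        f"/providers/Microsoft.Network/virtualNetworks/{vnet}"
    )
    return [(component, scope) for component in components]


# install-config fields from this report, already parsed into a dict.
install_config = {
    "platform": {
        "azure": {
            "region": "eastus",
            "networkResourceGroupName": "jima971b-12015319-rg",
            "virtualNetwork": "jima971b-vnet",
        }
    },
    "publish": "Internal",
    "credentialsMode": "Manual",
}

assignments = vnet_role_assignments(
    install_config,
    "53b8f551-f0fc-4bea-8cba-6d1fefd54c8a",
    ["machine-api", "cloud-controller-manager", "ingress", "image-registry"],
)
for component, scope in assignments:
    print(component, scope)
```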

      Additional info:

      The issue is only seen on 4.15; the same installation works on 4.14.

            People

              rh-ee-mold Mark Old
              jinyunma Jinyun Ma
              Mingxia Huang Mingxia Huang