OCPBUGS-28762

[azure] permissions required on customer vnet when installing private cluster by using workload identity


    • Important
    • No
    • Rejected
    • False



      This is a clone of issue OCPBUGS-25193. The following is the description of the original issue:

      Description of problem:

      Installing a private cluster with Azure workload identity failed because no worker machines were provisioned. The install-config contained:
          region: eastus
          networkResourceGroupName: jima971b-12015319-rg
          virtualNetwork: jima971b-vnet
          controlPlaneSubnet: jima971b-master-subnet
          computeSubnet: jima971b-worker-subnet
          resourceGroupName: jima971b-rg
      publish: Internal
      credentialsMode: Manual
      A detailed check of the cluster found that the machine-api, ingress, and image-registry operators reported permission errors because they have no access to the customer vnet.
      $ oc get machine -n openshift-machine-api
      NAME                                  PHASE     TYPE              REGION   ZONE   AGE
      jima971b-qqjb7-master-0               Running   Standard_D8s_v3   eastus   2      5h14m
      jima971b-qqjb7-master-1               Running   Standard_D8s_v3   eastus   3      5h14m
      jima971b-qqjb7-master-2               Running   Standard_D8s_v3   eastus   1      5h15m
      jima971b-qqjb7-worker-eastus1-mtc47   Failed                                      4h52m
      jima971b-qqjb7-worker-eastus2-ph8bk   Failed                                      4h52m
      jima971b-qqjb7-worker-eastus3-hpmvj   Failed                                      4h52m
      Errors on worker machine:
        errorMessage: 'failed to reconcile machine "jima971b-qqjb7-worker-eastus1-mtc47":
          network.SubnetsClient#Get: Failure responding to request: StatusCode=403 -- Original
          Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed"
          Message="The client ''705eb743-7c91-4a16-a7cf-97164edc0341'' with object id ''705eb743-7c91-4a16-a7cf-97164edc0341''
          does not have authorization to perform action ''Microsoft.Network/virtualNetworks/subnets/read''
          over scope ''/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima971b-12015319-rg/providers/Microsoft.Network/virtualNetworks/jima971b-vnet/subnets/jima971b-worker-subnet''
          or the scope is invalid. If access was recently granted, please refresh your credentials."'
        errorReason: InvalidConfiguration
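
      When triaging several failed machines, it can help to pull the denied action and scope out of each message. The following is a minimal sketch, assuming the message keeps the wording quoted above; the function name and the regular expression are illustrative and are not part of any OpenShift or Azure tooling.

```python
import re

# Matches the Azure "AuthorizationFailed" wording quoted in this report.
# Quotes may be doubled ('') because the message is embedded in a YAML string.
_AUTHZ_RE = re.compile(
    r"client '+(?P<client>[^']+)'+.*?"
    r"perform action '+(?P<action>[^']+)'+\s+"
    r"over scope '+(?P<scope>[^']+)'+"
)

SAMPLE_MESSAGE = (
    "The client ''705eb743-7c91-4a16-a7cf-97164edc0341'' with object id "
    "''705eb743-7c91-4a16-a7cf-97164edc0341''\n"
    "does not have authorization to perform action "
    "''Microsoft.Network/virtualNetworks/subnets/read''\n"
    "over scope ''/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a"
    "/resourceGroups/jima971b-12015319-rg/providers/Microsoft.Network"
    "/virtualNetworks/jima971b-vnet/subnets/jima971b-worker-subnet''\n"
    "or the scope is invalid."
)


def parse_authorization_failed(message):
    """Return a dict with client, action and scope, or None if not matched."""
    # Collapse the line wrapping introduced by the YAML output.
    flat = " ".join(message.split())
    match = _AUTHZ_RE.search(flat)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_authorization_failed(SAMPLE_MESSAGE))
```

      The scope returned for the message above points at the worker subnet inside the customer vnet, which is consistent with the operators lacking access to that vnet.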
      After manually creating a custom role with the missing permissions for machine-api/ingress/cloud-controller-manager/image-registry and assigning it to the corresponding user-assigned identities at the scope of the customer vnet, the cluster recovered and reached a running state.
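
      The workaround above can be sketched as a role definition document. This is an illustration only: the role name is hypothetical, the only action listed is the one that appears in the error message (the full permission lists are not included in this report), and making the role assignable at the network resource group is an assumption. The field names follow the JSON shape that Azure custom role definitions use, as far as this sketch is aware.

```python
import json

# Subnet resource ID taken from the error message in this report.
SUBNET_ID = (
    "/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a"
    "/resourceGroups/jima971b-12015319-rg/providers/Microsoft.Network"
    "/virtualNetworks/jima971b-vnet/subnets/jima971b-worker-subnet"
)


def scopes_from_subnet_id(subnet_id):
    """Derive the vnet scope and resource group scope from a subnet ID."""
    vnet_scope = subnet_id.partition("/subnets/")[0]
    resource_group_scope = subnet_id.partition("/providers/")[0]
    return vnet_scope, resource_group_scope


def build_custom_role(name, actions, assignable_scope):
    """Build a custom role definition as a plain dict."""
    return {
        "Name": name,  # hypothetical name, chosen by the caller
        "IsCustom": True,
        "Description": "Access to a customer-provided virtual network",
        "Actions": sorted(set(actions)),
        "NotActions": [],
        "AssignableScopes": [assignable_scope],
    }


if __name__ == "__main__":
    vnet_scope, rg_scope = scopes_from_subnet_id(SUBNET_ID)
    role = build_custom_role(
        "example-customer-vnet-access",  # hypothetical
        ["Microsoft.Network/virtualNetworks/subnets/read"],  # from the error only
        rg_scope,  # assumption: assignable at the network resource group
    )
    print(json.dumps(role, indent=2))
    print("assign at scope:", vnet_scope)
```

      The role would then be assigned to each affected user-assigned identity at the vnet scope, as described in the workaround.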
      Permissions for machine-api/cloud-controller-manager/ingress on customer vnet:
      Permissions for image-registry on customer vnet:

      Version-Release number of selected component (if applicable):

          4.15 nightly build

      How reproducible:

          Always, on recent 4.15 payloads

      Steps to Reproduce:

          1. Prepare an install-config with private cluster configuration and credentialsMode: Manual.
          2. Use the ccoctl tool to create the workload identity.
          3. Install the cluster.

      Actual results:

          Installation failed due to permission issues

      Expected results:

          ccoctl should also assign a custom role to the machine-api/ccm/image-registry user-assigned identities at the scope of the customer vnet when one is configured in install-config.
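
          The "if it is configured in install-config" condition could be checked along these lines. This is a minimal sketch, not ccoctl's actual logic: the function name is hypothetical, the install-config is assumed to be already loaded into a dict, and the subscription ID is passed in separately because it does not appear in the install-config excerpt above.

```python
def customer_vnet_scope(install_config, subscription_id):
    """Return the resource ID of a pre-existing vnet, or None if none is set."""
    azure = install_config.get("platform", {}).get("azure", {})
    vnet = azure.get("virtualNetwork")
    network_rg = azure.get("networkResourceGroupName")
    if not vnet or not network_rg:
        # No customer vnet configured, so no extra role assignment is needed.
        return None
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{network_rg}"
        f"/providers/Microsoft.Network/virtualNetworks/{vnet}"
    )


if __name__ == "__main__":
    # Values taken from the install-config excerpt and error message above.
    config = {
        "platform": {
            "azure": {
                "region": "eastus",
                "networkResourceGroupName": "jima971b-12015319-rg",
                "virtualNetwork": "jima971b-vnet",
                "controlPlaneSubnet": "jima971b-master-subnet",
                "computeSubnet": "jima971b-worker-subnet",
                "resourceGroupName": "jima971b-rg",
            }
        },
        "publish": "Internal",
        "credentialsMode": "Manual",
    }
    print(customer_vnet_scope(config, "53b8f551-f0fc-4bea-8cba-6d1fefd54c8a"))
```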

      Additional info:

      The issue is only seen on 4.15; the same installation works on 4.14.

            rh-ee-mold Mark Old
            openshift-crt-jira-prow OpenShift Prow Bot
            Mingxia Huang Mingxia Huang