OpenShift Bugs / OCPBUGS-28762

[azure] permissions required on customer vnet when installing private cluster by using workload identity

    • Important
    • No
    • Rejected
    • False

      * Previously, the Cloud Credential Operator was missing some permissions required to create a private cluster on {azure-first}.
      These missing permissions prevented installation of an {azure-short} private cluster using {entra-first}.
      This release includes the missing permissions and enables installation of an {azure-short} private cluster using {entra-short}.
      (link:https://issues.redhat.com/browse/OCPBUGS-25193[*OCPBUGS-25193*])
    • Bug Fix
    • In Progress

      This is a clone of issue OCPBUGS-25193. The following is the description of the original issue:

      Description of problem:

      Installing a private cluster by using Azure workload identity failed because no worker machines were provisioned.
      
      install-config:
      ----------------------
      platform:
        azure:
          region: eastus
          networkResourceGroupName: jima971b-12015319-rg
          virtualNetwork: jima971b-vnet
          controlPlaneSubnet: jima971b-master-subnet
          computeSubnet: jima971b-worker-subnet
          resourceGroupName: jima971b-rg
      publish: Internal
      credentialsMode: Manual
      
      A closer check of the cluster showed that the machine-api, ingress, and image-registry operators reported permission errors and had no access to the customer vnet.
      
      $ oc get machine -n openshift-machine-api
      NAME                                  PHASE     TYPE              REGION   ZONE   AGE
      jima971b-qqjb7-master-0               Running   Standard_D8s_v3   eastus   2      5h14m
      jima971b-qqjb7-master-1               Running   Standard_D8s_v3   eastus   3      5h14m
      jima971b-qqjb7-master-2               Running   Standard_D8s_v3   eastus   1      5h15m
      jima971b-qqjb7-worker-eastus1-mtc47   Failed                                      4h52m
      jima971b-qqjb7-worker-eastus2-ph8bk   Failed                                      4h52m
      jima971b-qqjb7-worker-eastus3-hpmvj   Failed                                      4h52m
      
      Errors on worker machine:
      --------------------
        errorMessage: 'failed to reconcile machine "jima971b-qqjb7-worker-eastus1-mtc47":
          network.SubnetsClient#Get: Failure responding to request: StatusCode=403 -- Original
          Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed"
          Message="The client ''705eb743-7c91-4a16-a7cf-97164edc0341'' with object id ''705eb743-7c91-4a16-a7cf-97164edc0341''
          does not have authorization to perform action ''Microsoft.Network/virtualNetworks/subnets/read''
          over scope ''/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima971b-12015319-rg/providers/Microsoft.Network/virtualNetworks/jima971b-vnet/subnets/jima971b-worker-subnet''
          or the scope is invalid. If access was recently granted, please refresh your credentials."'
        errorReason: InvalidConfiguration
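
When triaging errors like the one above, it can help to pull the denied action and scope out of the message programmatically. The following is a minimal sketch, not part of any OpenShift or Azure tooling: the function name `parse_authorization_failure` and the regular expression are assumptions based only on the message text quoted in this report.

```python
import re

# Pattern for the Azure "AuthorizationFailed" text quoted in this report.
# Quotes may appear doubled ('') when the message is copied from YAML output,
# so one or more quote characters are accepted around each value.
AUTH_FAILED = re.compile(
    r"perform action '+(?P<action>[^']+)'+\s+over scope '+(?P<scope>[^']+)'+"
)


def parse_authorization_failure(message: str) -> dict:
    """Return the denied action and scope, or an empty dict if not found."""
    # Collapse line wrapping so the pattern can span folded YAML lines.
    flat = " ".join(message.split())
    match = AUTH_FAILED.search(flat)
    return match.groupdict() if match else {}


sample = """
does not have authorization to perform action ''Microsoft.Network/virtualNetworks/subnets/read''
over scope ''/subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/jima971b-12015319-rg/providers/Microsoft.Network/virtualNetworks/jima971b-vnet/subnets/jima971b-worker-subnet''
or the scope is invalid.
"""

result = parse_authorization_failure(sample)
print(result["action"])
print(result["scope"])
```

The extracted scope points at a subnet inside the customer vnet's resource group, which is the resource the operator identities could not read.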
      
      After a custom role containing the missing permissions was manually created for machine-api, ingress, cloud-controller-manager, and image-registry, and assigned to each component's user-assigned identity on the scope of the customer vnet, the cluster recovered and reached the running state.
      
      Permissions for machine-api/cloud-controller-manager/ingress on customer vnet:
      "Microsoft.Network/virtualNetworks/subnets/read",
      "Microsoft.Network/virtualNetworks/subnets/join/action"
      
      Permissions for image-registry on customer vnet:
      "Microsoft.Network/virtualNetworks/subnets/read",
      "Microsoft.Network/virtualNetworks/subnets/join/action",
      "Microsoft.Network/virtualNetworks/join/action"
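
The manual workaround described above can be sketched as a set of custom role definitions scoped to the customer vnet. This is only an illustration under stated assumptions: the permission lists and the subscription, resource group, and vnet names come from this report, the role names are hypothetical, and the dictionary keys follow the general shape of an Azure custom role definition. Assigning each role to the matching user-assigned identity is a separate step that is not shown here.

```python
import json

# Permissions copied from the report; everything else below is illustrative.
SUBNET_ACTIONS = [
    "Microsoft.Network/virtualNetworks/subnets/read",
    "Microsoft.Network/virtualNetworks/subnets/join/action",
]
IMAGE_REGISTRY_ACTIONS = SUBNET_ACTIONS + [
    "Microsoft.Network/virtualNetworks/join/action",
]

# Components named in the report, mapped to the actions each one needs.
COMPONENT_ACTIONS = {
    "machine-api": SUBNET_ACTIONS,
    "cloud-controller-manager": SUBNET_ACTIONS,
    "ingress": SUBNET_ACTIONS,
    "image-registry": IMAGE_REGISTRY_ACTIONS,
}


def vnet_scope(subscription_id: str, resource_group: str, vnet: str) -> str:
    """Build the Azure resource ID of the customer vnet."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/virtualNetworks/{vnet}"
    )


def role_definition(name: str, actions: list, scope: str) -> dict:
    """Return a custom role definition limited to the given scope."""
    return {
        "Name": name,  # hypothetical role name, not taken from the report
        "IsCustom": True,
        "Description": "Access to the customer vnet for a private cluster",
        "Actions": list(actions),
        "NotActions": [],
        "AssignableScopes": [scope],
    }


scope = vnet_scope(
    "53b8f551-f0fc-4bea-8cba-6d1fefd54c8a",
    "jima971b-12015319-rg",
    "jima971b-vnet",
)
roles = {
    component: role_definition(f"{component}-vnet-access", actions, scope)
    for component, actions in COMPONENT_ACTIONS.items()
}
print(json.dumps(roles["image-registry"], indent=2))
```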

      Version-Release number of selected component (if applicable):

          4.15 nightly build

      How reproducible:

          always on recent 4.15 payload

      Steps to Reproduce:

          1. Prepare an install-config with a private cluster configuration and credentialsMode: Manual.
          2. Use the ccoctl tool to create the workload identity resources.
          3. Install the cluster.

      Actual results:

          Installation failed due to permission issues

      Expected results:

          ccoctl should also assign a custom role to the machine-api/ccm/image-registry user-assigned identities on the scope of the customer vnet when one is configured in install-config.

      Additional info:

      The issue is detected only on 4.15; the same installation works on 4.14.

              rh-ee-mold Mark Old
              openshift-crt-jira-prow OpenShift Prow Bot
              Mingxia Huang Mingxia Huang