OCPBUGS-21926

Azure CCM unable to manage Load Balancer in Azure Managed Identity Installs

    • Type: Bug
    • Resolution: Done-Errata
    • Priority: Critical
    • 4.14.z
    • 4.14.0, 4.15.0
    • None
    • No
    • Rejected
    • False
    • None
    • *Cause*: An Azure Managed Identity role was omitted from the cloud-controller-manager (CCM) service account.
      *Consequence*: CCM cannot properly manage Service type LoadBalancers in environments deployed to existing vnets with the private publishing method.
      *Fix*: The missing role was added to ccoctl.
      *Result*: Azure Managed Identity installations into an existing vnet with private publishing are now possible.
    • Bug Fix

      This is a clone of issue OCPBUGS-21745. The following is the description of the original issue:

      Description of problem:

      After installing 4.14.0-rc.6 into existing vnets with private load balancer publishing, Service type LoadBalancers fail to sync because the necessary permissions are missing.

      Version-Release number of selected component (if applicable):

      4.14.0-rc.6

      How reproducible:

      Seemingly 100%

      Steps to Reproduce:

      1. Install with Azure Managed Identity into an existing vnet with private load balancer (LB) publishing.
      

      Actual results:

                      One or more other status conditions indicate a degraded state: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 403, RawError: {"error":{"code":"AuthorizationFailed","message":"The client '194d5669-cb47-4199-a673-4b32a4a110be' with object id '194d5669-cb47-4199-a673-4b32a4a110be' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope '/subscriptions/14b86a40-8d8f-4e69-abaf-42cbb0b8a331/resourceGroups/net/providers/Microsoft.Network/virtualNetworks/rnd-we-net/subnets/paas1' or the scope is invalid. If access was recently granted, please refresh your credentials."}}
      
      Operators dependent on Ingress are failing as well.
      authentication                             4.14.0-rc.6   False       False         True       149m    OAuthServerRouteEndpointAccessibleControllerAvailable: Get https://oauth-openshift.apps.cnb10161.rnd.westeurope.example.com/healthz: dial tcp: lookup oauth-openshift.apps.cnb10161.rnd.westeurope.example.com on 10.224.0.10:53: no such host (this is likely result of malfunctioning DNS server)
      console                                    4.14.0-rc.6   False       True          False      142m    DeploymentAvailable: 0 replicas available for console deployment...
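When triaging a failure like this, it can help to pull the identity, action, and scope out of the `RawError` payload instead of reading the message by eye. The following is a minimal sketch in Python. The function name `parse_authorization_failure` and the regular expression are illustrative assumptions, not part of any OpenShift or Azure tooling, and they assume the message wording shown in the event above.

```python
import json
import re

# Copy of the RawError payload from the SyncLoadBalancerFailed event above.
RAW_ERROR = (
    '{"error":{"code":"AuthorizationFailed","message":"The client '
    "'194d5669-cb47-4199-a673-4b32a4a110be' with object id "
    "'194d5669-cb47-4199-a673-4b32a4a110be' does not have authorization to "
    "perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope "
    "'/subscriptions/14b86a40-8d8f-4e69-abaf-42cbb0b8a331/resourceGroups/net/"
    "providers/Microsoft.Network/virtualNetworks/rnd-we-net/subnets/paas1' or "
    'the scope is invalid. If access was recently granted, please refresh your credentials."}}'
)


def parse_authorization_failure(raw: str) -> dict:
    """Extract client ID, action, and scope from an AuthorizationFailed payload.

    Hypothetical helper: it relies on the message format seen in this report.
    """
    error = json.loads(raw)["error"]
    match = re.search(
        r"client '([^']+)'.*?perform action '([^']+)' over scope '([^']+)'",
        error["message"],
    )
    if match is None:
        raise ValueError("message does not look like an AuthorizationFailed error")
    client_id, action, scope = match.groups()
    # The resource group is the path segment that follows /resourceGroups/.
    rg = re.search(r"/resourceGroups/([^/]+)", scope)
    return {
        "code": error["code"],
        "client_id": client_id,
        "action": action,
        "scope": scope,
        "resource_group": rg.group(1) if rg else None,
    }


if __name__ == "__main__":
    for key, value in parse_authorization_failure(RAW_ERROR).items():
        print(f"{key}: {value}")
```

For the payload above, the extracted resource group is `net`, which is the network resource group rather than the cluster resource group.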

       

      Expected results:

      Successful install

      Additional info:

      The client ID in the error corresponds to “openshift-cloud-controller-manager-azure-cloud-credentials”. Checking its Azure managed identity confirms that it only has access to the cluster resource group (RG) and not the network RG.
      
      Additionally, they note that this permission is granted to the MAPI roles, just not to the CCM roles.
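The scope mismatch described above can be illustrated with a small check. Azure role assignments apply to the scope they are made at and to everything nested beneath it, so an assignment on the cluster RG does not reach a subnet in a different RG. The sketch below is in Python and only compares scopes. It does not check whether a role actually contains the `Microsoft.Network/virtualNetworks/subnets/read` action. The cluster RG name `cluster-rg` and the helper names are hypothetical, since the report does not give the cluster RG name.

```python
SUBSCRIPTION = "/subscriptions/14b86a40-8d8f-4e69-abaf-42cbb0b8a331"

# Hypothetical cluster resource group name; the report does not state it.
CLUSTER_RG_SCOPE = SUBSCRIPTION + "/resourceGroups/cluster-rg"

# Network resource group and subnet scope taken from the error message.
NETWORK_RG_SCOPE = SUBSCRIPTION + "/resourceGroups/net"
SUBNET_SCOPE = (
    NETWORK_RG_SCOPE
    + "/providers/Microsoft.Network/virtualNetworks/rnd-we-net/subnets/paas1"
)


def scope_covers(assignment_scope: str, target_scope: str) -> bool:
    """Return True if target_scope is the assignment scope or nested under it.

    Azure resource IDs are compared case-insensitively.
    """
    assigned = assignment_scope.rstrip("/").lower()
    target = target_scope.rstrip("/").lower()
    return target == assigned or target.startswith(assigned + "/")


def any_scope_covers(assignment_scopes: list[str], target_scope: str) -> bool:
    """Return True if at least one assignment scope covers the target."""
    return any(scope_covers(s, target_scope) for s in assignment_scopes)


if __name__ == "__main__":
    # Identity scoped to the cluster RG only, as described in this report.
    print(any_scope_covers([CLUSTER_RG_SCOPE], SUBNET_SCOPE))
    # Identity that also has an assignment on the network RG.
    print(any_scope_covers([CLUSTER_RG_SCOPE, NETWORK_RG_SCOPE], SUBNET_SCOPE))
```

The first call models the failing identity, and the second models an identity that also holds an assignment on the network RG.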

       

              Jeremiah Stuever (jstuever@redhat.com)
              OpenShift Prow Bot (openshift-crt-jira-prow)
              Mingxia Huang
              Jeana Routh