OCPBUGS-16808

CCO fails to check if the root credential has sufficient permissions for cr/cloud-credential-operator-gcp-ro-creds in passthrough mode


      Description of problem:

      CCO fails to check if the root credential has sufficient permissions for cr/cloud-credential-operator-gcp-ro-creds in passthrough mode

      Version-Release number of selected component (if applicable):

       

      How reproducible:

       

      Steps to Reproduce:

      1. Install OCP 4.12.10 on a GCP cluster.
      
      2. Switch to passthrough mode:
       $ oc get secret gcp-credentials -n kube-system -o yaml 
      ~~~
      <<snip>>
      kind: Secret
      metadata:
        annotations:
          cloudcredential.openshift.io/mode: passthrough
        creationTimestamp: "2023-06-21T14:22:51Z"
        name: gcp-credentials
        namespace: kube-system 
      <<snip>>
      ~~~
      
      3. The cloud-credential cluster operator reports Degraded:
      NAME               VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
      cloud-credential   4.12.10   True        True          True       25d
      
       $ oc get co cloud-credential -o yaml
      ~~~
      <<snip>>
        - lastTransitionTime: "2023-07-15T19:10:27Z"
          message: 1 of 7 credentials requests are failing to sync.
          reason: CredentialsFailing
          status: "True"
          type: Degraded
        - lastTransitionTime: "2023-07-15T19:10:27Z"
          message: 6 of 7 credentials requests provisioned, 1 reporting errors.
          reason: Reconciling
          status: "True"
          type: Progressing
      ~~~
      
      4. The cloud-credential-operator-gcp-ro-creds CredentialsRequest reports a provisioning failure:
      ~~~
      apiVersion: cloudcredential.openshift.io/v1
      kind: CredentialsRequest
      <<snip>>
        spec:
          providerSpec:
            apiVersion: cloudcredential.openshift.io/v1
            kind: GCPProviderSpec
            predefinedRoles:
            - roles/iam.securityReviewer
            - roles/iam.roleViewer
            skipServiceCheck: true
          secretRef:
            name: cloud-credential-operator-gcp-ro-creds
            namespace: openshift-cloud-credential-operator
          serviceAccountNames:
          - cloud-credential-operator
        status:
          conditions:
          - lastProbeTime: "2023-07-15T19:10:27Z"
            lastTransitionTime: "2023-07-15T19:10:27Z"
            message: 'failed to grant creds: error determining whether a credentials update
              is needed'
            reason: CredentialsProvisionFailure  # <------
            status: "True"     # <------
            type: CredentialsProvisionFailure
      ~~~
      
      5. Cloud-credential-operator pod logs:
      ~~~
      $ oc logs  pod/cloud-credential-operator-b5ff965b8-m2f4f -n openshift-cloud-credential-operator -c cloud-credential-operator 
      
      2023-07-17T10:27:56.479617663Z time="2023-07-17T10:27:56Z" level=error msg="error determining whether a credentials update is needed" actuator=gcp cr=openshift-cloud-credential-operator/cloud-credential-operator-gcp-ro-creds error="error checking whether GCP client has sufficient permissions: error testing permissions: googleapi: Error 400: Permission advisorynotifications.notifications.get is not valid for this resource., badRequest"
      2023-07-17T10:27:56.479691234Z time="2023-07-17T10:27:56Z" level=error msg="error syncing credentials: error determining whether a credentials update is needed" controller=credreq cr=openshift-cloud-credential-operator/cloud-credential-operator-gcp-ro-creds secret=openshift-cloud-credential-operator/cloud-credential-operator-gcp-ro-creds
      2023-07-17T10:27:56.479691234Z time="2023-07-17T10:27:56Z" level=error msg="errored with condition: CredentialsProvisionFailure" controller=credreq cr=openshift-cloud-credential-operator/cloud-credential-operator-gcp-ro-creds secret=openshift-cloud-credential-operator/cloud-credential-operator-gcp-ro-creds
      2023-07-17T10:27:58.480233923Z time="2023-07-17T10:27:58Z" level=info msg="syncing credentials request" controller=credreq cr=openshift-cloud-credential-operator/cloud-credential-operator-gcp-ro-creds 
      ~~~
      
      After a few days, the issue resolved itself on all GCP clusters with no other action taken.
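For triage, the key=value log lines in step 5 can be split into fields so that the failing CredentialsRequest and the underlying API error are easy to pull out. The following is a minimal sketch, not part of the operator: the helper name `parse_logfmt` is hypothetical, and it assumes every value is either a bare word or a double-quoted string, as in the lines shown above.

```python
import shlex


def parse_logfmt(line):
    """Return the key=value pairs of one log line as a dict.

    Tokens without '=' (such as the leading container timestamp) are skipped.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# First log line from step 5, copied from the report.
line = (
    '2023-07-17T10:27:56.479617663Z time="2023-07-17T10:27:56Z" level=error '
    'msg="error determining whether a credentials update is needed" actuator=gcp '
    'cr=openshift-cloud-credential-operator/cloud-credential-operator-gcp-ro-creds '
    'error="error checking whether GCP client has sufficient permissions: '
    'error testing permissions: googleapi: Error 400: Permission '
    'advisorynotifications.notifications.get is not valid for this resource., badRequest"'
)

fields = parse_logfmt(line)
print(fields["level"], fields["cr"])
print(fields["error"])
```

The `error` field carries the full googleapi message, which is the text any permission filter would have to match.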
      
      

      Actual results:

      The cloud-credential cluster operator is degraded due to credential requests failing to sync.

      Expected results:

      The cloud-credential cluster operator should be able to resync all CredentialsRequests without any issue.

      Additional info:

      The issue occurred on all GCP clusters at the same time.
      
      Comparing this regex [1] with the error log from cloud-credential-operator (step 5 above), is it possible that the regex no longer matches what the GCP API returns, so that this permission is no longer filtered out of the permission checks?

      [1] https://github.com/openshift/cloud-credential-operator/blob/e4ce607ad76b040422feec9625fcd0fb50b57d6b/pkg/operator/utils/gcp/utils.go#L86
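To illustrate the hypothesis, here is a rough sketch of how a regex-based filter of this kind could behave. The pattern below is an assumption written to match the message in the pod logs; it is not copied from utils.go. The function names and the sample permission list are likewise hypothetical.

```python
import re

# Hypothetical pattern, derived from the log message in this report.
# It is NOT the regex used by cloud-credential-operator.
INVALID_PERMISSION = re.compile(r"Permission (\S+) is not valid for this resource")


def invalid_permission(error_message):
    """Return the permission named in the API error, or None if nothing matches."""
    match = INVALID_PERMISSION.search(error_message)
    return match.group(1) if match else None


def drop_invalid(permissions, error_message):
    """Filter out the permission the API rejected, so the check can be retried."""
    bad = invalid_permission(error_message)
    return [p for p in permissions if p != bad]


# Error text taken from the pod logs in this report.
api_error = (
    "googleapi: Error 400: Permission advisorynotifications.notifications.get "
    "is not valid for this resource., badRequest"
)
# Placeholder permission list, for illustration only.
perms = ["iam.roles.get", "advisorynotifications.notifications.get"]

print(invalid_permission(api_error))
print(drop_invalid(perms, api_error))

# An invented message with different wording: nothing matches, nothing is filtered.
reworded = "Permission 'x.y.z' is invalid"
print(invalid_permission(reworded))
```

If the message wording and the pattern drift apart, `invalid_permission` returns None, the rejected permission stays in the list, and the permission test keeps failing with the same 400 error. That would be consistent with the behaviour described above, but it remains unconfirmed.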

       

              Unassigned
              Dushyant Uge (rhn-support-duge)
              Jianping Shu