OpenShift Bugs: OCPBUGS-41965

[gcp][shared_vpc] Cannot create shared vpc cluster for openshift v4.17.0-rc.2-candidate


    • Critical
    • None
    • Rejected
    • False

      Description of problem:
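      Creating a GCP shared VPC cluster with OpenShift v4.17.0-rc.2-candidate fails: the cluster enters the 'error' state during installation, and the install log reports a 403 error while adding roles for the shared VPC.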

      Version-Release number of selected component (if applicable):
      openshift v4.17.0-rc.2-candidate

      How reproducible:
      100%

      Steps to Reproduce:
      1. Create a shared VPC cluster with the following command:
      $ ocm create cluster yisun-914-xpn-417-rc2 --provider gcp --service-account-file /home/yisun/lento-sun/ocm_token/credentials/.gcp/osd-ccs-admin.json --ccs --region us-central1 --vpc-project-id openshift-qe-shared-vpc --vpc-name installer-shared-vpc --control-plane-subnet installer-shared-vpc-subnet-1 --compute-subnet installer-shared-vpc-subnet-2 --channel-group candidate --version 4.17.0-rc.2-candidate

      ID: 2dpiek7j3fm85g5lb0thdhkjvol18vtd
      External ID:
      Name: yisun-914-xpn-417-rc2
      Domain Prefix: a1i8q3i6l8l3n8b
      Display Name: yisun-914-xpn-417-rc2
      State: waiting

      2. Grant the service account (SA) permissions:
      $ cat grant_sa_permission.sh

      #!/usr/bin/env bash
      
      sa_email=""
      if [ $# -ge 1 ]; then
      	sa_email="$1"
      fi
      
      if [ -z "${sa_email}" ]; then
      	echo -e "\nPlease provide the service account email.\n\nUsage: $0 <iam-account-email> [<mode>]\n" && exit 1
      fi
      
      echo -e "\n(1/4) Before updating IAM policy binding in the service & host project..."
      ./list_roles.sh "${sa_email}"
      
      echo -e "\n(2/4) Updating IAM policy binding in the service project..."
      #interested_roles=("roles/compute.networkAdmin" "roles/iam.securityReviewer")
      interested_roles=("roles/iam.securityReviewer")
      for role in "${interested_roles[@]}"; do
      	CMD="gcloud projects remove-iam-policy-binding openshift-qe-group-i-osd --member=\"serviceAccount:${sa_email}\" --role \"${role}\" 1>/dev/null"
      	echo -e "\nRunning Command: ${CMD}"
      	eval "${CMD}"
      done
      
      echo -e "\n(3/4) Updating IAM policy binding in the host project..."
      interested_roles=("roles/compute.networkAdmin" "roles/compute.securityAdmin" "roles/dns.admin")
      for role in "${interested_roles[@]}"; do
      	CMD="gcloud projects add-iam-policy-binding openshift-qe-shared-vpc --member=\"serviceAccount:${sa_email}\" --role \"${role}\" 1>/dev/null"
      	echo -e "\nRunning Command: ${CMD}"
      	eval "${CMD}"
      done
      
      echo -e "\n(4/4) After updating IAM policy binding in the service & host project..."
      
      

      $ sh grant_sa_permission.sh osd-managed-admin-g74sxwdt@openshift-qe-group-i-osd.iam.gserviceaccount.com

      3. Wait about 12 minutes; the cluster then moves from 'installing' to 'error' status:
      $ sh time_cluster_ready.sh 2dpiek7j3fm85g5lb0thdhkjvol18vtd
      Cluster '2dpiek7j3fm85g5lb0thdhkjvol18vtd' is now in 'installing' status after 0 mins.
      Cluster '2dpiek7j3fm85g5lb0thdhkjvol18vtd' is now in 'installing' status after 1 mins.
      Cluster '2dpiek7j3fm85g5lb0thdhkjvol18vtd' is now in 'installing' status after 2 mins.
      ...
      Cluster '2dpiek7j3fm85g5lb0thdhkjvol18vtd' is now in 'installing' status after 12 mins.
      Cluster '2dpiek7j3fm85g5lb0thdhkjvol18vtd' is now in 'error' status after 13 mins.

      4. The install log shows the following error (full details are in the attachment install.log):

      level=error msg=failed to fetch Cluster: failed to generate asset \\\"Cluster\\\": failed to create cluster: failed during pre-provisioning: failed to add roles for shared VPC: failed to get IAM policy, unexpected error: googleapi: Error 403: The caller does not have permission, forbidden\\nlevel=warning msg=Failed to extract host addresses: failed to get bootstrap address: wrong number of bootstrap manifests found: []. Expected exactly one\\nlevel=fatal msg=must provide bootstrap host address\\n\" installID=wpgvdngm\ntime=\"2024-09-14T11:48:21Z\" level=debug msg=\"no additional log fields found\" installID=wpgvdngm\ntime=\"2024-09-14T11:48:21Z\" level=error msg=\"failed due to install error\" error=\"exit status 4\" installID=wpgvdngm\ntime=\"2024-09-14T11:48:21Z\" level=fatal msg=\"runtime error\" error=\"exit status 4\"\n"
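When triaging the 403 above, it may help to confirm which roles the service account actually holds in the host project after step 2. The following is a minimal sketch, not the contents of list_roles.sh (which is not shown in this report). It assumes gcloud is installed and authenticated; the function names list_sa_roles and missing_roles are hypothetical. It only reports missing bindings and does not by itself establish the cause of the failure.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not taken from list_roles.sh.

# Print the roles bound to a service account in a project, one per line.
# Assumes gcloud is installed and authenticated for that project.
list_sa_roles() {
  local project="$1" sa_email="$2"
  gcloud projects get-iam-policy "${project}" \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:${sa_email}" \
    --format="value(bindings.role)"
}

# Read held roles on stdin and print every expected role (given as
# arguments) that is not among them.
missing_roles() {
  local held role
  held="$(cat)"
  for role in "$@"; do
    printf '%s\n' "${held}" | grep -qxF -- "${role}" || echo "${role}"
  done
}
```

For the host project used here, the check could be run as `list_sa_roles openshift-qe-shared-vpc "${sa_email}" | missing_roles roles/compute.networkAdmin roles/compute.securityAdmin roles/dns.admin`; empty output means all three roles from step (3/4) of the script are bound.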
      

      Expected results:
      Cluster should be created successfully.

      Additional info:
      The same steps succeed when the cluster is created with the default version (4.16.10):

      $ ocm create cluster yisun-914-xpn-default-version --provider gcp --service-account-file /home/yisun/lento-sun/ocm_token/credentials/.gcp/osd-ccs-admin.json --ccs --region us-central1 --vpc-project-id openshift-qe-shared-vpc --vpc-name installer-shared-vpc --control-plane-subnet installer-shared-vpc-subnet-1 --compute-subnet installer-shared-vpc-subnet-2
      
      ID:			2dphntfv830ff7i2bpvnbjvvt69pjcvh
      External ID:		
      Name:			yisun-914-xpn-default-version
      Domain Prefix:		e7t3b9t7l7t3c9q
      Display Name:		yisun-914-xpn-default-version
      State:			waiting 
      
      ...
      
      $ sh grant_sa_permission.sh osd-managed-admin-d6skvpgm@openshift-qe-group-i-osd.iam.gserviceaccount.com
      
      $ sh time_cluster_ready.sh 2dphntfv830ff7i2bpvnbjvvt69pjcvh
      Cluster '2dphntfv830ff7i2bpvnbjvvt69pjcvh' is now in 'installing' status after 0 mins.
      Cluster '2dphntfv830ff7i2bpvnbjvvt69pjcvh' is now in 'installing' status after 1 mins.
      Cluster '2dphntfv830ff7i2bpvnbjvvt69pjcvh' is now in 'installing' status after 2 mins.
      ...
      Cluster '2dphntfv830ff7i2bpvnbjvvt69pjcvh' is now in 'ready' status after 35 mins.
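
The time_cluster_ready.sh helper used in both runs is not included in this report. For anyone reproducing the issue, a poller along the following lines would produce similar output. This is a sketch under assumptions: the ocm CLI is logged in, jq is installed, the cluster JSON exposes a top-level "state" field, and the function names get_cluster_state and wait_for_cluster are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: not the reporter's time_cluster_ready.sh.

# Fetch the current cluster state.
# Assumes a logged-in ocm CLI, jq, and a top-level "state" field in the JSON.
get_cluster_state() {
  ocm get "/api/clusters_mgmt/v1/clusters/$1" | jq -r '.state'
}

# Poll until the cluster is 'ready' (returns 0) or 'error' (returns 1),
# printing one line per poll. Returns 2 if max_polls is reached first.
# Args: cluster ID, interval in seconds (default 60), max polls (default 90).
wait_for_cluster() {
  local id="$1" interval="${2:-60}" max_polls="${3:-90}" i=0 state
  while [ "$i" -lt "$max_polls" ]; do
    state="$(get_cluster_state "$id")"
    echo "Cluster '$id' is now in '$state' status after $((i * interval / 60)) mins."
    case "$state" in
      ready) return 0 ;;
      error) return 1 ;;
    esac
    sleep "$interval"
    i=$((i + 1))
  done
  return 2
}
```

Calling `wait_for_cluster 2dpiek7j3fm85g5lb0thdhkjvol18vtd` polls once a minute; the distinct return codes let a wrapper script tell a failed install apart from a timeout.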
      

              rh-ee-bbarbach Brent Barbachem
              rhn-support-yisun Yi Sun
              Yi Sun Yi Sun