OpenShift Bugs / OCPBUGS-14781

[gcp] New machine stuck in Provisioning when deleting one zone from the CPMS on a GCP XPN cluster

Details

    • Bug
    • Resolution: Won't Do
    • Major
    • None
    • 4.13, 4.14
    • None
    • Moderate
    • No
    • Rejected
    • False

    Description

      Description of problem:

      A new machine is stuck in Provisioning after one zone is deleted from the CPMS on a GCP XPN cluster. The machine controller reports: "The resource 'projects/openshift-qe/global/networks/installer-shared-vpc' was not found, notFound"

      Version-Release number of selected component (if applicable):

      4.14.0-0.nightly-2023-06-05-112833

      How reproducible:

      Always

      Steps to Reproduce:

      1. Set up a GCP private cluster. By default the CPMS contains four failureDomains (a, b, c, f), and the 3 masters are in a, b, c:
            failureDomains:
              gcp:
              - zone: us-central1-a
              - zone: us-central1-b
              - zone: us-central1-c
              - zone: us-central1-f
      2. Delete failureDomain a. The failureDomains now look like this:
            failureDomains:
              gcp:
              - zone: us-central1-b
              - zone: us-central1-c
              - zone: us-central1-f
      3. Check machines
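
      Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the reporter's exact procedure: `drop_zone` is a hypothetical helper that only filters a failureDomains YAML fragment read from stdin, and the commented `oc` commands assume the control plane machine set is the object named `cluster` in the `openshift-machine-api` namespace.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of oc): print a failureDomains YAML
# fragment from stdin with the list entry for one zone removed.
# Usage: drop_zone ZONE < fragment.yaml
drop_zone() {
  awk -v zone="$1" '!($1 == "-" && $2 == "zone:" && $3 == zone)'
}

# Step 2 (assumed workflow): open the CPMS in an editor and delete the
# zone entry by hand.
#   oc -n openshift-machine-api edit controlplanemachineset cluster
#
# Step 3: list the machines and watch the replacement master.
#   oc -n openshift-machine-api get machine
```

      Piping the four-zone fragment from step 1 through `drop_zone us-central1-a` leaves the b, c and f entries, which matches the fragment shown in step 2.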
       

      Actual results:

      The new master is stuck in Provisioning status.
      $ oc get machine
      NAME                               PHASE          TYPE            REGION        ZONE            AGE
      zhsunxpn655-h7phl-master-0         Running        n2-standard-4   us-central1   us-central1-a   110m
      zhsunxpn655-h7phl-master-1         Running        n2-standard-4   us-central1   us-central1-b   110m
      zhsunxpn655-h7phl-master-2         Running        n2-standard-4   us-central1   us-central1-c   110m
      zhsunxpn655-h7phl-master-6v7l2-0   Provisioning   n2-standard-4   us-central1   us-central1-f   49m
      zhsunxpn655-h7phl-worker-a-8z46v   Running        n2-standard-4   us-central1   us-central1-a   99m
      zhsunxpn655-h7phl-worker-b-tx8vk   Running        n2-standard-4   us-central1   us-central1-b   99m
      
      $ oc logs -f machine-api-controllers-6c97579cb4-8gv5t -c machine-controller
      E0608 12:36:26.669870       1 actuator.go:54] zhsunxpn655-h7phl-master-6v7l2-0 error: zhsunxpn655-h7phl-master-6v7l2-0: reconciler failed to Update machine: failed to register instance to instance group: failed to ensure that instance group zhsunxpn655-h7phl-master-us-central1-f is a proper instance group: failed to register the new instance group named zhsunxpn655-h7phl-master-us-central1-f: instanceGroupInsert request failed: googleapi: Error 404: The resource 'projects/openshift-qe/global/networks/installer-shared-vpc' was not found, notFound
      E0608 12:36:26.669921       1 controller.go:284] zhsunxpn655-h7phl-master-6v7l2-0: error updating machine: zhsunxpn655-h7phl-master-6v7l2-0: reconciler failed to Update machine: failed to register instance to instance group: failed to ensure that instance group zhsunxpn655-h7phl-master-us-central1-f is a proper instance group: failed to register the new instance group named zhsunxpn655-h7phl-master-us-central1-f: instanceGroupInsert request failed: googleapi: Error 404: The resource 'projects/openshift-qe/global/networks/installer-shared-vpc' was not found, notFound, retrying in 30s seconds
      I0608 12:36:26.670078       1 recorder.go:103] events "msg"="zhsunxpn655-h7phl-master-6v7l2-0: reconciler failed to Update machine: failed to register instance to instance group: failed to ensure that instance group zhsunxpn655-h7phl-master-us-central1-f is a proper instance group: failed to register the new instance group named zhsunxpn655-h7phl-master-us-central1-f: instanceGroupInsert request failed: googleapi: Error 404: The resource 'projects/openshift-qe/global/networks/installer-shared-vpc' was not found, notFound" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"zhsunxpn655-h7phl-master-6v7l2-0","uid":"406bc9eb-d4f4-4235-b76b-26f886fc2a49","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"45549"} "reason"="FailedUpdate" "type"="Warning"
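
      The two checks above can be narrowed down with a pair of small filters. This is a sketch under stated assumptions: `not_running` and `missing_resources` are hypothetical helper names, they only parse text in the formats shown in this report, and the resource path they surface is a starting point for investigation rather than a confirmed root cause.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `oc get machine` table output on stdin and
# print the names of machines whose PHASE column is not "Running".
not_running() {
  awk '$1 != "NAME" && NF >= 2 && $2 != "Running" { print $1 }'
}

# Hypothetical helper: read machine-controller log lines on stdin and
# print each distinct resource path reported as "was not found".
missing_resources() {
  grep -o "The resource '[^']*' was not found" | cut -d"'" -f2 | sort -u
}
```

      For example, `oc -n openshift-machine-api get machine | not_running` would list only the stuck master from the output above, and `oc -n openshift-machine-api logs deployment/machine-api-controllers -c machine-controller | missing_resources` would print the network path from the 404 error. On a Shared VPC (XPN) cluster it may be worth comparing the project in that path with the host project that owns the network.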

      Expected results:

      The new master should reach Running status.

      Additional info:

      Must-gather: https://drive.google.com/file/d/1KoHv-s71bG40t3W2g6aM1RDMn-sg8UTx/view?usp=sharing
      Similar to https://issues.redhat.com/browse/OCPBUGS-7366
      May be related to https://issues.redhat.com/browse/OCPBUGS-5755

          People

            rh-ee-nbrubake Nolan Brubaker
            rhn-support-zhsun Zhaohua Sun
            Zhaohua Sun