Details
Bug
Resolution: Won't Do
Major
None
4.13, 4.14
None
Moderate
No
Rejected
False
Description
Description of problem:
A new machine is stuck in Provisioning after one zone is deleted from the CPMS on GCP XPN. The machine controller reports "The resource 'projects/openshift-qe/global/networks/installer-shared-vpc' was not found, notFound".
Version-Release number of selected component (if applicable):
4.14.0-0.nightly-2023-06-05-112833
How reproducible:
Always
Steps to Reproduce:
1. Set up a GCP private cluster. By default the CPMS contains four failureDomains (a, b, c, f), and the 3 masters are in failureDomains a, b, c:
     gcp:
     - zone: us-central1-a
     - zone: us-central1-b
     - zone: us-central1-c
     - zone: us-central1-f
2. Delete failureDomain a, so that failureDomains looks like this:
     failureDomains:
       gcp:
       - zone: us-central1-b
       - zone: us-central1-c
       - zone: us-central1-f
3. Check machines.
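The effect of step 2 can be sketched as a small calculation: any master whose zone is no longer listed needs a replacement, and the replacement would plausibly land in the least-populated remaining zone. This is a simplified model for illustration only, not the CPMS operator's actual placement logic; `plan_replacements` and the shortened machine names are hypothetical.

```python
def plan_replacements(master_zones, failure_domains):
    """Simplified model (an assumption, not the real CPMS algorithm).

    master_zones: mapping of machine name -> zone it currently runs in.
    failure_domains: zones still listed in the CPMS after the edit.
    Returns (machines needing replacement, least-loaded candidate zones).
    """
    allowed = list(failure_domains)
    if not allowed:
        return sorted(master_zones), []

    # Masters whose zone was removed from failureDomains must be replaced.
    to_replace = sorted(
        name for name, zone in master_zones.items() if zone not in allowed
    )

    # Count the masters that remain in each still-allowed zone.
    load = {zone: 0 for zone in allowed}
    for zone in master_zones.values():
        if zone in load:
            load[zone] += 1

    # Candidate zones are the ones with the fewest remaining masters.
    fewest = min(load.values())
    candidates = [zone for zone in allowed if load[zone] == fewest]
    return to_replace, candidates


masters = {
    "master-0": "us-central1-a",
    "master-1": "us-central1-b",
    "master-2": "us-central1-c",
}
remaining = ["us-central1-b", "us-central1-c", "us-central1-f"]
to_replace, candidates = plan_replacements(masters, remaining)
print(to_replace, candidates)
```

Under this model the master in us-central1-a is the one to replace and us-central1-f is the only empty zone, which agrees with the zone of the new master reported in the actual results.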
Actual results:
New master stuck in Provisioning status.

$ oc get machine
NAME                               PHASE          TYPE            REGION        ZONE            AGE
zhsunxpn655-h7phl-master-0         Running        n2-standard-4   us-central1   us-central1-a   110m
zhsunxpn655-h7phl-master-1         Running        n2-standard-4   us-central1   us-central1-b   110m
zhsunxpn655-h7phl-master-2         Running        n2-standard-4   us-central1   us-central1-c   110m
zhsunxpn655-h7phl-master-6v7l2-0   Provisioning   n2-standard-4   us-central1   us-central1-f   49m
zhsunxpn655-h7phl-worker-a-8z46v   Running        n2-standard-4   us-central1   us-central1-a   99m
zhsunxpn655-h7phl-worker-b-tx8vk   Running        n2-standard-4   us-central1   us-central1-b   99m

$ oc logs -f machine-api-controllers-6c97579cb4-8gv5t -c machine-controller
E0608 12:36:26.669870 1 actuator.go:54] zhsunxpn655-h7phl-master-6v7l2-0 error: zhsunxpn655-h7phl-master-6v7l2-0: reconciler failed to Update machine: failed to register instance to instance group: failed to ensure that instance group zhsunxpn655-h7phl-master-us-central1-f is a proper instance group: failed to register the new instance group named zhsunxpn655-h7phl-master-us-central1-f: instanceGroupInsert request failed: googleapi: Error 404: The resource 'projects/openshift-qe/global/networks/installer-shared-vpc' was not found, notFound
E0608 12:36:26.669921 1 controller.go:284] zhsunxpn655-h7phl-master-6v7l2-0: error updating machine: zhsunxpn655-h7phl-master-6v7l2-0: reconciler failed to Update machine: failed to register instance to instance group: failed to ensure that instance group zhsunxpn655-h7phl-master-us-central1-f is a proper instance group: failed to register the new instance group named zhsunxpn655-h7phl-master-us-central1-f: instanceGroupInsert request failed: googleapi: Error 404: The resource 'projects/openshift-qe/global/networks/installer-shared-vpc' was not found, notFound, retrying in 30s seconds
I0608 12:36:26.670078 1 recorder.go:103] events "msg"="zhsunxpn655-h7phl-master-6v7l2-0: reconciler failed to Update machine: failed to register instance to instance group: failed to ensure that instance group zhsunxpn655-h7phl-master-us-central1-f is a proper instance group: failed to register the new instance group named zhsunxpn655-h7phl-master-us-central1-f: instanceGroupInsert request failed: googleapi: Error 404: The resource 'projects/openshift-qe/global/networks/installer-shared-vpc' was not found, notFound" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"zhsunxpn655-h7phl-master-6v7l2-0","uid":"406bc9eb-d4f4-4235-b76b-26f886fc2a49","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"45549"} "reason"="FailedUpdate" "type"="Warning"
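The 404 names the network by its full resource path, so it can help to see which project the lookup used. Below is a minimal sketch that extracts the project and network name from such a message. The helper `network_ref` is hypothetical. In a Shared VPC (XPN) setup the network normally belongs to a host project, so the extracted project is worth comparing against the host project; this report does not state which project that is, so treat the comparison as a possible line of investigation, not a confirmed root cause.

```python
import re

# Matches the global network resource path as it appears in the error text.
NETWORK_PATH = re.compile(
    r"projects/(?P<project>[^/]+)/global/networks/(?P<network>[^/'\"\s]+)"
)


def network_ref(message):
    """Return (project, network) parsed from an error message, or None."""
    match = NETWORK_PATH.search(message)
    if match is None:
        return None
    return match.group("project"), match.group("network")


error = (
    "instanceGroupInsert request failed: googleapi: Error 404: The resource "
    "'projects/openshift-qe/global/networks/installer-shared-vpc' "
    "was not found, notFound"
)
ref = network_ref(error)
print(ref)
```

For the message in this report the helper yields the project `openshift-qe` and the network `installer-shared-vpc`.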
Expected results:
New master should be Running
Additional info:
Must-gather: https://drive.google.com/file/d/1KoHv-s71bG40t3W2g6aM1RDMn-sg8UTx/view?usp=sharing
Similar to https://issues.redhat.com/browse/OCPBUGS-7366
May be related to https://issues.redhat.com/browse/OCPBUGS-5755