OpenShift Bugs / OCPBUGS-4574

Machine stuck in no phase when creating in a nonexistent zone and stuck in Deleting when deleting on GCP


    • Bug
    • Resolution: Done
    • Undefined
    • None
    • 4.13
    • None
    • Low
    • None
    • CLOUD Sprint 228, CLOUD Sprint 229
    • 2
    • False
      * Previously, a compute machine set for Google Cloud Platform could attempt to reconcile invalid machines, causing them to be stuck with no phase assigned. With this release, machines with an invalid configuration are put into the `Failed` state.
      (link:https://issues.redhat.com/browse/OCPBUGS-4574[*OCPBUGS-4574*])
    • Bug Fix
    • Done

      Description of problem:

      Machine stuck in no phase when creating in a nonexistent zone and stuck in Deleting when deleting on GCP

      Version-Release number of selected component (if applicable):

      4.13.0-0.nightly-2022-12-05-155739
      This can also be reproduced on older versions (checked on 4.9 and 4.11).

      How reproducible:

      Always

      Steps to Reproduce:

      1. Create a machineset in a nonexistent zone.
      Copy a default machineset, change its name, and change its zone to a nonexistent zone, for example, us-central1-d.
      
      liuhuali@Lius-MacBook-Pro huali-test % oc get machineset huliu-gcp413v2-r7dbx-worker-a -o yaml > ms4.yaml 
      liuhuali@Lius-MacBook-Pro huali-test % vim ms4.yaml 
      liuhuali@Lius-MacBook-Pro huali-test % oc create -f ms4.yaml 
      machineset.machine.openshift.io/huliu-gcp413v2-r7dbx-worker-d created
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                                  PHASE      TYPE            REGION        ZONE            AGE
      huliu-gcp413v2-r7dbx-master-1         Running    n2-standard-4   us-central1   us-central1-b   96m
      huliu-gcp413v2-r7dbx-master-2         Running    n2-standard-4   us-central1   us-central1-c   96m
      huliu-gcp413v2-r7dbx-master-65hbs-0   Running    n2-standard-4   us-central1   us-central1-f   42m
      huliu-gcp413v2-r7dbx-master-n468m-1   Deleting                                                 16m
      huliu-gcp413v2-r7dbx-worker-a-5hdx8   Running    n2-standard-4   us-central1   us-central1-a   93m
      huliu-gcp413v2-r7dbx-worker-b-l6fz7   Running    n2-standard-4   us-central1   us-central1-b   93m
      huliu-gcp413v2-r7dbx-worker-c-g5m4k   Running    n2-standard-4   us-central1   us-central1-c   93m
      huliu-gcp413v2-r7dbx-worker-d-kx2t4                                                            3s
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine                                                    
      NAME                                  PHASE      TYPE            REGION        ZONE            AGE
      huliu-gcp413v2-r7dbx-master-1         Running    n2-standard-4   us-central1   us-central1-b   105m
      huliu-gcp413v2-r7dbx-master-2         Running    n2-standard-4   us-central1   us-central1-c   105m
      huliu-gcp413v2-r7dbx-master-65hbs-0   Running    n2-standard-4   us-central1   us-central1-f   51m
      huliu-gcp413v2-r7dbx-master-n468m-1   Deleting                                                 25m
      huliu-gcp413v2-r7dbx-worker-a-5hdx8   Running    n2-standard-4   us-central1   us-central1-a   102m
      huliu-gcp413v2-r7dbx-worker-b-l6fz7   Running    n2-standard-4   us-central1   us-central1-b   102m
      huliu-gcp413v2-r7dbx-worker-c-g5m4k   Running    n2-standard-4   us-central1   us-central1-c   102m
      huliu-gcp413v2-r7dbx-worker-d-kx2t4                                                            9m5s
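
The manual `vim` edit in step 1 could also be scripted. The following is a minimal sketch, not the reporter's method: `retarget_machineset` is a hypothetical helper, and it assumes the manifest stores the zone on a line of the form `zone: <value>` and that the old machineset name appears only where it should be renamed. Server-populated fields in the exported YAML may still need to be removed by hand before `oc create` accepts the file.

```shell
# Hypothetical helper: rewrite a machineset manifest read from stdin.
#   $1 = old machineset name, $2 = new machineset name, $3 = new zone
# Assumes names contain no characters special to sed (such as "|" or ".").
retarget_machineset() {
  old_name=$1
  new_name=$2
  new_zone=$3
  # First expression renames every occurrence of the old machineset name.
  # Second expression replaces the value on any "zone:" line, keeping indentation.
  sed -e "s|${old_name}|${new_name}|g" \
      -e "s|^\([[:space:]]*zone:\).*|\1 ${new_zone}|"
}

# Example use with the names from this report (requires cluster access):
#   oc get machineset huliu-gcp413v2-r7dbx-worker-a -o yaml \
#     | retarget_machineset huliu-gcp413v2-r7dbx-worker-a \
#         huliu-gcp413v2-r7dbx-worker-d us-central1-d > ms4.yaml
```
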
      
      2. Delete the machineset.
      liuhuali@Lius-MacBook-Pro huali-test % oc delete machineset huliu-gcp413v2-r7dbx-worker-d
      machineset.machine.openshift.io "huliu-gcp413v2-r7dbx-worker-d" deleted
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                                  PHASE      TYPE            REGION        ZONE            AGE
      huliu-gcp413v2-r7dbx-master-1         Running    n2-standard-4   us-central1   us-central1-b   105m
      huliu-gcp413v2-r7dbx-master-2         Running    n2-standard-4   us-central1   us-central1-c   105m
      huliu-gcp413v2-r7dbx-master-65hbs-0   Running    n2-standard-4   us-central1   us-central1-f   51m
      huliu-gcp413v2-r7dbx-master-n468m-1   Deleting                                                 26m
      huliu-gcp413v2-r7dbx-worker-a-5hdx8   Running    n2-standard-4   us-central1   us-central1-a   102m
      huliu-gcp413v2-r7dbx-worker-b-l6fz7   Running    n2-standard-4   us-central1   us-central1-b   102m
      huliu-gcp413v2-r7dbx-worker-c-g5m4k   Running    n2-standard-4   us-central1   us-central1-c   102m
      huliu-gcp413v2-r7dbx-worker-d-kx2t4   Deleting                                                 9m21s
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                                  PHASE      TYPE            REGION        ZONE            AGE
      huliu-gcp413v2-r7dbx-master-1         Running    n2-standard-4   us-central1   us-central1-b   3h4m
      huliu-gcp413v2-r7dbx-master-2         Running    n2-standard-4   us-central1   us-central1-c   3h4m
      huliu-gcp413v2-r7dbx-master-65hbs-0   Running    n2-standard-4   us-central1   us-central1-f   130m
      huliu-gcp413v2-r7dbx-master-n468m-1   Deleting                                                 105m
      huliu-gcp413v2-r7dbx-worker-a-5hdx8   Running    n2-standard-4   us-central1   us-central1-a   3h1m
      huliu-gcp413v2-r7dbx-worker-b-l6fz7   Running    n2-standard-4   us-central1   us-central1-b   3h1m
      huliu-gcp413v2-r7dbx-worker-c-g5m4k   Running    n2-standard-4   us-central1   us-central1-c   3h1m
      huliu-gcp413v2-r7dbx-worker-d-kx2t4   Deleting                                                 88m
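
In the listings above, the affected machine is recognizable by its blank PHASE column. As a rough sketch for spotting such machines automatically, the hypothetical helper `machines_without_phase` below reads a single `oc get machine` table on stdin and prints the names of rows whose PHASE field is empty. It assumes the fixed-width layout shown in this report, with a header row in which PHASE is immediately followed by TYPE.

```shell
# Hypothetical helper: print names of machines with an empty PHASE column.
# Input: one table as printed by `oc get machine`, header row first.
machines_without_phase() {
  awk '
    NR == 1 {
      start = index($0, "PHASE")          # 1-based offset where PHASE begins
      width = index($0, "TYPE") - start   # PHASE spans up to the TYPE column
      if (start == 0 || width <= 0) exit 1
      next
    }
    {
      phase = substr($0, start, width)    # slice out the PHASE field
      gsub(/ /, "", phase)                # strip the padding spaces
      if (phase == "") print $1           # first field is the machine name
    }
  '
}

# Example use (requires cluster access):
#   oc get machine | machines_without_phase
```

Slicing by header offsets rather than counting whitespace-separated fields keeps the check independent of how many other columns happen to be blank on a given row.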
      
      Some machine-controller logs:
      I1207 07:59:05.395164       1 actuator.go:138] huliu-gcp413v2-r7dbx-worker-d-kx2t4: Deleting machine
      E1207 07:59:05.521660       1 actuator.go:53] huliu-gcp413v2-r7dbx-worker-d-kx2t4 error: huliu-gcp413v2-r7dbx-worker-d-kx2t4: reconciler failed to Delete machine: huliu-gcp413v2-r7dbx-worker-d-kx2t4: Machine does not exist
      I1207 07:59:05.521708       1 controller.go:422] Actuator returned invalid configuration error: huliu-gcp413v2-r7dbx-worker-d-kx2t4: Machine does not exist
      I1207 07:59:05.521714       1 actuator.go:84] huliu-gcp413v2-r7dbx-worker-d-kx2t4: Checking if machine exists
      I1207 07:59:05.521849       1 recorder.go:103] events "msg"="huliu-gcp413v2-r7dbx-worker-d-kx2t4: reconciler failed to Delete machine: huliu-gcp413v2-r7dbx-worker-d-kx2t4: Machine does not exist" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"huliu-gcp413v2-r7dbx-worker-d-kx2t4","uid":"88a9f385-3350-4ddf-a451-e3603928f5d1","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"66351"} "reason"="FailedDelete" "type"="Warning"
      E1207 07:59:05.620961       1 controller.go:262] huliu-gcp413v2-r7dbx-worker-d-kx2t4: failed to check if machine exists: huliu-gcp413v2-r7dbx-worker-d-kx2t4: Machine does not exist
      E1207 07:59:05.621040       1 controller.go:326]  "msg"="Reconciler error" "error"="huliu-gcp413v2-r7dbx-worker-d-kx2t4: Machine does not exist" "controller"="machine-controller" "name"="huliu-gcp413v2-r7dbx-worker-d-kx2t4" "namespace"="openshift-machine-api" "object"={"name":"huliu-gcp413v2-r7dbx-worker-d-kx2t4","namespace":"openshift-machine-api"} "reconcileID"="8f8cb8e9-3757-4646-b579-8aa7f0974949"
      

      Actual results:

      The machine is stuck with no phase when created in a nonexistent zone, and stuck in Deleting when the machineset is deleted.

      Expected results:

      The machine goes into the Failed phase when created in a nonexistent zone, and is deleted successfully when the machineset is deleted.

      Additional info:

      Must-gather: https://drive.google.com/file/d/1cUarMzvLPQToatAv4OgsjOvuo1udpUHs/view?usp=sharing
      
      This case works as expected on AWS and Azure.

              dodvarka@redhat.com Daniel Odvarka (Inactive)
              huliu@redhat.com Huali Liu
              Zhaohua Sun
              Jeana Routh