OpenShift Bugs: OCPBUGS-2340

OnDelete update strategy does not work when master machines are not indexed 0, 1, 2


    • Moderate
    • None
    • CLOUD Sprint 226
    • 1
    • Rejected
    • False
    • N/A

      Description of problem:

      The OnDelete update strategy does not work when master machines are not indexed 0, 1, 2.

      Version-Release number of selected component (if applicable):


      How reproducible:


      Steps to Reproduce:

      1. Change the master machine names so that they use indexes 3, 4, 5
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                                       PHASE     TYPE         REGION      ZONE         AGE
      huliu-awso-cnr4j-master-3                  Running   m6i.xlarge   us-east-2   us-east-2a   44m
      huliu-awso-cnr4j-master-4                  Running   m6i.xlarge   us-east-2   us-east-2b   33m
      huliu-awso-cnr4j-master-5                  Running   m6i.xlarge   us-east-2   us-east-2c   17m
      huliu-awso-cnr4j-worker-us-east-2a-7p5c9   Running   m6i.xlarge   us-east-2   us-east-2a   173m
      huliu-awso-cnr4j-worker-us-east-2b-fmk56   Running   m6i.xlarge   us-east-2   us-east-2b   173m
      huliu-awso-cnr4j-worker-us-east-2c-w6n78   Running   m6i.xlarge   us-east-2   us-east-2c   173m
      2. Create the CPMS with the update strategy set to OnDelete
      liuhuali@Lius-MacBook-Pro huali-test % oc create -f cpms3.yaml 
      controlplanemachineset.machine.openshift.io/cluster created
      liuhuali@Lius-MacBook-Pro huali-test % oc get controlplanemachineset
      cluster   3         3         3       3                       Active   60s
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                                       PHASE     TYPE         REGION      ZONE         AGE
      huliu-awso-cnr4j-master-3                  Running   m6i.xlarge   us-east-2   us-east-2a   45m
      huliu-awso-cnr4j-master-4                  Running   m6i.xlarge   us-east-2   us-east-2b   34m
      huliu-awso-cnr4j-master-5                  Running   m6i.xlarge   us-east-2   us-east-2c   19m
      huliu-awso-cnr4j-worker-us-east-2a-7p5c9   Running   m6i.xlarge   us-east-2   us-east-2a   174m
      huliu-awso-cnr4j-worker-us-east-2b-fmk56   Running   m6i.xlarge   us-east-2   us-east-2b   174m
      huliu-awso-cnr4j-worker-us-east-2c-w6n78   Running   m6i.xlarge   us-east-2   us-east-2c   174m
      liuhuali@Lius-MacBook-Pro huali-test % oc get co
      NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      control-plane-machine-set                  4.12.0-0.nightly-2022-10-10-015203   True        False         False      165m    
      3. Edit the CPMS and change instanceType to another value (here, m5.2xlarge)
      liuhuali@Lius-MacBook-Pro huali-test % oc edit controlplanemachineset cluster
      controlplanemachineset.machine.openshift.io/cluster edited
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                                       PHASE     TYPE         REGION      ZONE         AGE
      huliu-awso-cnr4j-master-3                  Running   m6i.xlarge   us-east-2   us-east-2a   46m
      huliu-awso-cnr4j-master-4                  Running   m6i.xlarge   us-east-2   us-east-2b   35m
      huliu-awso-cnr4j-master-5                  Running   m6i.xlarge   us-east-2   us-east-2c   19m
      huliu-awso-cnr4j-worker-us-east-2a-7p5c9   Running   m6i.xlarge   us-east-2   us-east-2a   175m
      huliu-awso-cnr4j-worker-us-east-2b-fmk56   Running   m6i.xlarge   us-east-2   us-east-2b   175m
      huliu-awso-cnr4j-worker-us-east-2c-w6n78   Running   m6i.xlarge   us-east-2   us-east-2c   175m
      liuhuali@Lius-MacBook-Pro huali-test % oc get controlplanemachineset
      cluster   3         3         3                               Active   114s
      liuhuali@Lius-MacBook-Pro huali-test % oc get co
      NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      control-plane-machine-set                  4.12.0-0.nightly-2022-10-10-015203   True        True          False      167m    Observed 3 replica(s) in need of update
      4. Delete a master machine
      liuhuali@Lius-MacBook-Pro huali-test % oc delete machine huliu-awso-cnr4j-master-3 
      machine.machine.openshift.io "huliu-awso-cnr4j-master-3" deleted
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                                       PHASE      TYPE         REGION      ZONE         AGE
      huliu-awso-cnr4j-master-3                  Deleting   m6i.xlarge   us-east-2   us-east-2a   49m
      huliu-awso-cnr4j-master-4                  Running    m6i.xlarge   us-east-2   us-east-2b   38m
      huliu-awso-cnr4j-master-5                  Running    m6i.xlarge   us-east-2   us-east-2c   22m
      huliu-awso-cnr4j-worker-us-east-2a-7p5c9   Running    m6i.xlarge   us-east-2   us-east-2a   177m
      huliu-awso-cnr4j-worker-us-east-2b-fmk56   Running    m6i.xlarge   us-east-2   us-east-2b   177m
      huliu-awso-cnr4j-worker-us-east-2c-w6n78   Running    m6i.xlarge   us-east-2   us-east-2c   177m
      liuhuali@Lius-MacBook-Pro huali-test % oc logs control-plane-machine-set-operator-75d75dccbd-r4ftb
      I1013 11:41:52.932759       1 status.go:111]  "msg"="Observed Machine Configuration" "controller"="controlplanemachineset" "name"="cluster" "namespace"="openshift-machine-api" "observedGeneration"=2 "readyReplicas"=3 "reconcileID"="f8ebe234-5e1d-469c-a2d0-808ecb785ad1" "replicas"=3 "unavailableReplicas"=0 "updatedReplicas"=0
      E1013 11:41:52.932951       1 updates.go:441]  "msg"="Error creating machine" "error"="error creating new Machine for index 0: could not get provider config for index 0: cannot inject failure domain in the provider config: failure domain is nil" "controller"="controlplanemachineset" "index"=3 "name"="huliu-awso-cnr4j-master-3" "namespace"="openshift-machine-api" "reconcileID"="f8ebe234-5e1d-469c-a2d0-808ecb785ad1" "updateStrategy"="OnDelete"
      I1013 11:41:52.933317       1 controller.go:178]  "msg"="Finished reconciling control plane machine set" "controller"="controlplanemachineset" "name"="cluster" "namespace"="openshift-machine-api" "reconcileID"="f8ebe234-5e1d-469c-a2d0-808ecb785ad1"
      E1013 11:41:52.933353       1 controller.go:326]  "msg"="Reconciler error" "error"="error reconciling control plane machine set: error reconciling machines: error reconciling machine updates: error creating new Machine for index 0: could not get provider config for index 0: cannot inject failure domain in the provider config: failure domain is nil" "controller"="controlplanemachineset" "reconcileID"="f8ebe234-5e1d-469c-a2d0-808ecb785ad1"
      I1013 11:42:33.894074       1 controller.go:128]  "msg"="Reconciling control plane machine set" "controller"="controlplanemachineset" "name"="cluster" "namespace"="openshift-machine-api" "reconcileID"="be3b87fa-ffff-47f9-a4c4-a13cc077a897"
      E1013 11:42:33.894609       1 provider.go:242]  "msg"="Unknown Index" "error"="could not find failure domain for index: unknown index 3" "controller"="controlplanemachineset" "name"="cluster" "namespace"="openshift-machine-api" "reconcileID"="be3b87fa-ffff-47f9-a4c4-a13cc077a897" 
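
The errors in the log above ("failure domain is nil", "unknown index 3") are consistent with a failure domain lookup that is keyed by positions 0, 1, 2 while the machines carry indexes 3, 4, 5. The following is a minimal sketch of that kind of mismatch, not the operator's actual code; the function names, the exception class, and the dictionary layout are all hypothetical and exist only for illustration.

```python
# Hypothetical illustration only; not taken from the control plane machine set operator.


class UnknownIndexError(Exception):
    """Raised when no failure domain is registered for a machine index."""


def map_by_position(failure_domains):
    """Key failure domains by list position: 0, 1, 2, ..."""
    return {i: fd for i, fd in enumerate(failure_domains)}


def failure_domain_for(mapping, index):
    """Return the failure domain for a machine index, or raise on a miss."""
    if index not in mapping:
        raise UnknownIndexError(f"unknown index {index}")
    return mapping[index]


zones = ["us-east-2a", "us-east-2b", "us-east-2c"]
mapping = map_by_position(zones)

# A machine with index 0 resolves to a zone.
print(failure_domain_for(mapping, 0))

# A machine with index 3 (as in master-3) has no entry, so the lookup fails.
try:
    failure_domain_for(mapping, 3)
except UnknownIndexError as err:
    print("lookup failed:", err)
```

Under this assumption, any code path that needs the failure domain of the deleted machine would have nothing to inject into the provider config, which would match the symptom reported here.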

      Actual results:

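      With the OnDelete update strategy, deleting a master machine does not produce a replacement when the master machines are not indexed 0, 1, 2. The operator logs "failure domain is nil" and "unknown index 3".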

      Expected results:

      The OnDelete update strategy should work when master machines are not indexed 0, 1, 2.
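
One way to picture the expected behavior is a lookup keyed by the indexes actually present in the machine names rather than by 0, 1, 2. The sketch below assumes the index is the numeric suffix after "-master-", as in the machine names in this report; the helper names and the (name, zone) input shape are hypothetical and are not the operator's API.

```python
import re

# Hypothetical helpers for illustration; these names are not from the operator.
_INDEX_RE = re.compile(r"-master-(\d+)$")


def machine_index(name):
    """Extract the numeric index from a master machine name."""
    match = _INDEX_RE.search(name)
    if match is None:
        raise ValueError(f"no master index in machine name: {name}")
    return int(match.group(1))


def map_by_observed_index(machines):
    """Build {index: zone} from (machine name, zone) pairs."""
    return {machine_index(name): zone for name, zone in machines}


masters = [
    ("huliu-awso-cnr4j-master-3", "us-east-2a"),
    ("huliu-awso-cnr4j-master-4", "us-east-2b"),
    ("huliu-awso-cnr4j-master-5", "us-east-2c"),
]
observed = map_by_observed_index(masters)

# Index 3 now resolves to the zone of the machine that carries it.
print(observed[3])
```

With a mapping of this shape, a replacement for huliu-awso-cnr4j-master-3 could look up index 3 directly and land in the same zone as the machine it replaces.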

      Additional info:

      The RollingUpdate update strategy works correctly when master machines are not indexed 0, 1, 2.

            joelspeed Joel Speed
            huliu@redhat.com Huali Liu
            Huali Liu Huali Liu
