OpenShift Bugs / OCPBUGS-1348

OnDelete strategy machine should be failed if the backend infrastructure isn't there


    • Bug
    • Resolution: Done
    • Normal
    • None
    • 4.12
    • None
    • None
    • CLOUD Sprint 226
    • 1
    • False

      Description of problem:

      In a single-zone (us-east-2a) cluster, modify the controlplanemachineset to use the OnDelete strategy and add two more zones, us-east-2b and us-east-2c, whose backend subnets are not actually configured. If one master machine is then deleted, even though it is up to date and not in need of replacement, a new master is created in us-east-2a and the old master is deleted.

      Version-Release number of selected component (if applicable):

      4.12.0-0.nightly-2022-09-12-152748

      How reproducible:

      always

      Steps to Reproduce:

      1. Set up a single-zone cluster.
      2. Modify the controlplanemachineset to use the OnDelete strategy and add two more zones, us-east-2b and us-east-2c:
      $ oc edit controlplanemachineset cluster 
        strategy:
          type: OnDelete
        template:
          machineType: machines_v1beta1_machine_openshift_io
          machines_v1beta1_machine_openshift_io:       
            failureDomains:
              platform: AWS
              aws:
              - placement:
                  availabilityZone: us-east-2a
                subnet:
                  type: Filters
                  filters:
                  - name: tag:Name
                    values:
                    - zhsunaws991-9n7r7-private-us-east-2a
              - placement:
                  availabilityZone: us-east-2b
                subnet:
                  type: Filters
                  filters:
                  - name: tag:Name
                    values:
                    - zhsunaws991-9n7r7-private-us-east-2b
              - placement:
                  availabilityZone: us-east-2c
                subnet:
                  type: Filters
                  filters:
                  - name: tag:Name
                    values:
                    - zhsunaws991-9n7r7-private-us-east-2c  
      3. Delete a master machine that is up to date and not in need of replacement.
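
Before running step 3, it can help to confirm which of the configured failure domains actually have a backing subnet. The sketch below is only an illustration: the `failure_domains` list mirrors the YAML from step 2, while `existing_subnet_names` is a hypothetical input (in practice it would come from querying the cloud provider), and the function name is made up for this example.

```python
def unbacked_failure_domains(failure_domains, existing_subnet_names):
    """Return the zones whose tag:Name subnet filter matches no existing subnet."""
    missing = []
    for fd in failure_domains:
        zone = fd["placement"]["availabilityZone"]
        # Collect every value listed under a tag:Name filter for this domain.
        names = [
            value
            for flt in fd["subnet"]["filters"]
            if flt["name"] == "tag:Name"
            for value in flt["values"]
        ]
        if not any(name in existing_subnet_names for name in names):
            missing.append(zone)
    return missing


def make_domain(zone):
    """Build one failure domain entry shaped like the YAML in step 2."""
    return {
        "placement": {"availabilityZone": zone},
        "subnet": {
            "type": "Filters",
            "filters": [
                {"name": "tag:Name", "values": [f"zhsunaws991-9n7r7-private-{zone}"]}
            ],
        },
    }


failure_domains = [make_domain(z) for z in ("us-east-2a", "us-east-2b", "us-east-2c")]

# Assumption taken from the description: only the us-east-2a subnet exists.
existing_subnet_names = {"zhsunaws991-9n7r7-private-us-east-2a"}

if __name__ == "__main__":
    print(unbacked_failure_domains(failure_domains, existing_subnet_names))
```

With the inputs above, the two zones added in step 2 are reported as having no backing subnet, which is the precondition this bug relies on.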
      
      

      Actual results:

      A new master is created in us-east-2a instead of us-east-2b or us-east-2c, and the old master is deleted.
      
      $ oc get machine                         
      NAME                                        PHASE      TYPE         REGION      ZONE         AGE
      zhsunaws991-9n7r7-master-0                  Deleting   m6i.xlarge   us-east-2   us-east-2a   71m
      zhsunaws991-9n7r7-master-1                  Running    m6i.xlarge   us-east-2   us-east-2a   71m
      zhsunaws991-9n7r7-master-2                  Running    m6i.xlarge   us-east-2   us-east-2a   71m
      zhsunaws991-9n7r7-master-9cwsg-0            Running    m6i.xlarge   us-east-2   us-east-2a   4m43s
      zhsunaws991-9n7r7-worker-us-east-2a-cgrcf   Running    m6i.xlarge   us-east-2   us-east-2a   68m
      zhsunaws991-9n7r7-worker-us-east-2a-jslhj   Running    m6i.xlarge   us-east-2   us-east-2a   68m
      zhsunaws991-9n7r7-worker-us-east-2a-xgh8l   Running    m6i.xlarge   us-east-2   us-east-2a   68m
      
      $ oc get machine               
      NAME                                        PHASE     TYPE         REGION      ZONE         AGE
      zhsunaws991-9n7r7-master-1                  Running   m6i.xlarge   us-east-2   us-east-2a   94m
      zhsunaws991-9n7r7-master-2                  Running   m6i.xlarge   us-east-2   us-east-2a   94m
      zhsunaws991-9n7r7-master-9cwsg-0            Running   m6i.xlarge   us-east-2   us-east-2a   27m
      zhsunaws991-9n7r7-worker-us-east-2a-cgrcf   Running   m6i.xlarge   us-east-2   us-east-2a   91m
      zhsunaws991-9n7r7-worker-us-east-2a-jslhj   Running   m6i.xlarge   us-east-2   us-east-2a   91m
      zhsunaws991-9n7r7-worker-us-east-2a-xgh8l   Running   m6i.xlarge   us-east-2   us-east-2a   91m
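
The zone distribution in the listings above can be tallied with a short script. This is a minimal sketch, assuming masters can be recognized by `-master-` in the machine name as in this report (a role label selector would be more robust); the function name is hypothetical and the sample rows are copied from the second listing.

```python
from collections import Counter

# Rows copied from the second `oc get machine` listing in this report.
SAMPLE = """\
NAME                                        PHASE     TYPE         REGION      ZONE         AGE
zhsunaws991-9n7r7-master-1                  Running   m6i.xlarge   us-east-2   us-east-2a   94m
zhsunaws991-9n7r7-master-2                  Running   m6i.xlarge   us-east-2   us-east-2a   94m
zhsunaws991-9n7r7-master-9cwsg-0            Running   m6i.xlarge   us-east-2   us-east-2a   27m
zhsunaws991-9n7r7-worker-us-east-2a-cgrcf   Running   m6i.xlarge   us-east-2   us-east-2a   91m
zhsunaws991-9n7r7-worker-us-east-2a-jslhj   Running   m6i.xlarge   us-east-2   us-east-2a   91m
zhsunaws991-9n7r7-worker-us-east-2a-xgh8l   Running   m6i.xlarge   us-east-2   us-east-2a   91m
"""


def masters_per_zone(table: str) -> Counter:
    """Count master machines per value of the ZONE column."""
    lines = [line for line in table.splitlines() if line.strip()]
    header = lines[0].split()
    name_idx = header.index("NAME")
    zone_idx = header.index("ZONE")
    counts = Counter()
    for line in lines[1:]:
        cols = line.split()
        if len(cols) != len(header):
            continue  # skip rows where a column is empty
        # Assumption: master machines carry "-master-" in their name.
        if "-master-" in cols[name_idx]:
            counts[cols[zone_idx]] += 1
    return counts


if __name__ == "__main__":
    print(dict(masters_per_zone(SAMPLE)))  # {'us-east-2a': 3}
```

All three masters, including the replacement `master-9cwsg-0`, end up in us-east-2a, so the two added failure domains are never used.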

      Expected results:

      The current machine is up to date and not in need of replacement. For the other two failure domains, the controlplanemachineset should report that replacement is needed, and we would expect those replacement machines to fail if the backend infrastructure isn't there.
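
The expectation can be illustrated with a toy placement model. This is not the operator's actual algorithm; it is a simplified sketch, with a hypothetical function name, that picks the configured failure domain currently holding the fewest masters.

```python
from collections import Counter


def pick_failure_domain(desired_zones, current_zones):
    """Return the desired zone with the fewest existing machines.

    Ties are broken by the order in which the zones are listed.
    """
    counts = Counter(current_zones)
    return min(desired_zones, key=lambda zone: counts[zone])


# Failure domains configured in the controlplanemachineset.
desired = ["us-east-2a", "us-east-2b", "us-east-2c"]

# Zones of master-1 and master-2, which remain after master-0 is deleted.
remaining = ["us-east-2a", "us-east-2a"]

if __name__ == "__main__":
    print(pick_failure_domain(desired, remaining))  # us-east-2b
```

Under this model the replacement would target us-east-2b, where machine creation should then fail because no subnet exists, rather than succeeding again in us-east-2a.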

      Additional info:

      https://issues.redhat.com/browse/OCPCLOUD-1503?focusedCommentId=20945295&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-20945295

              rmanak@redhat.com Radek Manak
              rhn-support-zhsun Zhaohua Sun
              Zhaohua Sun Zhaohua Sun