OCPBUGS-6882

Machine creation should fail when availabilityZone and subnet ID are mismatched (AWS)


    • Bug
    • Resolution: Done-Errata
    • Minor
    • 4.14.0
    • 4.13
    • None
    • Moderate
    • None
    • CLOUD Sprint 234, CLOUD Sprint 235, CLOUD Sprint 236
    • 3
    • False
    • Previously, when the availability zone and subnet ID in a machine set were mismatched, a machine was created successfully using the machine set specification with no indication to the user of the mismatch. Because the mismatched values can cause problems with some configurations, this occurrence should be visible as a warning message. With this release, a warning about the mismatch is logged. (link:https://issues.redhat.com/browse/OCPBUGS-6882[*OCPBUGS-6882*])
    • Bug Fix
    • Done

      Description of problem:

      Machine creation should fail when availabilityZone and the subnet ID are mismatched.
      Currently, the machine is created successfully despite the mismatch, and the control plane machine set (CPMS) cannot be recreated after it is deleted.
      By contrast, when the subnet is specified by a filter and availabilityZone and the filter are mismatched, machine creation fails.
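The check requested here can be sketched as follows. This is a minimal illustration, not the provider's actual code: the function name `check_az_matches_subnet` and the lookup table `SUBNET_AZ` are hypothetical, and the table stands in for whatever the cloud API reports as each subnet's zone. The two subnet-to-zone pairs are taken from the command output in this report.

```python
import logging
from typing import Dict, Optional

logger = logging.getLogger("machine-az-check")

# Hypothetical lookup table: subnet ID -> zone, as observed in this report.
SUBNET_AZ: Dict[str, str] = {
    "subnet-0107b4d7cfa35eb9b": "us-east-2b",
    "subnet-0fef0e9e255742f3a": "us-east-2a",
}


def check_az_matches_subnet(
    machine_name: str,
    configured_az: str,
    subnet_id: str,
    subnet_az_by_id: Dict[str, str],
) -> Optional[str]:
    """Return a warning message when the subnet's zone differs from the
    configured availabilityZone; return None when they agree or when the
    subnet's zone is unknown."""
    actual_az = subnet_az_by_id.get(subnet_id)
    if not configured_az or actual_az is None or actual_az == configured_az:
        return None
    message = (
        f"machine {machine_name}: availabilityZone {configured_az} does not "
        f"match zone {actual_az} of subnet {subnet_id}"
    )
    logger.warning(message)
    return message


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # The mismatched pair from the reproduction steps.
    print(check_az_matches_subnet(
        "example-machine", "us-east-2a", "subnet-0107b4d7cfa35eb9b", SUBNET_AZ
    ))
```

Whether a mismatch should only warn or should fail machine creation is a policy choice; the sketch returns the message so that a caller could do either.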

      Version-Release number of selected component (if applicable):

      4.13.0-0.nightly-2023-01-31-072358

      How reproducible:

      always

      Steps to Reproduce:

      1. Create a machine set whose availabilityZone and subnet ID are mismatched. For example, availabilityZone is us-east-2a, but the subnet ID belongs to us-east-2b:
      
                placement:
                  availabilityZone: us-east-2a
                  region: us-east-2
                securityGroups:
                - filters:
                  - name: tag:Name
                    values:
                    - huliu-aws1w-nk5xd-worker-sg
                subnet:
                  id: subnet-0107b4d7cfa35eb9b 
      
      2. The machine is created successfully in the us-east-2b zone:
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                                                PHASE     TYPE         REGION      ZONE         AGE
      huliu-aws1w-nk5xd-master-0                          Running   m6i.xlarge   us-east-2   us-east-2a   62m
      huliu-aws1w-nk5xd-master-1                          Running   m6i.xlarge   us-east-2   us-east-2b   62m
      huliu-aws1w-nk5xd-master-2                          Running   m6i.xlarge   us-east-2   us-east-2a   62m
      huliu-aws1w-nk5xd-windows-worker-us-east-2a-689vq   Running   m5a.large    us-east-2   us-east-2b   37m
      huliu-aws1w-nk5xd-windows-worker-us-east-2a-nf9dl   Running   m5a.large    us-east-2   us-east-2b   37m
      huliu-aws1w-nk5xd-worker-us-east-2a-8kpht           Running   m6i.xlarge   us-east-2   us-east-2a   59m
      huliu-aws1w-nk5xd-worker-us-east-2a-dmtlc           Running   m6i.xlarge   us-east-2   us-east-2a   59m
      huliu-aws1w-nk5xd-worker-us-east-2b-kdn75           Running   m6i.xlarge   us-east-2   us-east-2b   59m
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine -o yaml |grep "id: subnet"
                id: subnet-0fef0e9e255742f3a
                id: subnet-0107b4d7cfa35eb9b
                id: subnet-0fef0e9e255742f3a
                id: subnet-0107b4d7cfa35eb9b
                id: subnet-0107b4d7cfa35eb9b
                id: subnet-0fef0e9e255742f3a
                id: subnet-0fef0e9e255742f3a
                id: subnet-0107b4d7cfa35eb9b 
      
      

      Actual results:

      The machine is created successfully in the zone that the subnet ID belongs to. In this case, it is created in us-east-2b:
      
      huliu-aws1w-nk5xd-windows-worker-us-east-2a-689vq   Running   m5a.large    us-east-2   us-east-2b   37m
      huliu-aws1w-nk5xd-windows-worker-us-east-2a-nf9dl   Running   m5a.large    us-east-2   us-east-2b   37m
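A quick way to spot such machines in `oc get machine` output is to compare the zone embedded in the machine name with the ZONE column. The sketch below is only a heuristic: it assumes the machine set name contains the intended zone (as the names in this report do) and that zone names have the shape `us-east-2a`. The helper `find_zone_mismatches` and the regular expression are hypothetical; the sample rows are copied from this report.

```python
import re
from typing import List, Tuple

# Assumption: zones look like "us-east-2a" and appear in the machine name.
ZONE_IN_NAME = re.compile(r"([a-z]{2}-[a-z]+-\d[a-z])(?=-|$)")


def find_zone_mismatches(table: str) -> List[Tuple[str, str, str]]:
    """Return (name, zone_in_name, reported_zone) for every row whose
    name-embedded zone differs from the ZONE column."""
    mismatches = []
    for line in table.strip().splitlines():
        cols = line.split()
        # Expect NAME PHASE TYPE REGION ZONE AGE; skip the header and short rows.
        if len(cols) < 6 or cols[0] == "NAME":
            continue
        name, reported_zone = cols[0], cols[4]
        match = ZONE_IN_NAME.search(name)
        if match and match.group(1) != reported_zone:
            mismatches.append((name, match.group(1), reported_zone))
    return mismatches


SAMPLE = """
NAME                                                PHASE     TYPE         REGION      ZONE         AGE
huliu-aws1w-nk5xd-master-1                          Running   m6i.xlarge   us-east-2   us-east-2b   62m
huliu-aws1w-nk5xd-windows-worker-us-east-2a-689vq   Running   m5a.large    us-east-2   us-east-2b   37m
huliu-aws1w-nk5xd-windows-worker-us-east-2a-nf9dl   Running   m5a.large    us-east-2   us-east-2b   37m
huliu-aws1w-nk5xd-worker-us-east-2a-8kpht           Running   m6i.xlarge   us-east-2   us-east-2a   59m
huliu-aws1w-nk5xd-worker-us-east-2b-kdn75           Running   m6i.xlarge   us-east-2   us-east-2b   59m
"""

if __name__ == "__main__":
    for name, expected, actual in find_zone_mismatches(SAMPLE):
        print(f"{name}: name says {expected}, running in {actual}")
```

Machines whose names carry no zone, such as the control plane machines here, are skipped rather than flagged.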

      Expected results:

      Machine creation should fail because availabilityZone and the subnet ID are mismatched.

      Additional info:

      1. When the subnet is specified by a filter and availabilityZone and the filter are mismatched, machine creation fails:
      
      huliu-aws1w2-x2tnx-worker-2-m4r8m            Failed                                          4s 
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine huliu-aws1w2-x2tnx-worker-2-m4r8m  -o yaml
      …
            placement:
              availabilityZone: us-east-2a
              region: us-east-2
            securityGroups:
            - filters:
              - name: tag:Name
                values:
                - huliu-aws1w2-x2tnx-worker-sg
            spotMarketOptions: {}
            subnet:
              filters:
              - name: tag:Name
                values:
                - huliu-aws1w2-x2tnx-private-us-east-2c
            tags:
            - name: kubernetes.io/cluster/huliu-aws1w2-x2tnx
              value: owned
            userDataSecret:
              name: worker-user-data
      status:
        conditions:
        - lastTransitionTime: "2023-02-01T02:45:52Z"
          status: "True"
          type: Drainable
        - lastTransitionTime: "2023-02-01T02:45:52Z"
          message: Instance has not been created
          reason: InstanceNotCreated
          severity: Warning
          status: "False"
          type: InstanceExists
        - lastTransitionTime: "2023-02-01T02:45:52Z"
          status: "True"
          type: Terminable
        errorMessage: 'error getting subnet IDs: no subnet IDs were found'
        errorReason: InvalidConfiguration
        lastUpdated: "2023-02-01T02:45:53Z"
        phase: Failed
        providerStatus:
          conditions:
          - lastTransitionTime: "2023-02-01T02:45:53Z"
            message: 'error getting subnet IDs: no subnet IDs were found'
            reason: MachineCreationFailed
            status: "False"
            type: MachineCreation
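One plausible reading of why the filter case fails while the ID case succeeds: if the configured availabilityZone is applied as an additional filter when subnets are looked up, a tag filter that names a subnet in another zone matches nothing. The sketch below models that behavior in memory. It is an assumption about the lookup, not the provider's code; the function `resolve_subnet_ids`, the subnet IDs, and the `-private-us-east-2a` tag are hypothetical, while the `-private-us-east-2c` tag and the error text come from the output above.

```python
from typing import Dict, List

# Hypothetical subnet inventory standing in for the cloud API.
SUBNETS: List[Dict[str, str]] = [
    {
        "id": "subnet-example-2a",
        "availabilityZone": "us-east-2a",
        "tag:Name": "huliu-aws1w2-x2tnx-private-us-east-2a",
    },
    {
        "id": "subnet-example-2c",
        "availabilityZone": "us-east-2c",
        "tag:Name": "huliu-aws1w2-x2tnx-private-us-east-2c",
    },
]


def resolve_subnet_ids(filters: List[Dict], availability_zone: str) -> List[str]:
    """Return IDs of subnets that satisfy every filter. The zone, when set,
    is treated as one more filter (assumed behavior)."""
    all_filters = list(filters)
    if availability_zone:
        all_filters.append(
            {"name": "availabilityZone", "values": [availability_zone]}
        )
    ids = [
        subnet["id"]
        for subnet in SUBNETS
        if all(subnet.get(f["name"]) in f["values"] for f in all_filters)
    ]
    if not ids:
        raise ValueError("no subnet IDs were found")
    return ids


TAG_FILTER = [
    {"name": "tag:Name", "values": ["huliu-aws1w2-x2tnx-private-us-east-2c"]}
]

if __name__ == "__main__":
    print(resolve_subnet_ids(TAG_FILTER, "us-east-2c"))
    try:
        resolve_subnet_ids(TAG_FILTER, "us-east-2a")
    except ValueError as err:
        print(f"error getting subnet IDs: {err}")
```

Under this model, an explicit subnet ID bypasses the filtering step entirely, which would explain why the mismatch goes unnoticed in the ID case.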
      
      2. In the case where the machine is created successfully despite the availabilityZone and subnet ID mismatch, the CPMS cannot be recreated after it is deleted:
      
      liuhuali@Lius-MacBook-Pro huali-test % oc delete controlplanemachineset cluster 
      controlplanemachineset.machine.openshift.io "cluster" deleted
      liuhuali@Lius-MacBook-Pro huali-test % oc get controlplanemachineset                                
      No resources found in openshift-machine-api namespace.
      
      I0201 02:11:07.850022       1 http.go:143] controller-runtime/webhook/webhooks "msg"="wrote response" "UID"="12f118c4-fafe-45f9-bd24-876abdb8ba83" "allowed"=false "code"=403 "reason"="spec.template.machines_v1beta1_machine_openshift_io.failureDomains: Forbidden: no control plane machine is using specified failure domain(s) [AWSFailureDomain{AvailabilityZone:us-east-2a, Subnet:{Type:ID, Value:subnet-0107b4d7cfa35eb9b}}], failure domain(s) [AWSFailureDomain{AvailabilityZone:us-east-2a, Subnet:{Type:ID, Value:subnet-0fef0e9e255742f3a}}] are duplicated within the control plane machines, please correct failure domains to match control plane machines" "webhook"="/validate-machine-openshift-io-v1-controlplanemachineset"
      I0201 02:11:07.850787       1 controller.go:144]  "msg"="Finished reconciling control plane machine set" "controller"="controlplanemachinesetgenerator" "name"="cluster" "namespace"="openshift-machine-api" "reconcileID"="767c4631-ed83-47da-b316-29a21cdba245"
      E0201 02:11:07.850828       1 controller.go:326]  "msg"="Reconciler error" "error"="error reconciling control plane machine set: unable to create control plane machine set: unable to create control plane machine set: admission webhook \"controlplanemachineset.machine.openshift.io\" denied the request: spec.template.machines_v1beta1_machine_openshift_io.failureDomains: Forbidden: no control plane machine is using specified failure domain(s) [AWSFailureDomain{AvailabilityZone:us-east-2a, Subnet:{Type:ID, Value:subnet-0107b4d7cfa35eb9b}}], failure domain(s) [AWSFailureDomain{AvailabilityZone:us-east-2a, Subnet:{Type:ID, Value:subnet-0fef0e9e255742f3a}}] are duplicated within the control plane machines, please correct failure domains to match control plane machines" "controller"="controlplanemachinesetgenerator" "reconcileID"="767c4631-ed83-47da-b316-29a21cdba245"
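The webhook message is dense, so it can help to pull the failure domains out of it. The following sketch uses a regular expression written against the exact `AWSFailureDomain{...}` text shown in the log above; the helper name `extract_failure_domains` is hypothetical, and the pattern is not guaranteed to match other versions of the message.

```python
import re
from typing import List, Tuple

# Pattern derived from the log text in this report.
FAILURE_DOMAIN = re.compile(
    r"AWSFailureDomain\{AvailabilityZone:(?P<zone>[^,]+), "
    r"Subnet:\{Type:(?P<type>[^,]+), Value:(?P<value>[^}]+)\}\}"
)


def extract_failure_domains(message: str) -> List[Tuple[str, str, str]]:
    """Return (zone, subnet reference type, subnet value) for each failure
    domain mentioned in the message, in order of appearance."""
    return [
        (m.group("zone"), m.group("type"), m.group("value"))
        for m in FAILURE_DOMAIN.finditer(message)
    ]


MESSAGE = (
    "no control plane machine is using specified failure domain(s) "
    "[AWSFailureDomain{AvailabilityZone:us-east-2a, Subnet:{Type:ID, "
    "Value:subnet-0107b4d7cfa35eb9b}}], failure domain(s) "
    "[AWSFailureDomain{AvailabilityZone:us-east-2a, Subnet:{Type:ID, "
    "Value:subnet-0fef0e9e255742f3a}}] are duplicated within the control "
    "plane machines"
)

if __name__ == "__main__":
    for zone, ref_type, value in extract_failure_domains(MESSAGE):
        print(zone, ref_type, value)
```

Both extracted failure domains name us-east-2a, yet the first one carries subnet-0107b4d7cfa35eb9b, which this report identifies as a us-east-2b subnet. That is the mismatched pair surfacing in the generated CPMS.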

            dodvarka@redhat.com Daniel Odvarka (Inactive)
            huliu@redhat.com Huali Liu
            Huali Liu
            Jeana Routh