
Machine scale failed for GCP Marketplace cluster after upgrade from 4.12 to 4.13


    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • 4.13
    • None
    • Moderate
    • No
    • CLOUD Sprint 248, CLOUD Sprint 249, CLOUD Sprint 250, CLOUD Sprint 251, CLOUD Sprint 252, CLOUD Sprint 253, CLOUD Sprint 254, CLOUD Sprint 255, CLOUD Sprint 256, CLOUD Sprint 257, CLOUD Sprint 258, CLOUD Sprint 259, CLOUD Sprint 260, CLOUD Sprint 261, CLOUD Sprint 262
    • 15
    • Rejected
    • False
    • Regression in behaviour for clusters using GCP marketplace images, not a blocker since this has already shipped, but should be fixed promptly
    • * When upgrading GCP clusters that use a boot disk that is not compatible with UEFI, shielded VM support cannot be enabled. Previously, this prevented the creation of new machines. With this release, shielded VM support is disabled for disks that are known to be incompatible with UEFI. This primarily affects customers upgrading from {product-title} version 4.12 to 4.13 using the GCP marketplace images. (link:https://issues.redhat.com/browse/OCPBUGS-17079[*OCPBUGS-17079*])
    • Bug Fix
    • Done

      Description of problem:

      Machine scale failed for GCP Marketplace cluster after upgrade from 4.12 to 4.13

      Version-Release number of selected component (if applicable):

      Upgrade from 4.12.26 to 4.13.0-0.nightly-2023-07-27-013427

      How reproducible:

      Always

      Steps to Reproduce:

      1. Install a 4.12 GCP Marketplace cluster
      liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion    
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.12.26   True        False         24m     Cluster version is 4.12.26
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                               PHASE     TYPE            REGION        ZONE            AGE
      huliu-41142-4cd9z-master-0         Running   n2-standard-4   us-central1   us-central1-a   48m
      huliu-41142-4cd9z-master-1         Running   n2-standard-4   us-central1   us-central1-b   48m
      huliu-41142-4cd9z-master-2         Running   n2-standard-4   us-central1   us-central1-c   48m
      huliu-41142-4cd9z-worker-a-z772h   Running   n2-standard-4   us-central1   us-central1-a   46m
      huliu-41142-4cd9z-worker-b-7vb9n   Running   n2-standard-4   us-central1   us-central1-b   46m 
      
      2. Upgrade to 4.13
      liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
      NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.13.0-0.nightly-2023-07-27-013427   True        False         15m     Cluster version is 4.13.0-0.nightly-2023-07-27-013427
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                               PHASE     TYPE            REGION        ZONE            AGE
      huliu-41142-4cd9z-master-0         Running   n2-standard-4   us-central1   us-central1-a   175m
      huliu-41142-4cd9z-master-1         Running   n2-standard-4   us-central1   us-central1-b   175m
      huliu-41142-4cd9z-master-2         Running   n2-standard-4   us-central1   us-central1-c   175m
      huliu-41142-4cd9z-worker-a-z772h   Running   n2-standard-4   us-central1   us-central1-a   172m
      huliu-41142-4cd9z-worker-b-7vb9n   Running   n2-standard-4   us-central1   us-central1-b   172m 
      
      3. Scale a machineset
      liuhuali@Lius-MacBook-Pro huali-test % oc scale machineset huliu-41142-4cd9z-worker-a --replicas=2
      machineset.machine.openshift.io/huliu-41142-4cd9z-worker-a scaled
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine
      NAME                               PHASE     TYPE            REGION        ZONE            AGE
      huliu-41142-4cd9z-master-0         Running   n2-standard-4   us-central1   us-central1-a   5h35m
      huliu-41142-4cd9z-master-1         Running   n2-standard-4   us-central1   us-central1-b   5h35m
      huliu-41142-4cd9z-master-2         Running   n2-standard-4   us-central1   us-central1-c   5h35m
      huliu-41142-4cd9z-worker-a-pdzg2   Failed                                                  113s
      huliu-41142-4cd9z-worker-a-z772h   Running   n2-standard-4   us-central1   us-central1-a   5h33m
      huliu-41142-4cd9z-worker-b-7vb9n   Running   n2-standard-4   us-central1   us-central1-b   5h33m
      liuhuali@Lius-MacBook-Pro huali-test % oc get machine huliu-41142-4cd9z-worker-a-pdzg2  -oyaml
      apiVersion: machine.openshift.io/v1beta1
      kind: Machine
      metadata:
        annotations:
          machine.openshift.io/instance-state: Unknown
        creationTimestamp: "2023-07-31T07:42:44Z"
        finalizers:
        - machine.machine.openshift.io
        generateName: huliu-41142-4cd9z-worker-a-
        generation: 1
        labels:
          machine.openshift.io/cluster-api-cluster: huliu-41142-4cd9z
          machine.openshift.io/cluster-api-machine-role: worker
          machine.openshift.io/cluster-api-machine-type: worker
          machine.openshift.io/cluster-api-machineset: huliu-41142-4cd9z-worker-a
        name: huliu-41142-4cd9z-worker-a-pdzg2
        namespace: openshift-machine-api
        ownerReferences:
        - apiVersion: machine.openshift.io/v1beta1
          blockOwnerDeletion: true
          controller: true
          kind: MachineSet
          name: huliu-41142-4cd9z-worker-a
          uid: 43046eac-5ff5-4810-8e20-f0332128410f
        resourceVersion: "163107"
        uid: 1cd7d4d2-f231-457c-b21b-4ebc2d27363e
      spec:
        lifecycleHooks: {}
        metadata: {}
        providerSpec:
          value:
            apiVersion: machine.openshift.io/v1beta1
            canIPForward: false
            credentialsSecret:
              name: gcp-cloud-credentials
            deletionProtection: false
            disks:
            - autoDelete: true
              boot: true
              image: projects/redhat-marketplace-public/global/images/redhat-coreos-ocp-48-x86-64-202210040145
              labels: null
              sizeGb: 128
              type: pd-ssd
            kind: GCPMachineProviderSpec
            machineType: n2-standard-4
            metadata:
              creationTimestamp: null
            networkInterfaces:
            - network: huliu-41142-4cd9z-network
              subnetwork: huliu-41142-4cd9z-worker-subnet
            projectID: openshift-qe
            region: us-central1
            serviceAccounts:
            - email: huliu-41142-4cd9z-w@openshift-qe.iam.gserviceaccount.com
              scopes:
              - https://www.googleapis.com/auth/cloud-platform
            shieldedInstanceConfig: {}
            tags:
            - huliu-41142-4cd9z-worker
            userDataSecret:
              name: worker-user-data
            zone: us-central1-a
      status:
        conditions:
        - lastTransitionTime: "2023-07-31T07:42:44Z"
          status: "True"
          type: Drainable
        - lastTransitionTime: "2023-07-31T07:42:44Z"
          message: Instance has not been created
          reason: InstanceNotCreated
          severity: Warning
          status: "False"
          type: InstanceExists
        - lastTransitionTime: "2023-07-31T07:42:44Z"
          status: "True"
          type: Terminable
        errorMessage: 'error launching instance: googleapi: Error 400: Invalid value for
          field ''resource.shieldedInstanceConfig'': ''{  "enableVtpm": true,  "enableIntegrityMonitoring":
          true}''. Shielded VM Config can only be set when using a UEFI-compatible disk.,
          invalid'
        errorReason: InvalidConfiguration
        lastUpdated: "2023-07-31T07:42:50Z"
        phase: Failed
        providerStatus:
          conditions:
          - lastTransitionTime: "2023-07-31T07:42:50Z"
            message: 'googleapi: Error 400: Invalid value for field ''resource.shieldedInstanceConfig'':
              ''{  "enableVtpm": true,  "enableIntegrityMonitoring": true}''. Shielded VM
              Config can only be set when using a UEFI-compatible disk., invalid'
            reason: MachineCreationFailed
            status: "False"
            type: MachineCreated
          metadata: {}
      
      liuhuali@Lius-MacBook-Pro huali-test % oc get machineset huliu-41142-4cd9z-worker-a -oyaml
      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      metadata:
        annotations:
          machine.openshift.io/GPU: "0"
          machine.openshift.io/memoryMb: "16384"
          machine.openshift.io/vCPU: "4"
        creationTimestamp: "2023-07-31T02:09:14Z"
        generation: 2
        labels:
          machine.openshift.io/cluster-api-cluster: huliu-41142-4cd9z
        name: huliu-41142-4cd9z-worker-a
        namespace: openshift-machine-api
        resourceVersion: "163067"
        uid: 43046eac-5ff5-4810-8e20-f0332128410f
      spec:
        replicas: 2
        selector:
          matchLabels:
            machine.openshift.io/cluster-api-cluster: huliu-41142-4cd9z
            machine.openshift.io/cluster-api-machineset: huliu-41142-4cd9z-worker-a
        template:
          metadata:
            labels:
              machine.openshift.io/cluster-api-cluster: huliu-41142-4cd9z
              machine.openshift.io/cluster-api-machine-role: worker
              machine.openshift.io/cluster-api-machine-type: worker
              machine.openshift.io/cluster-api-machineset: huliu-41142-4cd9z-worker-a
          spec:
            lifecycleHooks: {}
            metadata: {}
            providerSpec:
              value:
                apiVersion: machine.openshift.io/v1beta1
                canIPForward: false
                credentialsSecret:
                  name: gcp-cloud-credentials
                deletionProtection: false
                disks:
                - autoDelete: true
                  boot: true
                  image: projects/redhat-marketplace-public/global/images/redhat-coreos-ocp-48-x86-64-202210040145
                  labels: null
                  sizeGb: 128
                  type: pd-ssd
                kind: GCPMachineProviderSpec
                machineType: n2-standard-4
                metadata:
                  creationTimestamp: null
                networkInterfaces:
                - network: huliu-41142-4cd9z-network
                  subnetwork: huliu-41142-4cd9z-worker-subnet
                projectID: openshift-qe
                region: us-central1
                serviceAccounts:
                - email: huliu-41142-4cd9z-w@openshift-qe.iam.gserviceaccount.com
                  scopes:
                  - https://www.googleapis.com/auth/cloud-platform
                tags:
                - huliu-41142-4cd9z-worker
                userDataSecret:
                  name: worker-user-data
                zone: us-central1-a
      status:
        availableReplicas: 1
        fullyLabeledReplicas: 2
        observedGeneration: 2
        readyReplicas: 1
        replicas: 2
       

      Actual results:

      Scaling the machineset fails: the new machine goes to the Failed phase.
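
      To spot this state without reading the full YAML, a minimal Python sketch follows. It assumes the input is the parsed JSON list returned by `oc get machine -n openshift-machine-api -o json`; the function name `failed_machines` is hypothetical, and only the `status.phase`, `status.errorReason` and `status.errorMessage` fields shown in the Machine output above are read.

```python
from typing import List, Tuple


def failed_machines(machine_list: dict) -> List[Tuple[str, str, str]]:
    """Return (name, errorReason, errorMessage) for every Machine in phase Failed.

    machine_list is the parsed JSON of a Machine list (a dict with an "items" key).
    """
    failures = []
    for item in machine_list.get("items", []):
        status = item.get("status", {})
        if status.get("phase") == "Failed":
            failures.append((
                item["metadata"]["name"],
                status.get("errorReason", ""),
                status.get("errorMessage", ""),
            ))
    return failures
```

      For the cluster in this report, passing `json.load(sys.stdin)` from the piped `oc` output to this function would be expected to list only huliu-41142-4cd9z-worker-a-pdzg2, with errorReason InvalidConfiguration.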

      Expected results:

      The new machine should reach the Running phase. Shielded VM Config should not be validated when it is not set.

      Additional info:

      We found the related bug https://issues.redhat.com/browse/OCPBUGS-7367. In the upgrade case, however, the users did not set the shieldedInstanceConfig parameter and did not want to use the feature, yet they cannot scale up the old machineset. That is inconvenient.
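
      The expected behaviour can be sketched as follows. This is a minimal illustration in Python, not the GCP machine provider's actual code: the function name `effective_shielded_config`, the constant `SHIELDED_DEFAULTS` and both inputs are hypothetical. It assumes that UEFI support can be read from the boot image's guest OS features (GCP marks such images with `UEFI_COMPATIBLE`) and that the requested settings are already expressed in GCP API field names. The defaults mirror the `enableVtpm` and `enableIntegrityMonitoring` values that appear in the error message above.

```python
from typing import List, Optional

# Defaults seen in the error message: vTPM and integrity monitoring enabled.
# (Hypothetical constant; the real provider may default differently.)
SHIELDED_DEFAULTS = {"enableVtpm": True, "enableIntegrityMonitoring": True}


def effective_shielded_config(
    requested: Optional[dict],
    image_guest_os_features: List[str],
) -> Optional[dict]:
    """Return the shieldedInstanceConfig to send to GCP, or None to omit it.

    requested: shielded settings from the providerSpec ({} or None if unset).
    image_guest_os_features: guest OS feature types of the boot image.
    """
    if "UEFI_COMPATIBLE" not in image_guest_os_features:
        # The boot disk cannot use Shielded VM, so send no config at all
        # instead of defaults that GCP would reject with Error 400.
        return None
    config = dict(SHIELDED_DEFAULTS)
    config.update(requested or {})
    return config
```

      With an empty `shieldedInstanceConfig: {}` and an image that lacks the UEFI feature, as with the marketplace image in this report, the sketch returns None, so no Shielded VM settings would be sent and the instance creation would not hit the validation error.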

              rh-ee-nbrubake Nolan Brubaker
              huliu@redhat.com Huali Liu
              Huali Liu Huali Liu