OpenShift Bugs / OCPBUGS-42823

Provisioning an additional disk via MachineSet on GCP ends in an error


      Description of problem:

          I want to create an additional disk using the MachineSet API on GCP. The Machine stays in the Provisioning phase because disk setup fails during boot.

       

      Version-Release number of selected component (if applicable):

          Confirmed on 4.18. Earlier versions were not tested, but this does not look like a regression.

      How reproducible:

          All the time

      Steps to Reproduce:

          1. Create a 4.18 GCP cluster with cluster-bot.
          2. Copy a worker MachineSet.
          3. Add an extra disk to the copied MachineSet and create it:
          
                disks:
                - autoDelete: true
                  boot: true
                  image: projects/rhcos-cloud/global/images/rhcos-418-94-202409162337-0-gcp-x86-64
                  sizeGb: 128
                  type: pd-ssd
                - autoDelete: true
                  boot: false
                  image: projects/rhcos-cloud/global/images/rhcos-418-94-202409162337-0-gcp-x86-64
                  sizeGb: 128
                  type: pd-ssd
                  labels:
                    split: image-testing
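
The second disk in the snippet above is a non-boot disk built from the same RHCOS image as the boot disk. A minimal sketch of a pre-flight check that flags this pattern before the MachineSet is created is shown below. The function name `disks_reusing_boot_image` is hypothetical and not part of any OpenShift tooling, and the disk entries are assumed to be already loaded into plain Python dictionaries.

```python
def disks_reusing_boot_image(disks):
    """Return the indexes of non-boot disks whose image matches a boot disk's image.

    `disks` is the providerSpec `disks` list as plain dicts (hypothetical helper,
    not an OpenShift API).
    """
    boot_images = {d.get("image") for d in disks if d.get("boot") and d.get("image")}
    return [
        index
        for index, d in enumerate(disks)
        if not d.get("boot") and d.get("image") in boot_images
    ]


RHCOS_IMAGE = "projects/rhcos-cloud/global/images/rhcos-418-94-202409162337-0-gcp-x86-64"

# The disk list from the reproducer, transcribed by hand.
disks = [
    {"autoDelete": True, "boot": True, "image": RHCOS_IMAGE, "sizeGb": 128, "type": "pd-ssd"},
    {
        "autoDelete": True,
        "boot": False,
        "image": RHCOS_IMAGE,
        "sizeGb": 128,
        "type": "pd-ssd",
        "labels": {"split": "image-testing"},
    },
]

print(disks_reusing_boot_image(disks))  # [1]
```

The check only reports the duplication. It does not decide whether an additional disk may legitimately carry an image.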

       

      Actual results:

          The Machine is stuck in the Provisioning state.
          The serial console log shows the boot failing because both disks contain a filesystem labeled 'boot', so coreos-ignition-unique-boot.service exits with an error:
      
      [  OK  ] Reached target Initrd Root Device.
      [    7.398808] systemd[1]: Starting CoreOS Ignition Ensure Unique Boot Filesystem...
               Starting CoreOS Ignition Ensure Unique Boot Filesystem...
      [    7.409635] GPT:Primary header thinks Alt. header is not at the end of the disk.
      [    7.410760] GPT:7131135 != 268435455
      [    7.411306] GPT:Alternate GPT header not at the end of the disk.
      [    7.412168] GPT:7131135 != 268435455
      [    7.412730] GPT: Use GNU Parted to correct GPT errors.
      [    7.413600]  sda: sda1 sda2 sda3 sda4
      [    7.422891]  sdb: sdb1 sdb2 sdb3 sdb4
      [    7.835329] rdcore[1051]: Error: System has 2 devices with a filesystem labeled 'boot': ["/dev/sdb3", "/dev/sda3"]
      [FAILED] Failed to start C
      [    7.837561] systemd[1]: coreos-ignition-unique-boot.service: Main process exited, code=exited, status=1/FAILURE
      [    7.840574] systemd[1]: coreos-ignition-unique-boot.service: Failed with result 'exit-code'.
      [    7.843779] systemd[1]: Failed to start CoreOS Ignition Ensure Unique Boot Filesystem.
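
To make the two signals in this log easier to read, here is a small sketch that extracts the duplicate-label error and converts the GPT sector numbers into sizes. The helper names are hypothetical, and the 512-byte sector size is an assumption. It is consistent with the log, since the last LBA then works out to exactly the 128 GiB requested by `sizeGb: 128`.

```python
import json
import re

# Matches the rdcore error reported by coreos-ignition-unique-boot.service.
DUP_LABEL_RE = re.compile(
    r"System has (\d+) devices with a filesystem labeled '([^']+)': (\[.*\])"
)
# Matches the kernel's "GPT:<alternate header LBA> != <last LBA of disk>" line.
GPT_RE = re.compile(r"GPT:(\d+) != (\d+)")


def parse_duplicate_label(line):
    """Return (device_count, label, sorted_devices), or None if the line does not match."""
    match = DUP_LABEL_RE.search(line)
    if match is None:
        return None
    count, label, devices = match.groups()
    return int(count), label, sorted(json.loads(devices))


def gpt_sizes_gib(line, sector_bytes=512):
    """Return (size implied by the image's GPT, actual disk size) in GiB.

    sector_bytes=512 is an assumption about the disk's logical sector size.
    """
    match = GPT_RE.search(line)
    if match is None:
        return None
    alt_lba, last_lba = (int(group) for group in match.groups())
    return tuple((lba + 1) * sector_bytes / 2**30 for lba in (alt_lba, last_lba))


error_line = (
    "[    7.835329] rdcore[1051]: Error: System has 2 devices with a filesystem "
    "labeled 'boot': [\"/dev/sdb3\", \"/dev/sda3\"]"
)
gpt_line = "[    7.410760] GPT:7131135 != 268435455"

print(parse_duplicate_label(error_line))
image_gib, disk_gib = gpt_sizes_gib(gpt_line)
print(f"image GPT covers {image_gib:.2f} GiB of a {disk_gib:.0f} GiB disk")
```

Under that assumption, the GPT warnings only indicate that a roughly 3.4 GiB image was written to a 128 GiB disk. The fatal condition is the rdcore error, which lists a 'boot' filesystem on both /dev/sda3 and /dev/sdb3.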
      

      Expected results:

          

      Additional info:

      apiVersion: machine.openshift.io/v1beta1
      kind: MachineSet
      metadata:
        annotations:
          capacity.cluster-autoscaler.kubernetes.io/labels: kubernetes.io/arch=amd64
          machine.openshift.io/GPU: "0"
          machine.openshift.io/memoryMb: "16384"
          machine.openshift.io/vCPU: "4"
        creationTimestamp: "2024-10-04T18:30:15Z"
        generation: 1
        labels:
          machine.openshift.io/cluster-api-cluster: ci-ln-xglq1h2-72292-89825
        name: ci-ln-xglq1h2-72292-89825-worker-test
        namespace: openshift-machine-api
      spec:
        replicas: 1
        selector:
          matchLabels:
            machine.openshift.io/cluster-api-cluster: ci-ln-xglq1h2-72292-89825
            machine.openshift.io/cluster-api-machineset: ci-ln-xglq1h2-72292-89825-worker-test
        template:
          metadata:
            labels:
              machine.openshift.io/cluster-api-cluster: ci-ln-xglq1h2-72292-89825
              machine.openshift.io/cluster-api-machine-role: worker
              machine.openshift.io/cluster-api-machine-type: worker
              machine.openshift.io/cluster-api-machineset: ci-ln-xglq1h2-72292-89825-worker-test
          spec:
            lifecycleHooks: {}
            metadata: {}
            providerSpec:
              value:
                apiVersion: machine.openshift.io/v1beta1
                canIPForward: false
                credentialsSecret:
                  name: gcp-cloud-credentials
                deletionProtection: false
                disks:
                - autoDelete: true
                  boot: true
                  image: projects/rhcos-cloud/global/images/rhcos-418-94-202409162337-0-gcp-x86-64
                  sizeGb: 128
                  type: pd-ssd
                - autoDelete: true
                  boot: false
                  image: projects/rhcos-cloud/global/images/rhcos-418-94-202409162337-0-gcp-x86-64
                  sizeGb: 128
                  type: pd-ssd
                  labels:
                    split: image-testing
                kind: GCPMachineProviderSpec
                machineType: e2-standard-4
                metadata:
                  creationTimestamp: null
                networkInterfaces:
                - network: ci-ln-xglq1h2-72292-89825-network
                  subnetwork: ci-ln-xglq1h2-72292-89825-worker-subnet
                projectID: openshift-gce-devel-ci
                region: us-central1
                serviceAccounts:
                - email: ci-ln-xglq1h2-72292-89825-w@openshift-gce-devel-ci.iam.gserviceaccount.com
                  scopes:
                  - https://www.googleapis.com/auth/cloud-platform
                shieldedInstanceConfig: {}
                tags:
                - ci-ln-xglq1h2-72292-89825-worker
                userDataSecret:
                  name: worker-user-data
                zone: us-central1-a
      

              rh-ee-tbarberb Theo Barber-Bany
              rh-ee-kehannon Kevin Hannon
              Milind Yadav Milind Yadav