OpenShift Bugs / OCPBUGS-13161

SiteConfig disk partition definition fails when applied to multiple nodes in a cluster


      5/5: fix is posted and waiting on the 4.14 fix to be tested. We need a zStream forecast.
      03/16: this should not gate 4.13 but should be release noted.
      Rel Note for Telco: Yes

      This is a clone of issue OCPBUGS-9272. The following is the description of the original issue:

      Description of problem:
      Because diskPartition is a node-level setting, the disk partition definition fails when more than one node in a cluster is configured with a diskPartition.

      In my tests I see:

      1. If diskPartition is only meant to target SNOs, there is no check/validation that the user is actually creating an SNO cluster.
      2. If diskPartition is expected to target more than one node in a compact cluster, which is reasonable (e.g. a TALO recovery partition, or a mount point for etcd or container storage), the PolicyGenTools rendering fails.
      3. If diskPartition is expected to target more than one node, you can still do so by configuring diskPartition on only one node of the cluster. The partitioning will then be applied to all of them, at least on a compact cluster. In that case, I would expect diskPartition to be a cluster setting instead of a node setting.

      Version-Release number of selected component (if applicable):
      4.10

      How reproducible:
      Always. Create a simple diskPartition configuration in a multi-node/compact cluster using a SiteConfig.

      Steps to Reproduce:
      1. Create a SiteConfig that provisions a compact cluster, e.g. 3 nodes
      2. Add a valid diskPartition config to each of the nodes, for instance:

      ```
      # Disk /dev/sda: 893.3 GiB, 959119884288 bytes, 1873281024 sectors
      diskPartition:
        - device: /dev/sda
          partitions:
            - mount_point: /var/recovery
              size: 102500
              start: 700000
      ```
      ```
      nodes:
        - hostName: "master-00.cnf23.e2e.bos.redhat.com"
          role: "master"
          bmcAddress: "idrac-virtualmedia+https://10.19.28.53/redfish/v1/Systems/System.Embedded.1"
          bmcCredentialsName:
            name: "worker0-bmh-secret"
          bootMACAddress: "e4:43:4b:bd:90:46"
          bootMode: "UEFI"
          rootDeviceHints:
            hctl: "0:2:0:0"
            deviceName: /dev/sda
          # Disk /dev/sda: 893.3 GiB, 959119884288 bytes, 1873281024 sectors
          diskPartition:
            - device: /dev/sda
              partitions:
                - mount_point: /var/recovery
                  size: 102500
                  start: 700000
          nodeNetwork:
            config:
              interfaces:
                - name: ens1f0
                  type: ethernet
                  state: up
                  macAddress: "e4:43:4b:bd:90:46"
                  ipv4:
                    enabled: true
                    dhcp: true
                    auto-dns: false
                  ipv6:
                    enabled: false
              dns-resolver:
                config:
                  server:
                    - 10.19.143.247
            interfaces:
              - name: "ens1f0"
                macAddress: "e4:43:4b:bd:90:46"
        - hostName: "master-01.cnf23.e2e.bos.redhat.com"
          role: "master"
          bmcAddress: "idrac-virtualmedia+https://10.19.28.54/redfish/v1/Systems/System.Embedded.1"
          bmcCredentialsName:
            name: "worker1-bmh-secret"
          bootMACAddress: "e4:43:4b:bd:92:b8"
          bootMode: "UEFI"
          rootDeviceHints:
            hctl: "0:2:0:0"
            deviceName: /dev/sda
          # Disk /dev/sda: 893.3 GiB, 959119884288 bytes, 1873281024 sectors
          diskPartition:
            - device: /dev/sda
              partitions:
                - mount_point: /var/recovery
                  size: 102500
                  start: 700000
          nodeNetwork:
            config:
              interfaces:
                - name: ens1f0
                  type: ethernet
                  state: up
                  macAddress: "e4:43:4b:bd:92:b8"
                  ipv4:
                    enabled: true
                    dhcp: true
                    auto-dns: false
                  ipv6:
                    enabled: false
              dns-resolver:
                config:
                  server:
                    - 10.19.143.247
            interfaces:
              - name: "ens1f0"
                macAddress: "e4:43:4b:bd:92:b8"
        - hostName: "master-02.cnf23.e2e.bos.redhat.com"
          role: "master"
          bmcAddress: "idrac-virtualmedia+https://10.19.28.55/redfish/v1/Systems/System.Embedded.1"
          bmcCredentialsName:
            name: "worker2-bmh-secret"
          bootMACAddress: "e4:43:4b:bd:90:9a"
          bootMode: "UEFI"
          rootDeviceHints:
            hctl: "0:2:0:0"
            # deviceName: /dev/sda
          # Disk /dev/sda: 893.3 GiB, 959119884288 bytes, 1873281024 sectors
          diskPartition:
            - device: /dev/sda
              partitions:
                - mount_point: /var/recovery
                  size: 102500
                  start: 700000
          nodeNetwork:
            config:
              interfaces:
                - name: ens1f0
                  type: ethernet
                  state: up
                  macAddress: "e4:43:4b:bd:90:9a"
                  ipv4:
                    enabled: true
                    dhcp: true
                    auto-dns: false
                  ipv6:
                    enabled: false
              dns-resolver:
                config:
                  server:
                    - 10.19.143.247
            interfaces:
              - name: "ens1f0"
                macAddress: "e4:43:4b:bd:90:9a"
      ```
      3. Render the SiteConfig, e.g. `kustomize build .demos/ztp-policygen/site-configs --enable-alpha-plugins`

      Actual results:

      Kustomize plugin error: the extra-manifests builder emits the same `98-var-imageregistry-partition-master` MachineConfig once per node, with no `---` document separator between the copies, so the concatenated output fails YAML unmarshaling with duplicate-key errors:

      ```
      rpc error: code = Unknown desc = Manifest generation error (cached): `kustomize build .demos/ztp-policygen/site-configs --enable-alpha-plugins` failed exit status 1:
      2022/05/17 15:18:38 Error: could not unmarshal string
      # Automatically generated by extra-manifests-builder
      # Do not make changes directly.
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        name: 98-var-imageregistry-partition-master
        labels:
          machineconfiguration.openshift.io/role: master
      spec:
        config:
          ignition:
            version: 3.2.0
          storage:
            disks:
              - device: /dev/sda
                wipeTable: false
                partitions:
                  - sizeMiB: 102500
                    startMiB: 700000
                    label: var-recovery
            filesystems:
              - path: /var/recovery
                device: /dev/disk/by-partlabel/var-recovery
                format: xfs
          systemd:
            units:
              - name: var-recovery.mount
                enabled: true
                contents: |
                  [Unit]
                  Before=local-fs.target
                  [Mount]
                  Where=/var/recovery
                  What=/dev/disk/by-partlabel/var-recovery
                  [Install]
                  WantedBy=local-fs.target
      [... the same MachineConfig repeated twice more, with no `---` document separator between copies ...]
      yaml: unmarshal errors:
        line 39: mapping key "apiVersion" already defined at line 3
        line 75: mapping key "apiVersion" already defined at line 3
        line 40: mapping key "kind" already defined at line 4
        line 76: mapping key "kind" already defined at line 4
        line 41: mapping key "metadata" already defined at line 5
        line 77: mapping key "metadata" already defined at line 5
        line 45: mapping key "spec" already defined at line 9
        line 81: mapping key "spec" already defined at line 9
        line 75: mapping key "apiVersion" already defined at line 39
        line 76: mapping key "kind" already defined at line 40
        line 77: mapping key "metadata" already defined at line 41
        line 81: mapping key "spec" already defined at line 45
      2022/05/17 15:18:38 Error could not create extra-manifest cnf23.
      2022/05/17 15:18:38 Error: could not build the entire SiteConfig defined by /tmp/kust-plugin-config-1489705013: yaml: unmarshal errors:
        [... same twelve duplicate-key errors as above ...]
      Error: failure in plugin configured via /tmp/kust-plugin-config-1489705013; exit status 1: exit status 1
      ```

      Expected results:

      If diskPartition is only meant to target SNO, then validate that the cluster is an SNO cluster.

      If diskPartition targets multiple nodes, then:

      1) Allow configuring diskPartition per node.
      2) If the diskPartition configuration must be the same on all nodes, then it should probably be a cluster property instead of a node property.

      Additional info:
      There is a workaround for this problem on multi-node clusters: create the partition yourself in a MachineConfig and include it as an extra-manifest object at install time.
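      A minimal sketch of that MachineConfig-based workaround — essentially the same manifest the extra-manifests builder generates (visible in the error output above), supplied directly as an extra manifest. The manifest name below is illustrative, and the device path, partition offsets, and mount point are copied from the reproducer and must match your hardware:

      ```
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        # Illustrative name; any valid MachineConfig name works.
        name: 98-var-recovery-partition-master
        labels:
          machineconfiguration.openshift.io/role: master
      spec:
        config:
          ignition:
            version: 3.2.0
          storage:
            disks:
              # Device and partition geometry taken from the reproducer above.
              - device: /dev/sda
                wipeTable: false
                partitions:
                  - sizeMiB: 102500
                    startMiB: 700000
                    label: var-recovery
            filesystems:
              - path: /var/recovery
                device: /dev/disk/by-partlabel/var-recovery
                format: xfs
          systemd:
            units:
              # Mount unit so the partition is mounted at /var/recovery on boot.
              - name: var-recovery.mount
                enabled: true
                contents: |
                  [Unit]
                  Before=local-fs.target
                  [Mount]
                  Where=/var/recovery
                  What=/dev/disk/by-partlabel/var-recovery
                  [Install]
                  WantedBy=local-fs.target
      ```

      Because the `machineconfiguration.openshift.io/role: master` label targets the whole role rather than individual nodes, a single copy of this manifest partitions every master, avoiding the per-node diskPartition rendering entirely.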

      Another workaround in a multi-node setup is to configure diskPartition on only one of the nodes. That config will be applied to all of them, which is confusing.

            npathan@redhat.com Nahian Pathan
            openshift-crt-jira-prow OpenShift Prow Bot
            Joshua Clark Joshua Clark