OpenShift Bugs / OCPBUGS-9272

SiteConfig disk partition definition fails when applied to multiple nodes in a cluster


Details

    • Important
    • CNF RAN Sprint 234, CNF RAN Sprint 235, CNF RAN Sprint 236
    • 3
    • Rejected
    • Unspecified
    • Release Note:
      03/16: this should not gate 4.13 but should be release noted
      Rel Note for Telco: Yes

    Description

      Description of problem:
      Although diskPartition is a node-level setting, the disk partition definition fails whenever more than one node in a cluster is configured with a diskPartition.

      In my tests I see:

      1. If diskPartition is only meant to target SNOs, there is no check/validation that the user is actually creating an SNO cluster.
      2. If diskPartition targets more than one node in a compact cluster, which is a reasonable use case (e.g. a TALO recovery partition, or a mount point for etcd or container storage), the policygentools rendering fails.
      3. Even when diskPartition should target more than one node, you can still achieve it by configuring diskPartition on only one node of the cluster. The partitioning is then applied to all nodes, at least on a compact cluster. In that case I would expect diskPartition to be a cluster setting rather than a node setting.

      Version-Release number of selected component (if applicable):
      4.10

      How reproducible:
      Always. Create a simple diskPartition configuration in a multi-node/compact cluster using a siteConfig.

      Steps to Reproduce:
      1. Create a siteConfig to provision a compact cluster, e.g. 3 nodes
      2. Create a valid diskPartition config in each of the nodes, for instance:

# Disk /dev/sda: 893.3 GiB, 959119884288 bytes, 1873281024 sectors
diskPartition:
  - device: /dev/sda
    partitions:
      - mount_point: /var/recovery
        size: 102500
        start: 700000
```
nodes:
  - hostName: "master-00.cnf23.e2e.bos.redhat.com"
    role: "master"
    bmcAddress: "idrac-virtualmedia+https://10.19.28.53/redfish/v1/Systems/System.Embedded.1"
    bmcCredentialsName:
      name: "worker0-bmh-secret"
    bootMACAddress: "e4:43:4b:bd:90:46"
    bootMode: "UEFI"
    rootDeviceHints:
      hctl: "0:2:0:0"
      deviceName: /dev/sda
    # Disk /dev/sda: 893.3 GiB, 959119884288 bytes, 1873281024 sectors
    diskPartition:
      - device: /dev/sda
        partitions:
          - mount_point: /var/recovery
            size: 102500
            start: 700000
    nodeNetwork:
      config:
        interfaces:
          - name: ens1f0
            type: ethernet
            state: up
            macAddress: "e4:43:4b:bd:90:46"
            ipv4:
              enabled: true
              dhcp: true
              auto-dns: false
            ipv6:
              enabled: false
        dns-resolver:
          config:
            server:
              - 10.19.143.247
      interfaces:
        - name: "ens1f0"
          macAddress: "e4:43:4b:bd:90:46"
  - hostName: "master-01.cnf23.e2e.bos.redhat.com"
    role: "master"
    bmcAddress: "idrac-virtualmedia+https://10.19.28.54/redfish/v1/Systems/System.Embedded.1"
    bmcCredentialsName:
      name: "worker1-bmh-secret"
    bootMACAddress: "e4:43:4b:bd:92:b8"
    bootMode: "UEFI"
    rootDeviceHints:
      hctl: "0:2:0:0"
      deviceName: /dev/sda
    # Disk /dev/sda: 893.3 GiB, 959119884288 bytes, 1873281024 sectors
    diskPartition:
      - device: /dev/sda
        partitions:
          - mount_point: /var/recovery
            size: 102500
            start: 700000
    nodeNetwork:
      config:
        interfaces:
          - name: ens1f0
            type: ethernet
            state: up
            macAddress: "e4:43:4b:bd:92:b8"
            ipv4:
              enabled: true
              dhcp: true
              auto-dns: false
            ipv6:
              enabled: false
        dns-resolver:
          config:
            server:
              - 10.19.143.247
      interfaces:
        - name: "ens1f0"
          macAddress: "e4:43:4b:bd:92:b8"
  - hostName: "master-02.cnf23.e2e.bos.redhat.com"
    role: "master"
    bmcAddress: "idrac-virtualmedia+https://10.19.28.55/redfish/v1/Systems/System.Embedded.1"
    bmcCredentialsName:
      name: "worker2-bmh-secret"
    bootMACAddress: "e4:43:4b:bd:90:9a"
    bootMode: "UEFI"
    rootDeviceHints:
      hctl: "0:2:0:0"
      # deviceName: /dev/sda
    # Disk /dev/sda: 893.3 GiB, 959119884288 bytes, 1873281024 sectors
    diskPartition:
      - device: /dev/sda
        partitions:
          - mount_point: /var/recovery
            size: 102500
            start: 700000
    nodeNetwork:
      config:
        interfaces:
          - name: ens1f0
            type: ethernet
            state: up
            macAddress: "e4:43:4b:bd:90:9a"
            ipv4:
              enabled: true
              dhcp: true
              auto-dns: false
            ipv6:
              enabled: false
        dns-resolver:
          config:
            server:
              - 10.19.143.247
      interfaces:
        - name: "ens1f0"
          macAddress: "e4:43:4b:bd:90:9a"
```
      3. Render the site with the SiteConfig kustomize plugin (see the command below); manifest generation fails with the error shown under Actual results.
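      For reference, this is the rendering command that fails in this environment, taken from the error output under Actual results (the repository path is specific to this setup):

```
kustomize build .demos/ztp-policygen/site-configs --enable-alpha-plugins
```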

      Actual results:

      Kustomize plugin error. The extra-manifests builder emits one MachineConfig per node, but all three copies share the same name (98-var-imageregistry-partition-master) and are concatenated without a `---` document separator, so YAML parsing fails with duplicate mapping key errors:

rpc error: code = Unknown desc = Manifest generation error (cached): `kustomize build .demos/ztp-policygen/site-configs --enable-alpha-plugins` failed exit status 1:
2022/05/17 15:18:38 Error: could not unmarshal string
# Automatically generated by extra-manifests-builder # Do not make changes directly.
apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfig metadata: name: 98-var-imageregistry-partition-master labels: machineconfiguration.openshift.io/role: master spec: config: ignition: version: 3.2.0 storage: disks: - device: /dev/sda wipeTable: false partitions: - sizeMiB: 102500 startMiB: 700000 label: var-recovery filesystems: - path: /var/recovery device: /dev/disk/by-partlabel/var-recovery format: xfs systemd: units: - name: var-recovery.mount enabled: true contents: | [Unit] Before=local-fs.target [Mount] Where=/var/recovery What=/dev/disk/by-partlabel/var-recovery [Install] WantedBy=local-fs.target
# Automatically generated by extra-manifests-builder # Do not make changes directly.
apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfig metadata: name: 98-var-imageregistry-partition-master labels: machineconfiguration.openshift.io/role: master spec: config: ignition: version: 3.2.0 storage: disks: - device: /dev/sda wipeTable: false partitions: - sizeMiB: 102500 startMiB: 700000 label: var-recovery filesystems: - path: /var/recovery device: /dev/disk/by-partlabel/var-recovery format: xfs systemd: units: - name: var-recovery.mount enabled: true contents: | [Unit] Before=local-fs.target [Mount] Where=/var/recovery What=/dev/disk/by-partlabel/var-recovery [Install] WantedBy=local-fs.target
# Automatically generated by extra-manifests-builder # Do not make changes directly.
apiVersion: machineconfiguration.openshift.io/v1 kind: MachineConfig metadata: name: 98-var-imageregistry-partition-master labels: machineconfiguration.openshift.io/role: master spec: config: ignition: version: 3.2.0 storage: disks: - device: /dev/sda wipeTable: false partitions: - sizeMiB: 102500 startMiB: 700000 label: var-recovery filesystems: - path: /var/recovery device: /dev/disk/by-partlabel/var-recovery format: xfs systemd: units: - name: var-recovery.mount enabled: true contents: | [Unit] Before=local-fs.target [Mount] Where=/var/recovery What=/dev/disk/by-partlabel/var-recovery [Install] WantedBy=local-fs.target
)
(yaml: unmarshal errors:
line 39: mapping key "apiVersion" already defined at line 3
line 75: mapping key "apiVersion" already defined at line 3
line 40: mapping key "kind" already defined at line 4
line 76: mapping key "kind" already defined at line 4
line 41: mapping key "metadata" already defined at line 5
line 77: mapping key "metadata" already defined at line 5
line 45: mapping key "spec" already defined at line 9
line 81: mapping key "spec" already defined at line 9
line 75: mapping key "apiVersion" already defined at line 39
line 76: mapping key "kind" already defined at line 40
line 77: mapping key "metadata" already defined at line 41
line 81: mapping key "spec" already defined at line 45)
2022/05/17 15:18:38 Error could not create extra-manifest cnf23.
yaml: unmarshal errors:
line 39: mapping key "apiVersion" already defined at line 3
line 75: mapping key "apiVersion" already defined at line 3
line 40: mapping key "kind" already defined at line 4
line 76: mapping key "kind" already defined at line 4
line 41: mapping key "metadata" already defined at line 5
line 77: mapping key "metadata" already defined at line 5
line 45: mapping key "spec" already defined at line 9
line 81: mapping key "spec" already defined at line 9
line 75: mapping key "apiVersion" already defined at line 39
line 76: mapping key "kind" already defined at line 40
line 77: mapping key "metadata" already defined at line 41
line 81: mapping key "spec" already defined at line 45
2022/05/17 15:18:38 Error: could not build the entire SiteConfig defined by /tmp/kust-plugin-config-1489705013: yaml: unmarshal errors:
line 39: mapping key "apiVersion" already defined at line 3
line 75: mapping key "apiVersion" already defined at line 3
line 40: mapping key "kind" already defined at line 4
line 76: mapping key "kind" already defined at line 4
line 41: mapping key "metadata" already defined at line 5
line 77: mapping key "metadata" already defined at line 5
line 45: mapping key "spec" already defined at line 9
line 81: mapping key "spec" already defined at line 9
line 75: mapping key "apiVersion" already defined at line 39
line 76: mapping key "kind" already defined at line 40
line 77: mapping key "metadata" already defined at line 41
line 81: mapping key "spec" already defined at line 45
Error: failure in plugin configured via /tmp/kust-plugin-config-1489705013; exit status 1: exit status 1

      Expected results:

      If diskPartition is only meant to target SNO clusters, then validate that the cluster actually is an SNO cluster.

      If diskPartition targets multiple nodes, then either:

      1) Allow configuring diskPartition per node, or
      2) If the diskPartition configuration must be the same on all nodes, make it a cluster-level property instead of a node-level property (see the hypothetical sketch after this list).
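      For illustration only, a cluster-scoped diskPartition could look like the sketch below. No such cluster-level field exists today, so its placement under the cluster entry and its exact shape are hypothetical; the values are copied from the reproduction above.

```
apiVersion: ran.openshift.io/v1
kind: SiteConfig
metadata:
  name: "cnf23"
  namespace: "cnf23"
spec:
  clusters:
    - clusterName: "cnf23"
      # Hypothetical cluster-level placement: one diskPartition definition
      # that the generator would apply to every node in the cluster.
      diskPartition:
        - device: /dev/sda
          partitions:
            - mount_point: /var/recovery
              size: 102500
              start: 700000
      nodes:
        - hostName: "master-00.cnf23.e2e.bos.redhat.com"
        - hostName: "master-01.cnf23.e2e.bos.redhat.com"
        - hostName: "master-02.cnf23.e2e.bos.redhat.com"
```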

      Additional info:
      There is a workaround to avoid this problem when dealing with multi-node clusters and disk partitions: create the partition yourself in a MachineConfig and include it as an extra-manifest object at install time, as in the sketch below.
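      A minimal sketch of such a MachineConfig follows. It is essentially the manifest the extra-manifests builder generates (see Actual results), unflattened; the name 98-var-recovery-partition-master is illustrative, and the device, offsets and mount point must be adapted to the target hardware. The file is then supplied as an extra manifest at install time, for example through the directory referenced by the SiteConfig extraManifestPath.

```
# Workaround sketch: a hand-written MachineConfig that creates and mounts the
# /var/recovery partition on all nodes with the "master" role.
# Values are taken from the reproduction above; adjust them to your disks.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 98-var-recovery-partition-master
  labels:
    machineconfiguration.openshift.io/role: master
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      disks:
        - device: /dev/sda
          wipeTable: false
          partitions:
            - sizeMiB: 102500
              startMiB: 700000
              label: var-recovery
      filesystems:
        - path: /var/recovery
          device: /dev/disk/by-partlabel/var-recovery
          format: xfs
    systemd:
      units:
        - name: var-recovery.mount
          enabled: true
          contents: |
            [Unit]
            Before=local-fs.target
            [Mount]
            Where=/var/recovery
            What=/dev/disk/by-partlabel/var-recovery
            [Install]
            WantedBy=local-fs.target
```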

      Another workaround in a multi-node setup is to configure diskPartition on only one of the nodes. That configuration is then applied to all of them, which is confusing; see the sketch below.
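      A trimmed sketch of that second workaround, based on the reproduction SiteConfig above: only master-00 defines diskPartition and the other masters omit it, yet all three end up partitioned because the generated MachineConfig targets the whole master role.

```
nodes:
  - hostName: "master-00.cnf23.e2e.bos.redhat.com"
    role: "master"
    # Only this node defines diskPartition. The generated MachineConfig is
    # labeled machineconfiguration.openshift.io/role: master, so the
    # partition is applied to all three masters anyway.
    diskPartition:
      - device: /dev/sda
        partitions:
          - mount_point: /var/recovery
            size: 102500
            start: 700000
  - hostName: "master-01.cnf23.e2e.bos.redhat.com"
    role: "master"
    # no diskPartition
  - hostName: "master-02.cnf23.e2e.bos.redhat.com"
    role: "master"
    # no diskPartition
```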


          People

            Assignee: Priyanka Singh (priysing@redhat.com)
            Reporter: Alberto Losada (alosadag@redhat.com)
            QA Contact: Joshua Clark
            Votes: 1
            Watchers: 11
