- Bug
- Resolution: Unresolved
- Critical
- None
- 4.18.z
Description of problem:
An Assisted Installer deployment (3 bare-metal nodes) with multipath enabled from the start fails when /var is placed on a separate partition of the same root (multipath) disk. The installation fails with this error:

[ 14.806929] slabnode2219.sl712cluster.slocp.netact.net ignition[3210]: disks: createPartitions: op(1): [started] waiting for devices [/dev/disk/by-id/coreos-boot-disk]
[ 104.936043] slabnode2219.sl712cluster.slocp.netact.net systemd[1]: dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device: Job dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device/start timed out.
[ 104.936386] slabnode2219.sl712cluster.slocp.netact.net systemd[1]: Timed out waiting for device /dev/disk/by-id/coreos-boot-disk.
[ 104.984198] slabnode2219.sl712cluster.slocp.netact.net systemd[1]: dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device: Job dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device/start failed with result 'timeout'.
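As a quick sanity check on the log excerpt, the gap between Ignition starting to wait and systemd giving up can be computed from the kernel timestamps. The sketch below is only an illustration: the names `LOG`, `STAMP` and `timestamps` are made up for this example, and the two lines are copied from the excerpt in this report. The result of about 90 seconds is consistent with systemd's default 90-second timeout, which suggests the device unit never became active at all rather than appearing late.

```python
import re

# First and second log lines from this report, copied verbatim.
LOG = r"""[ 14.806929] slabnode2219.sl712cluster.slocp.netact.net ignition[3210]: disks: createPartitions: op(1): [started] waiting for devices [/dev/disk/by-id/coreos-boot-disk]
[ 104.936043] slabnode2219.sl712cluster.slocp.netact.net systemd[1]: dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device: Job dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device/start timed out.
"""

# Kernel-style timestamp at the start of a line, e.g. "[ 14.806929]".
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def timestamps(log: str) -> list[float]:
    """Return the leading timestamp (in seconds) of every line that has one."""
    return [float(m.group(1)) for line in log.splitlines() if (m := STAMP.match(line))]


stamps = timestamps(LOG)
waited = stamps[-1] - stamps[0]
print(f"waited {waited:.1f} s before the device job timed out")
```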
Version-Release number of selected component (if applicable):
4.18.21
How reproducible:
Always
Steps to Reproduce:
1. Prepare a 3-node bare-metal cluster with multipath, to be deployed using the on-premise assisted-installer.
2. Add this manifest to separate the /var partition:
~~~
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 96-workers-var
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      disks:
        - device: /dev/disk/by-id/coreos-boot-disk
          wipe_table: false
          partitions:
            - label: root
              number: 4
              resize: true
              sizeMiB: 10240
            - label: var
              number: 5
              sizeMiB: 0
      filesystems:
        - device: /dev/disk/by-partlabel/var
          path: /var
          format: xfs
          label: var
    systemd:
      units:
        - name: var.mount
          enabled: true
          contents: |
            [Unit]
            Before=local-fs.target
            [Mount]
            What=/dev/disk/by-partlabel/var
            Where=/var
            Options=defaults,prjquota
            [Install]
            WantedBy=local-fs.target
~~~
3. Load the ISO and start the deployment. After the initial reboot, the 2 (non-bootstrap) masters drop to the emergency shell.
Actual results:
Multipath is correctly initialized during early boot, and "/dev/disk/by-id/coreos-boot-disk" points to one of the component paths of the root multipath device. However, for some reason systemd keeps waiting for that device to appear, a condition that seems impossible to satisfy. The cluster can be installed without /var segregation, but when the manifest is included the installation does not succeed.
Expected results:
Even though "/dev/disk/by-id/coreos-boot-disk" points to one of the component paths instead of the "mpath" device, the partitioning is performed and the boot proceeds.
Additional info:
We have tested the manifest in labs with only a local disk, and it works flawlessly. We have also tested the whole installation without /var segregation, and it completes successfully. The combination of /var segregation and multipath, however, fails consistently.