Bug
Resolution: Unresolved
Critical
4.21.0
Description of problem:
In an Assisted Installer deployment (3 bare-metal nodes) with multipath enabled from the start, adding a manifest that places /var on a separate partition of the same root (multipath) disk makes the installation fail with the following errors:
[ 14.806929] slabnode2219.sl712cluster.slocp.netact.net ignition[3210]: disks: createPartitions: op(1): [started] waiting for devices [/dev/disk/by-id/coreos-boot-disk]
[ 104.936043] slabnode2219.sl712cluster.slocp.netact.net systemd[1]: dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device: Job dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device/start timed out.
[ 104.936386] slabnode2219.sl712cluster.slocp.netact.net systemd[1]: Timed out waiting for device /dev/disk/by-id/coreos-boot-disk.
[ 104.984198] slabnode2219.sl712cluster.slocp.netact.net systemd[1]: dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device: Job dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device/start failed with result 'timeout'.
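As a quick sanity check on the log excerpt above, the kernel timestamps can be subtracted to see how long Ignition waited before the device job was declared timed out. This is a minimal sketch: the helper name `seconds_between` is hypothetical, and the two log lines are copied from this report with the hostname removed for brevity.

```python
import re

# Two lines from the log excerpt in this report (hostname removed for brevity).
LOG = r"""
[ 14.806929] ignition[3210]: disks: createPartitions: op(1): [started] waiting for devices [/dev/disk/by-id/coreos-boot-disk]
[ 104.936043] systemd[1]: dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device: Job dev-disk-by\x2did-coreos\x2dboot\x2ddisk.device/start timed out.
"""

# Kernel-style timestamp at the start of a line, e.g. "[ 14.806929]".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def seconds_between(log: str, start_marker: str, end_marker: str) -> float:
    """Return the time between the first line containing start_marker
    and the first later line containing end_marker (hypothetical helper)."""
    start = None
    for line in log.splitlines():
        match = TIMESTAMP.match(line)
        if match is None:
            continue  # skip blank or untimestamped lines
        stamp = float(match.group(1))
        if start is None:
            if start_marker in line:
                start = stamp
        elif end_marker in line:
            return stamp - start
    raise ValueError("markers not found in the expected order")


waited = seconds_between(LOG, "waiting for devices", "timed out")
print(f"waited {waited:.1f} s before the device job timed out")
```

The result is roughly 90 seconds, which is close to systemd's usual 90-second default start timeout. Whether that default is the one in effect in this initramfs is an assumption, but it suggests the device unit never became active rather than failing for another reason.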
Version-Release number of selected component (if applicable):
4.18.21
How reproducible:
Always
Steps to Reproduce:
1. Prepare a 3-node bare-metal cluster with multipath, to be deployed with the on-premises assisted-installer
2. Add this manifest to separate the /var partition:
~~~
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 96-workers-var
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      disks:
        - device: /dev/disk/by-id/coreos-boot-disk
          wipe_table: false
          partitions:
            - label: root
              number: 4
              resize: true
              sizeMiB: 10240
            - label: var
              number: 5
              sizeMiB: 0
      filesystems:
        - device: /dev/disk/by-partlabel/var
          path: /var
          format: xfs
          label: var
    systemd:
      units:
        - name: var.mount
          enabled: true
          contents: |
            [Unit]
            Before=local-fs.target
            [Mount]
            What=/dev/disk/by-partlabel/var
            Where=/var
            Options=defaults,prjquota
            [Install]
            WantedBy=local-fs.target
~~~
3. Load the ISO and start the deployment. After the initial reboot, the 2 (non-bootstrap) masters drop to the emergency shell
Actual results:
Multipath is correctly initialized during early boot, and "/dev/disk/by-id/coreos-boot-disk" points to one of the path components of the root multipath device. Even so, systemd keeps waiting for that device to appear and times out, a condition that apparently can never be satisfied. The cluster can be installed without /var segregation, but whenever the manifest is included the installation fails.
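The observation above (the link resolving to a path component rather than the "mpath" device) can be checked by resolving the symlink and listing the holders of the resulting block device in sysfs. The sketch below shows that logic in Python; the function name `describe_boot_disk_link` is hypothetical. The RHCOS emergency shell generally has no Python interpreter, so treat this as an illustration of the same check that `readlink -f` plus `ls /sys/block/<dev>/holders` performs by hand.

```python
import os


def describe_boot_disk_link(link="/dev/disk/by-id/coreos-boot-disk",
                            sys_block="/sys/block"):
    """Resolve the boot-disk symlink and report what it points to.

    Hypothetical helper: 'holders' lists the devices stacked on top of the
    target (for a multipath path component this is normally a dm-N node,
    though this sketch does not verify that the holder is a multipath map).
    """
    if not os.path.islink(link):
        return {"exists": False}

    target = os.path.realpath(link)   # e.g. /dev/sdb or /dev/dm-0
    name = os.path.basename(target)

    holders_dir = os.path.join(sys_block, name, "holders")
    holders = sorted(os.listdir(holders_dir)) if os.path.isdir(holders_dir) else []

    return {
        "exists": True,
        "target": target,
        # True when the link already points at a device-mapper node.
        "is_device_mapper": name.startswith("dm-"),
        # Non-empty when another device (such as a multipath map) sits on top.
        "holders": holders,
    }


if __name__ == "__main__":
    print(describe_boot_disk_link())
```

A result with `is_device_mapper` false and a `dm-N` entry in `holders` would match the situation described in this report: the link names an underlying path device that is already claimed by a device-mapper map.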
Expected results:
Even though "/dev/disk/by-id/coreos-boot-disk" points to one of the path components instead of the "mpath" device, the partitioning should complete and the boot should proceed.
Additional info:
We have tested the manifest in labs with only a local disk and it works flawlessly. We have also tested the whole installation without the /var segregation, and it completes successfully. The combination of /var segregation and multipath, however, fails consistently.
- blocks: OCPBUGS-64611 [4.20] [OCP 4.18] coreos-boot-disk link not working with multipath on early boot (ASSIGNED)
- is blocked by: OCPBUGS-65585 [4.21] Bootimage bump tracker (Verified)
- is cloned by: OCPBUGS-64611 [4.20] [OCP 4.18] coreos-boot-disk link not working with multipath on early boot (ASSIGNED)
- relates to: RHEL-127550 [9.6.z] Bump ignition release version (Release Pending)
- relates to: RHEL-125290 [9.6.z] Backport Ignition fix related to device mapper partitioning (Closed)
- links to