-
Bug
-
Resolution: Done
-
Major
-
None
-
4.12
-
Important
-
No
-
5
-
OSDOCS Sprint 261, OSDOCS Sprint 262
-
2
-
False
-
Description of problem:
The customer would like to add a second disk device to RHCOS nodes and mount it at '/var/lib/containers' by following the steps in the article https://access.redhat.com/solutions/4952011
The additional device is a multipath device with LVM on top, and they are trying to use it as a separate filesystem mounted at '/var/lib/containers'.
This fails: the systemd start job for the LVM device times out.
Oct 04 20:55:03 masr8c3locp2w5.corp.du.ae systemd[1]: Found device /dev/mapper/container-vol.
Oct 04 20:55:03 masr8c3locp2w5.corp.du.ae systemd[1]: Started LVM event activation on device 253:0.
.....
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: dev-mapper-container-vol.device: Job dev-mapper-container-vol.device/start timed out.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: Timed out waiting for device dev-mapper-container-vol.device.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: Dependency failed for Make File System on /dev/mapper/container-vol.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: Dependency failed for Mount /dev/mapper/container-vol to /var/lib/containers.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: Dependency failed for CRI-O Auto Update Script.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: crio-wipe.service: Job crio-wipe.service/start failed with result 'dependency'.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: var-lib-containers.mount: Job var-lib-containers.mount/start failed with result 'dependency'.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: systemd-mkfs@dev-mapper-container-vol.service: Job systemd-mkfs@dev-mapper-container-vol.service/start failed with result 'dependency'.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: dev-mapper-container-vol.device: Job dev-mapper-container-vol.device/start failed with result 'timeout'.
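To separate the original failure from its knock-on effects in the log above, the job results can be grouped by outcome. The following is a minimal Python sketch, assuming the journal lines are available as plain text; the names JOURNAL, JOB_RESULT, and job_results are illustrative and not part of any tool. It only reads the "failed with result" lines quoted in this report.

```python
import re

# Journal lines copied from the excerpt in this report.
JOURNAL = """\
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: crio-wipe.service: Job crio-wipe.service/start failed with result 'dependency'.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: var-lib-containers.mount: Job var-lib-containers.mount/start failed with result 'dependency'.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: systemd-mkfs@dev-mapper-container-vol.service: Job systemd-mkfs@dev-mapper-container-vol.service/start failed with result 'dependency'.
Oct 04 20:56:25 masr8c3locp2w5.corp.du.ae systemd[1]: dev-mapper-container-vol.device: Job dev-mapper-container-vol.device/start failed with result 'timeout'.
"""

# Matches "Job <unit>/start failed with result '<result>'".
JOB_RESULT = re.compile(r"Job (\S+)/start failed with result '(\w+)'")


def job_results(journal_text):
    """Map each unit name to the result systemd reported for its start job."""
    return {m.group(1): m.group(2) for m in JOB_RESULT.finditer(journal_text)}


results = job_results(JOURNAL)
# Units that failed for a reason other than a failed dependency are the
# candidates for the original failure.
root_causes = [unit for unit, result in results.items() if result != "dependency"]
print(root_causes)
```

In this excerpt, only the device unit fails with 'timeout'; the mkfs, mount, and crio-wipe jobs fail because they depend on it.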
This LVM volume sits on top of multipath (mpath) devices.
sdc               8:32   0  2T  0 disk
└─mpatha          253:0  0  2T  0 mpath
  └─container-vol 253:1  0  2T  0 lvm
sdd               8:48   0  2T  0 disk
└─mpatha          253:0  0  2T  0 mpath
  └─container-vol 253:1  0  2T  0 lvm
sde               8:64   0  2T  0 disk
└─mpatha          253:0  0  2T  0 mpath
  └─container-vol 253:1  0  2T  0 lvm
sdf               8:80   0  2T  0 disk
└─mpatha          253:0  0  2T  0 mpath
  └─container-vol 253:1  0  2T  0 lvm
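The device listing above shows the same multipath map and logical volume under each of the four disks. The following is a small Python sketch, assuming the listing is in the six-column form shown (name, major:minor, removable, size, read-only, type); the names LSBLK and group_paths are illustrative. It groups the disk paths by multipath map to make the topology explicit.

```python
from collections import defaultdict

# Device listing copied from this report.
LSBLK = """\
sdc 8:32 0 2T 0 disk
└─mpatha 253:0 0 2T 0 mpath
  └─container-vol 253:1 0 2T 0 lvm
sdd 8:48 0 2T 0 disk
└─mpatha 253:0 0 2T 0 mpath
  └─container-vol 253:1 0 2T 0 lvm
sde 8:64 0 2T 0 disk
└─mpatha 253:0 0 2T 0 mpath
  └─container-vol 253:1 0 2T 0 lvm
sdf 8:80 0 2T 0 disk
└─mpatha 253:0 0 2T 0 mpath
  └─container-vol 253:1 0 2T 0 lvm
"""


def group_paths(lsblk_text):
    """Return {mpath_name: {"paths": [...], "volumes": [...]}}."""
    maps = defaultdict(lambda: {"paths": [], "volumes": set()})
    current_disk = None
    current_map = None
    for line in lsblk_text.splitlines():
        # Drop the tree-drawing characters before splitting into columns.
        fields = line.strip().lstrip("└├─│ ").split()
        if len(fields) < 6:
            continue
        name, dev_type = fields[0], fields[5]
        if dev_type == "disk":
            current_disk, current_map = name, None
        elif dev_type == "mpath" and current_disk is not None:
            current_map = name
            maps[name]["paths"].append(current_disk)
        elif dev_type == "lvm" and current_map is not None:
            maps[current_map]["volumes"].add(name)
    return {
        name: {"paths": info["paths"], "volumes": sorted(info["volumes"])}
        for name, info in maps.items()
    }


topology = group_paths(LSBLK)
print(topology)
```

For this listing, the result is a single map, mpatha, with four paths (sdc, sdd, sde, sdf) and one logical volume, container-vol.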
My understanding is that the customer can follow the documented steps to enable multipath during node installation here: https://docs.openshift.com/container-platform/4.12/installing/installing_bare_metal/installing-bare-metal.html#rhcos-enabling-multipath_installing-bare-metal and specify the WWN for the /dev section to avoid naming issues with mpathX devices.
Still, the link isn't completely clear to me and I have a few questions:
- Where must these commands (mpathconf and coreos-installer) be executed? Is it inside the ISO installer, meaning that the customer needs to boot the system from the ISO?
- If it is inside the ISO installer, shouldn't the last steps in the docs say to reboot the system? My understanding is that the reboot is needed for the node to pick up the Ignition config and join the cluster.
- Where does the node take the Ignition config from to join the cluster? I don't see --ignition-url mentioned in the coreos-installer command that specifies the multipath device where the OS will be installed.
- How, if at all possible, would one define two different devices at installation time for RHCOS? The main device would be local, as it is now on the customer's node, and the second device would be the multipath device to mount at '/var/lib/containers'.
Version-Release number of selected component (if applicable):
- OCP 4.12
How reproducible:
- All the time on customer's environment
Steps to Reproduce:
1. Create a MachineConfig and declare an LVM device to be mounted at /var/lib/containers, following the steps from https://access.redhat.com/solutions/4952011
2. Apply MC to a node
Actual results:
- The node does not start properly: the LVM device times out and /var/lib/containers is not mounted on the separate storage device
Expected results:
- The MachineConfig is applied to the node and the separate storage device (multipath + LVM) is mounted at /var/lib/containers
Additional info:
- MachineConfig spec:
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
      - contents: |
          [Unit]
          Description=Make File System on /dev/mapper/container-vol
          DefaultDependencies=no
          BindsTo=dev-mapper-container-vol.device
          After=dev-mapper-container-vol.device var.mount
          Before=systemd-fsck@dev-mapper-container-vol.service
          [Service]
          Type=oneshot
          RemainAfterExit=yes
          ExecStart=-/bin/bash -c "/bin/rm -rf /var/lib/containers/*"
          ExecStart=/usr/lib/systemd/systemd-makefs xfs /dev/mapper/container-vol
          TimeoutSec=0
          [Install]
          WantedBy=var-lib-containers.mount
        enabled: true
        name: systemd-mkfs@dev-mapper-container-vol.service
      - contents: |
          [Unit]
          Description=Mount /dev/mapper/container-vol to /var/lib/containers
          Before=local-fs.target
          Requires=systemd-mkfs@dev-mapper-container-vol.service
          After=systemd-mkfs@dev-mapper-container-vol.service
          [Mount]
          What=/dev/mapper/container-vol
          Where=/var/lib/containers
          Type=xfs
          Options=defaults,prjquota
          [Install]
          WantedBy=local-fs.target
        enabled: true
        name: var-lib-containers.mount
      - contents: |
          [Unit]
          Description=Restore recursive SELinux security contexts
          DefaultDependencies=no
          After=var-lib-containers.mount
          Before=crio.service
          [Service]
          Type=oneshot
          RemainAfterExit=yes
          ExecStart=/sbin/restorecon -R /var/lib/containers/
          TimeoutSec=0
          [Install]
          WantedBy=multi-user.target graphical.target
        enabled: true
        name: restorecon-var-lib-containers.service
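One detail that may be worth checking in the units above: systemd derives device and mount unit names from paths by escaping them, so that '/' becomes '-' and a literal '-' inside a path component is written as \x2d. The sketch below is a simplified Python re-implementation of that path-escaping rule. The function name systemd_escape_path is hypothetical, and the sketch does not handle '.' or '..' path components; on a real node, systemd-escape --path --suffix=device gives the authoritative answer. Under this rule, the name dev-mapper-container-vol.device would correspond to /dev/mapper/container/vol rather than /dev/mapper/container-vol. This is offered as a possible line of investigation, not a confirmed root cause.

```python
def systemd_escape_path(path: str, suffix: str = "") -> str:
    """Simplified sketch of systemd's path escaping for unit names.

    Assumption: the path is absolute and contains no '.' or '..' components.
    """
    # Drop leading, trailing, and duplicate slashes.
    parts = [p for p in path.split("/") if p]
    if not parts:
        name = "-"  # the root directory is escaped as a single dash
    else:
        joined = "/".join(parts)
        out = []
        for i, byte in enumerate(joined.encode("utf-8")):
            ch = chr(byte)
            if ch == "/":
                out.append("-")
            elif (ch.isascii() and ch.isalnum()) or ch in ":_" or (ch == "." and i > 0):
                out.append(ch)
            else:
                # Everything else, including '-' itself, becomes \xXX.
                out.append(f"\\x{byte:02x}")
        name = "".join(out)
    return f"{name}.{suffix}" if suffix else name


device_unit = systemd_escape_path("/dev/mapper/container-vol", "device")
mount_unit = systemd_escape_path("/var/lib/containers", "mount")
print(device_unit)  # dev-mapper-container\x2dvol.device
print(mount_unit)   # var-lib-containers.mount
```

The mount unit name in the MachineConfig matches the escaped form of /var/lib/containers, since that path contains no dashes; the device unit name is the one that differs from the escaped form of /dev/mapper/container-vol.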