Description of problem:
After upgrading the cluster from 4.12 to 4.13, nodes boot into emergency mode with the error `blockdev: cannot open /dev/dasda2: No such file or directory`.

The sosreport collected after a successful boot shows the following symlinks in /dev/disk/by-label:

[sosreport]$ less sos_commands/block/ls_-lanR_.dev
[...]
/dev/disk/by-label:
total 0
drwxr-xr-x. 2 0 0 100 Apr 24 10:09 .
drwxr-xr-x. 7 0 0 140 Apr 24 10:09 ..
lrwxrwxrwx. 1 0 0 12 Apr 24 10:09 boot -> ../../dasda1
lrwxrwxrwx. 1 0 0 12 Apr 24 10:09 crypt_rootfs -> ../../dasda2
lrwxrwxrwx. 1 0 0 10 Apr 24 10:09 root -> ../../dm-0 <<----------

The command output collected in emergency mode during a failed boot shows that the "root -> ../../dm-0" link had not been set up in the by-label directory, even though the /dev/dm-0 device already existed by the time the boot process failed.

Command output from emergency mode:

[Console logs]$ less 0200-worker-3-emergency-mode.txt
[...]
11:56:41 ls -l /dev/disk/by-label/
11:56:42 total 0
11:56:42 lrwxrwxrwx 1 root root 12 Apr 26 08:11 boot -> ../../dasda1
11:56:42 lrwxrwxrwx 1 root root 12 Apr 26 08:11 crypt_rootfs -> ../../dasda2
<<---------- "root -> ../../dm-0" symlink is missing

After multiple retries, the node boots successfully.
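The comparison between the two listings above could be scripted when collecting data from further failed boots. The following is only a minimal sketch: `missing_labels` is a hypothetical helper name, not part of any existing tool, and it assumes a POSIX-compatible shell is available in the environment where it is run.

```shell
#!/bin/sh
# Hypothetical helper (not an existing tool): print each expected label
# that has no symlink in the given directory.
# Usage: missing_labels DIR LABEL...
missing_labels() {
    dir=$1
    shift
    for label in "$@"; do
        # -L is true for a symlink even when its target does not exist,
        # so this checks only whether the link itself was created.
        if [ ! -L "$dir/$label" ]; then
            echo "$label"
        fi
    done
}
```

For example, `missing_labels /dev/disk/by-label boot crypt_rootfs root` would print `root` in the failed-boot state shown in the emergency-mode output, and nothing in the state captured by the sosreport.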
Version-Release number of selected component (if applicable):
4.13.36
How reproducible:
NA
Steps to Reproduce:
1. Upgrade the cluster to 4.13.
2. Check that the master and worker nodes boot.
3. Observe whether any nodes boot into emergency mode, and collect the console logs.
Actual results:
Node went into emergency mode
Expected results:
Node should boot successfully without any issue.
Additional info:
The customer uses z/VM for VM provisioning.
- blocks
  - OCPBUGS-35973 After upgrading to 4.13 from 4.12 one of the worker node went into emergency mode. (Closed)
- is cloned by
  - OCPBUGS-35973 After upgrading to 4.13 from 4.12 one of the worker node went into emergency mode. (Closed)
  - OCPBUGS-35988 After upgrading to 4.13 from 4.12 one of the worker node went into emergency mode. (Closed)
  - OCPBUGS-35989 After upgrading to 4.13 from 4.12 one of the worker node went into emergency mode. (Closed)
  - OCPBUGS-35990 After upgrading to 4.13 from 4.12 one of the worker node went into emergency mode. (Closed)
- links to
  - RHEA-2024:3718 OpenShift Container Platform 4.17.z bug fix update