Type: Bug
Resolution: Done-Errata
Priority: Major
Fix Version/s: None
Affects Version/s: 4.13.z
Component/s: RHCOS
Labels:
- FastFix
- Node
- OCP-4.13
- RHCOS
- osintegration

Severity:
Critical
Regression:
No
Story Points:
5
Sprint:
254 - Integration & Delivery, 255 - Integration & Delivery
sprint_count:
2
Blocked:
False
Blocked Reason:

None
Release Note Text:

Previously, a bug in the `growpart` utility caused a LUKS device to become locked so that it could not be opened. This prevented the system from booting and caused it to enter emergency mode. With this release, the call to the `growpart` utility is removed and the system boots successfully. (link:https://issues.redhat.com/browse/OCPBUGS-33124[*OCPBUGS-33124*])
Release Note Type:
Bug Fix
Release Note Status:
Done
Target Version:

4.17.0
Target Backport Versions:

4.13.z, 4.14.z, 4.15.z, 4.16.z

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:
PX Priority Data:

Description of problem:

After upgrading the cluster from 4.12 to 4.13, nodes boot into emergency mode with the following error:
`blockdev: cannot open /dev/dasda2: No such file or directory`

The sosreport collected after a successful boot shows that the following symlinks were set up in /dev/disk/by-label:


	[sosreport]$ less sos_commands/block/ls_-lanR_.dev 
	[...]
	/dev/disk/by-label:
	total 0
	drwxr-xr-x. 2 0 0 100 Apr 24 10:09 .
	drwxr-xr-x. 7 0 0 140 Apr 24 10:09 ..
	lrwxrwxrwx. 1 0 0  12 Apr 24 10:09 boot -> ../../dasda1
	lrwxrwxrwx. 1 0 0  12 Apr 24 10:09 crypt_rootfs -> ../../dasda2
	lrwxrwxrwx. 1 0 0  10 Apr 24 10:09 root -> ../../dm-0                   <<----------


The command output collected from emergency mode during the failed boot shows that the "root -> ../../dm-0" link was not set up in the by-label directory, although the /dev/dm-0 device already existed by the time the boot process failed:

Command outputs from emergency mode:


	[Console logs]$ less 0200-worker-3-emergency-mode.txt 
	[...]
	11:56:41 ls -l /dev/disk/by-label/                                              
	11:56:42 total 0
	11:56:42 lrwxrwxrwx 1 root root 12 Apr 26 08:11 boot -> ../../dasda1            
	11:56:42 lrwxrwxrwx 1 root root 12 Apr 26 08:11 crypt_rootfs -> ../../dasda2    		<<---------- "root -> ../../dm-0" symlink is missing


After multiple retries, the node boots successfully.
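
The comparison between the two listings above can be sketched as a small shell check. This is only an illustration, not part of the fix: the function name `check_labels` is hypothetical, and the list of expected labels (`boot`, `crypt_rootfs`, `root`) is taken from the sosreport listing in this report and may differ on other systems.

```shell
# Hypothetical helper: report which expected by-label symlinks are missing.
# Usage: check_labels [DIR]   (DIR defaults to /dev/disk/by-label)
check_labels() {
    dir="${1:-/dev/disk/by-label}"
    missing=""
    # Labels assumed from the sosreport listing in this report.
    for label in boot crypt_rootfs root; do
        # -L is true when the path is a symbolic link, even a dangling one.
        if [ ! -L "$dir/$label" ]; then
            missing="$missing $label"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all labels present"
    return 0
}
```

Run from the emergency shell as `check_labels`, this would be expected to flag `root` in the failing state described above.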

Version-Release number of selected component (if applicable):

4.13.36

How reproducible:

NA

Steps to Reproduce:

    1. Upgrade the cluster to 4.13.
    2. Check that the master and worker nodes boot.
    3. Observe whether any nodes boot into emergency mode, and collect their console logs.
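
For step 3, a minimal sketch for scanning already collected console logs for the error quoted in the description. The function name `find_failed_boots` is hypothetical, and it assumes the logs are plain-text files; the search string is the prefix of the `blockdev` error reported above.

```shell
# Hypothetical helper: print the names of console log files that contain
# the blockdev error from this report.
# Usage: find_failed_boots FILE...
find_failed_boots() {
    # -l prints only the names of matching files;
    # -F treats the pattern as a fixed string rather than a regex.
    grep -lF 'blockdev: cannot open /dev/' "$@"
}
```

For example, `find_failed_boots *.txt` in the directory holding the console logs lists the files from boots that hit the error, and returns a non-zero status if none did.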

Actual results:

Node went into emergency mode

Expected results:

Node should boot successfully without any issue.

Additional info:

The customer uses z/VM for VM provisioning.

blocks

OCPBUGS-35973 After upgrading to 4.13 from 4.12 one of the worker node went into emergency mode.

Closed

is cloned by

OCPBUGS-35973 After upgrading to 4.13 from 4.12 one of the worker node went into emergency mode.

Closed

OCPBUGS-35988 After upgrading to 4.13 from 4.12 one of the worker node went into emergency mode.

Closed

OCPBUGS-35989 After upgrading to 4.13 from 4.12 one of the worker node went into emergency mode.

Closed

OCPBUGS-35990 After upgrading to 4.13 from 4.12 one of the worker node went into emergency mode.

Closed

links to

https://access.redhat.com/solutions/7072328

openshift/os#1522: OCPBUGS-33124: coreos-cryptfs: drop growpart call

RHEA-2024:3718 OpenShift Container Platform 4.17.z bug fix update


Assignee: Madhu Pillai

Reporter: Sourav Jain

QA Contact: Michael Nguyen

Votes: 0

Watchers: 17

Created: 2024/04/30 9:43 AM

Updated: 2024/10/01 5:34 PM

Resolved: 2024/10/01 5:34 PM
