Bug
Resolution: Unresolved
Critical
odf-4.12
None
Description of problem (please be as detailed as possible and provide log snippets):
After a cluster-wide reboot for cert auth, the ODF node reboot removed the DASD partition and all 3 OSDs were lost.
The customer followed this IBM documentation to partition the DASD:
> https://www.ibm.com/docs/en/linux-on-systems?topic=architecture-storage
> See Section “4.1.2 Steps specific for DASD devices”
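For reference, the partitioning in that section amounts to roughly the following (a sketch only, run on the node; the device name /dev/dasde is assumed from the lsblk output below, and the exact flags should be checked against the IBM doc):
# Low-level format the DASD with the compatible disk layout (cdl) and 4096-byte blocks
dasdfmt -b 4096 -d cdl -yp /dev/dasde
# Create a single partition spanning the whole device (auto mode)
fdasd -a /dev/dasde
# Verify the new partition shows up as dasde1
lsblk /dev/dasde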
ODF deployed successfully with LSO and the OSDs mapped to dasde1.
To use host binaries, run `chroot /host`
NAME      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop1       7:1    0 811.6G  0 loop
dasda      94:0    0 103.2G  0 disk
|-dasda1   94:1    0   384M  0 part /host/boot
`-dasda2   94:2    0 102.8G  0 part /host/sysroot
dasde      94:16   0 811.6G  0 disk
`-dasde1   94:17   0 811.6G  0 part
After the cluster-wide reboot for cert auth, the OSD pods failed to start:
MapVolume.EvalHostSymlinks failed for volume "local-pv-ef04e88d" : lstat /dev/disk/by-id/ccw-IBM.750000000KHF61.baee.40-part1: no such file or directory
Events log:
3m27s Warning FailedMapVolume pod/rook-ceph-osd-0-59c9db848-5rp9f MapVolume.EvalHostSymlinks failed for volume "local-pv-ef04e88d" : lstat /dev/disk/by-id/ccw-IBM.750000000KHF61.baee.40-part1: no such file or directory
23m Warning FailedMount pod/rook-ceph-osd-0-59c9db848-5rp9f (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[ocs-deviceset-odf-cluster-storage-0-data-1wgxd7], unattached volumes=[ocs-deviceset-odf-cluster-storage-0-data-1wgxd7 ocs-deviceset-odf-cluster-storage-0-data-1wgxd7-bridge kube-api-access-25kpz rook-data rook-config-override rook-ceph-log rook-ceph-crash run-udev]: timed out waiting for the condition
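As a sanity check (a sketch; the PV name is taken from the events above), the path the OSD pod is waiting on can be read from the LSO-provisioned PV:
# Print the device path the local PV points at
# (expected: /dev/disk/by-id/ccw-IBM.750000000KHF61.baee.40-part1)
oc get pv local-pv-ef04e88d -o jsonpath='{.spec.local.path}{"\n"}'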
lsblk on the node after the reboot (dasde1 is gone):
NAME      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
dasda      94:0    0 103.2G  0 disk
|-dasda1   94:1    0   384M  0 part /boot
`-dasda2   94:2    0 102.8G  0 part /sysroot
dasde      94:16   0 811.6G  0 disk
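To confirm that the partition itself is gone (and not just a stale udev symlink), something like the following can be run on the affected node (the node name is a placeholder; the commands are a sketch):
# Debug shell on the node, then switch to the host namespace
oc debug node/<node-name>
chroot /host
# The ccw by-id symlink for the partition should now be missing
ls -l /dev/disk/by-id/ | grep ccw
# Print the DASD volume label and partition table; no partition is listed for dasde
fdasd -p /dev/dasde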
Version of all relevant components (if applicable):
OCP/ODF 4.12
Does this issue impact your ability to continue to work with the product (please explain in detail what the user impact is)?
Yes. A node reboot destroys the OSD path to dasde1 and all 3 OSDs go down.
Is there any workaround available to the best of your knowledge?
No
Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
5
Is this issue reproducible?
Yes, on reboot of the node.
Can this issue be reproduced from the UI?
If this is a regression, please provide more details to justify this:
Steps to Reproduce:
1. Partition the DASD following the IBM documentation referenced above (section 4.1.2) and deploy ODF with LSO using the dasde1 partition.
2. Reboot the ODF node (e.g., as part of a cluster-wide reboot for cert auth).
3. Check the OSD pods and the DASD partition after the node comes back.
Actual results:
The dasde1 partition no longer exists after the reboot and the OSD pods fail with FailedMapVolume (lstat /dev/disk/by-id/ccw-IBM.750000000KHF61.baee.40-part1: no such file or directory).
Expected results:
The DASD partition (dasde1) persists across node reboots and the OSDs come back online.
Additional info: