Data Foundation Bugs / DFBUGS-1655

[GSS] OSD disks are not being properly cleaned in ODF 4.18

    • Type: Bug
    • Resolution: Won't Do
    • Affects Version: odf-4.18
      Description of problem:

In ODF 4.18 we have identified two different issues related to the cleanup of Ceph metadata on the OSD disks.
      
      First issue:
      
      Annotation "uninstall.ocs.openshift.io/cleanup-policy: delete" is not being honored. After deleting the StorageSystem (and as such the StorageCluster), CEPH metadata is not getting removed. We can connect to the node and run this command to verify metadata is still there:
      
[root@openshift-ctlplane-0 ~]# podman run --rm -ti --privileged --device /dev/vdb --entrypoint ceph-volume quay.io/ceph/ceph:v19 raw list /dev/vdb --format json
{
    "3460d04a-0dbc-40a8-9947-ac8192b65d77": {
        "ceph_fsid": "b392dd9e-be7a-4e6e-bb24-78a962bb70be",
        "device": "/dev/vdb",
        "osd_id": 1,
        "osd_uuid": "3460d04a-0dbc-40a8-9947-ac8192b65d77",
        "type": "bluestore"
    }
}
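
For reference, a minimal sketch of how the cleanup policy is typically set before uninstalling (the StorageCluster name "ocs-storagecluster" and the namespace "openshift-storage" below are assumptions, adjust them to the environment):

# Assumed object name and namespace, for illustration only
oc annotate storagecluster ocs-storagecluster -n openshift-storage uninstall.ocs.openshift.io/cleanup-policy="delete" --overwrite
# Verify the annotation is present before deleting the StorageSystem/StorageCluster
oc get storagecluster ocs-storagecluster -n openshift-storage -o yaml | grep cleanup-policy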
      
      
      Second issue:
      
Manually cleaning the Ceph metadata following the upstream docs [1] does not remove it either:
      
      [root@openshift-ctlplane-0 ~]# wipefs -a -f /dev/vdb
      [root@openshift-ctlplane-0 ~]# echo $?
      0
      [root@openshift-ctlplane-0 ~]# podman run --rm -ti --privileged --device /dev/vdb --entrypoint ceph-volume quay.io/ceph/ceph:v19 raw list /dev/vdb --format json
      {    "3460d04a-0dbc-40a8-9947-ac8192b65d77": {
              "ceph_fsid": "b392dd9e-be7a-4e6e-bb24-78a962bb70be",
              "device": "/dev/vdb",
              "osd_id": 1,        
      "osd_uuid": "3460d04a-0dbc-40a8-9947-ac8192b65d77",
              "type": "bluestore"    }
      }
      
      [root@openshift-ctlplane-0 ~]# DISK="/dev/vdb"
      [root@openshift-ctlplane-0 ~]# sgdisk --zap-all $DISK
Creating new GPT entries in memory.
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
      [root@openshift-ctlplane-0 ~]# dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
      100+0 records in
      100+0 records out
      104857600 bytes (105 MB, 100 MiB) copied, 0.0795897 s, 1.3 GB/s
      [root@openshift-ctlplane-0 ~]# blkdiscard $DISK
      [root@openshift-ctlplane-0 ~]# podman run --rm -ti --privileged --device /dev/vdb --entrypoint ceph-volume quay.io/ceph/ceph:v19 raw list /dev/vdb --format json
      {    "3460d04a-0dbc-40a8-9947-ac8192b65d77": {
              "ceph_fsid": "b392dd9e-be7a-4e6e-bb24-78a962bb70be",
              "device": "/dev/vdb",
              "osd_id": 1,        
      "osd_uuid": "3460d04a-0dbc-40a8-9947-ac8192b65d77",
              "type": "bluestore"    }
      }
      
In our case it takes around 15 GB of zeroes to clean the metadata:
      
      [root@openshift-ctlplane-0 ~]# dd if=/dev/zero of=/dev/vdb bs=1G count=15 oflag=direct,dsync status=progress
      16106127360 bytes (16 GB, 15 GiB) copied, 30 s, 542 MB/s 
      15+0 records in
      15+0 records out
      16106127360 bytes (16 GB, 15 GiB) copied, 29.7043 s, 542 MB/s
      [root@openshift-ctlplane-0 ~]# podman run --rm -ti --privileged --device /dev/vdb --entrypoint ceph-volume quay.io/ceph/ceph:v19 raw list /dev/vdb --format json
      {}
      
      [1] https://rook.io/docs/rook/v1.14/Getting-Started/ceph-teardown/#zapping-devices
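
Since in our case ~15 GB of zeroes was enough, a brute-force workaround is to zero the entire device before re-checking with ceph-volume. A minimal sketch, assuming the disk can be fully overwritten (this destroys all data on the device and can take a long time on large disks):

DISK="/dev/vdb"
sgdisk --zap-all "$DISK"
# dd exits non-zero when it reaches the end of the device, which is expected here
dd if=/dev/zero of="$DISK" bs=1M oflag=direct,dsync status=progress || true
blkdiscard "$DISK"
podman run --rm -ti --privileged --device "$DISK" --entrypoint ceph-volume quay.io/ceph/ceph:v19 raw list "$DISK" --format json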

We have a document with further information and reproduction steps: https://docs.google.com/document/d/1HBej5PCPpFibynlrnJHt12rgctr90WUk7EEj_Kkmgq4/edit?tab=t.0

      Version-Release number of selected component (if applicable):

      v4.18.0-112.stable    

      How reproducible:

      Always

      Steps to Reproduce:

Described above and in the linked document.
          

      Actual results:

      OSD disk is not cleaned and OSD prepare fails

      Expected results:

      OSD disk is cleaned and OSD prepare succeeds

      Additional info:

      We have an environment where this can be reproduced if required.
      
      More info: https://docs.google.com/document/d/1HBej5PCPpFibynlrnJHt12rgctr90WUk7EEj_Kkmgq4/edit?tab=t.0

Santosh Pillai
Mario Vazquez Cebrian (mavazque@redhat.com)
Federico Ferrando
Wei Duan