Type: Bug
Resolution: Unresolved
Priority: Critical
Affects Version/s: odf-4.17
Fix Version/s: None
Description of problem (please be as detailed as possible and provide log
snippets):
Uninstalling the ODF operator together with its operands fails on a vSphere cluster.
The cluster is fresh and no workloads were running.
The StorageCluster is stuck in the Deleting phase indefinitely.
oc get storagecluster -w
NAME AGE PHASE EXTERNAL CREATED AT VERSION
ocs-storagecluster 14h Deleting 2024-09-25T16:50:52Z 4.17.0
oc describe storagecluster ocs-storagecluster
Name: ocs-storagecluster
Namespace: openshift-storage
Labels: <none>
Annotations: uninstall.ocs.openshift.io/cleanup-policy: delete
uninstall.ocs.openshift.io/mode: graceful
API Version: ocs.openshift.io/v1
Kind: StorageCluster
Metadata:
Creation Timestamp: 2024-09-25T16:50:52Z
Deletion Grace Period Seconds: 0
Deletion Timestamp: 2024-09-26T07:14:00Z
Finalizers:
storagecluster.ocs.openshift.io
Generation: 5
Owner References:
API Version: odf.openshift.io/v1alpha1
Kind: StorageSystem
Name: ocs-storagecluster-storagesystem
UID: fa1694f6-a1b2-4f88-b047-310cd786d496
Resource Version: 521474
UID: fa516830-1274-4416-bd0b-31d8c12ef1d6
Spec:
Arbiter:
Enable Ceph Tools: true
Encryption:
Key Rotation:
Schedule: @weekly
Kms:
External Storage:
Managed Resources:
Ceph Block Pools:
Ceph Cluster:
Ceph Config:
Ceph Dashboard:
Ceph Filesystems:
Data Pool Spec:
Application:
Erasure Coded:
Coding Chunks: 0
Data Chunks: 0
Mirroring:
Quotas:
Replicated:
Size: 0
Status Check:
Mirror:
Ceph Non Resilient Pools:
Count: 1
Resources:
Volume Claim Template:
Metadata:
Spec:
Resources:
Status:
Ceph Object Store Users:
Ceph Object Stores:
Ceph RBD Mirror:
Daemon Count: 1
Ceph Toolbox:
Mirroring:
Storage Device Sets:
Config:
Count: 1
Data PVC Template:
Metadata:
Spec:
Access Modes:
ReadWriteOnce
Resources:
Requests:
Storage: 100Gi
Storage Class Name: thin-csi-odf
Volume Mode: Block
Status:
Name: ocs-deviceset
Placement:
Portable: true
Prepare Placement:
Replica: 3
Resources:
Status:
Conditions:
Last Heartbeat Time: 2024-09-25T16:50:53Z
Last Transition Time: 2024-09-25T16:50:53Z
Message: Version check successful
Reason: VersionMatched
Status: False
Type: VersionMismatch
Last Heartbeat Time: 2024-09-26T07:13:48Z
Last Transition Time: 2024-09-25T16:54:25Z
Message: Reconcile completed successfully
Reason: ReconcileCompleted
Status: True
Type: ReconcileComplete
Last Heartbeat Time: 2024-09-26T07:13:48Z
Last Transition Time: 2024-09-25T16:55:18Z
Message: Reconcile completed successfully
Reason: ReconcileCompleted
Status: True
Type: Available
Last Heartbeat Time: 2024-09-26T07:13:48Z
Last Transition Time: 2024-09-25T16:55:18Z
Message: Reconcile completed successfully
Reason: ReconcileCompleted
Status: False
Type: Progressing
Last Heartbeat Time: 2024-09-26T07:13:48Z
Last Transition Time: 2024-09-25T16:50:53Z
Message: Reconcile completed successfully
Reason: ReconcileCompleted
Status: False
Type: Degraded
Last Heartbeat Time: 2024-09-26T07:13:48Z
Last Transition Time: 2024-09-25T16:55:18Z
Message: Reconcile completed successfully
Reason: ReconcileCompleted
Status: True
Type: Upgradeable
Current Mon Count: 3
Default Ceph Device Class: ssd
Failure Domain: rack
Failure Domain Key: topology.rook.io/rack
Failure Domain Values:
rack0
rack1
rack2
Images:
Ceph:
Actual Image: registry.redhat.io/rhceph/rhceph-7-rhel9@sha256:75bd8969ab3f86f2203a1ceb187876f44e54c9ee3b917518c4d696cf6cd88ce3
Desired Image: registry.redhat.io/rhceph/rhceph-7-rhel9@sha256:75bd8969ab3f86f2203a1ceb187876f44e54c9ee3b917518c4d696cf6cd88ce3
Noobaa Core:
Actual Image: registry.redhat.io/odf4/mcg-core-rhel9@sha256:36b0f91c9f82be61310a36a8dca20a340e92318b1bc98fee42e48369bcf51fb8
Desired Image: registry.redhat.io/odf4/mcg-core-rhel9@sha256:36b0f91c9f82be61310a36a8dca20a340e92318b1bc98fee42e48369bcf51fb8
Noobaa DB:
Actual Image: registry.redhat.io/rhel9/postgresql-15@sha256:10475924583aba63c50b00337b4097cdad48e7a8567fd77f400b278ad1782bfc
Desired Image: registry.redhat.io/rhel9/postgresql-15@sha256:10475924583aba63c50b00337b4097cdad48e7a8567fd77f400b278ad1782bfc
Kms Server Connection:
Node Topologies:
Labels:
kubernetes.io/hostname:
compute-0
compute-1
compute-2
topology.rook.io/rack:
rack0
rack1
rack2
Phase: Deleting
Related Objects:
API Version: ceph.rook.io/v1
Kind: CephCluster
Name: ocs-storagecluster-cephcluster
Namespace: openshift-storage
Resource Version: 521342
UID: c8353689-af9b-4135-b5bb-0825dbb8dbe1
API Version: noobaa.io/v1alpha1
Kind: NooBaa
Name: noobaa
Namespace: openshift-storage
Resource Version: 521276
UID: a63054ce-f16c-425c-8e53-d41f84d83919
Version: 4.17.0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning UninstallPending 31m controller_storagecluster uninstall: Waiting on NooBaa system noobaa to be deleted
Warning UninstallPending 31m controller_storagecluster Uninstall: Waiting for Ceph RGW Route ocs-storagecluster-cephobjectstore to be deleted
Warning UninstallPending 31m controller_storagecluster Uninstall: Waiting for Ceph RGW Route ocs-storagecluster-cephobjectstore-secure to be deleted
Warning UninstallPending 31m controller_storagecluster uninstall: Waiting for CephObjectStoreUser ocs-storagecluster-cephobjectstoreuser to be deleted
Warning UninstallPending 31m controller_storagecluster uninstall: Waiting for CephObjectStoreUser prometheus-user to be deleted
Warning UninstallPending 31m controller_storagecluster uninstall: Waiting for CephObjectStore ocs-storagecluster-cephobjectstore to be deleted
Warning UninstallPending 31m controller_storagecluster uninstall: Waiting for subvolumegroup ocs-storagecluster-cephfilesystem-csi to be deleted
Warning UninstallPending 31m controller_storagecluster uninstall: Waiting for CephFileSystem ocs-storagecluster-cephfilesystem to be deleted
Warning UninstallPending 31m controller_storagecluster uninstall: Waiting for CephCluster to be deleted
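The events above list the exact CRs the uninstall is waiting on, so a first diagnostic step is to check whether those resources still exist and which finalizers are holding them. The resource names below are taken from the events; the jsonpath query is a generic oc sketch, not ODF-specific tooling:

```shell
# Check which of the resources named in the uninstall events still exist
oc -n openshift-storage get noobaa,cephcluster,cephobjectstore,cephobjectstoreuser,cephfilesystem

# Show the finalizers still attached to the CephCluster (a common blocker)
oc -n openshift-storage get cephcluster ocs-storagecluster-cephcluster \
    -o jsonpath='{.metadata.finalizers}{"\n"}'
```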
oc logs odf-operator-controller-manager-66dd779d8b-cpfhv
2024-09-26T07:35:52Z INFO controllers.StorageSystem Waiting for deletion {"instance":
{"name":"ocs-storagecluster-storagesystem","namespace":"openshift-storage"}, "Kind": "storagecluster.ocs.openshift.io/v1", "Name": "ocs-storagecluster"}
2024-09-26T07:35:52Z ERROR Reconciler error {"controller": "storagesystem", "controllerGroup": "odf.openshift.io", "controllerKind": "StorageSystem", "StorageSystem":
, "namespace": "openshift-storage", "name": "ocs-storagecluster-storagesystem", "reconcileID": "4261fd80-a26f-49e8-a0a4-1c23f8a3e4a6", "error": "Waiting for storagecluster.ocs.openshift.io/v1 ocs-storagecluster to be deleted"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:263
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:224
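The odf-operator log only shows the StorageSystem reconciler waiting on the StorageCluster, so the blockage sits further down the chain. A hedged way to confirm this from the CLI (generic oc commands; the deployment name ocs-operator is an assumption based on the default install):

```shell
# Confirm the StorageCluster still carries a deletion timestamp and its finalizer
oc -n openshift-storage get storagecluster ocs-storagecluster \
    -o jsonpath='{.metadata.deletionTimestamp}{" "}{.metadata.finalizers}{"\n"}'

# Follow the ocs-operator log, which owns that finalizer, for the next "Waiting" message
oc -n openshift-storage logs deployment/ocs-operator | grep -i uninstall
```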
Version of all relevant components (if applicable):
OC version:
Client Version: 4.16.11
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: 4.17.0-0.nightly-2024-09-25-061141
Kubernetes Version: v1.30.4
OCS version:
ocs-operator.v4.17.0-108.stable OpenShift Container Storage 4.17.0-108.stable Succeeded
Cluster version:
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.17.0-0.nightly-2024-09-25-061141 True False 15h Cluster version is 4.17.0-0.nightly-2024-09-25-061141
Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Is there any workaround available to the best of your knowledge?
Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
Is this issue reproducible?
1/1
Can this issue be reproduced from the UI?
yes
If this is a regression, please provide more details to justify this:
Steps to Reproduce:
1. Install a cluster on vSphere and ensure both Ceph and object storage are available.
2. Open the management console and uninstall the ODF operator, selecting the option to delete the operands as well.
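For reference, the UI uninstall in step 2 roughly corresponds to the following CLI sequence; the annotation values are the ones visible in the StorageCluster describe output above, and the resource names are from this cluster:

```shell
# Cleanup policy and mode, matching the annotations shown on the StorageCluster
oc -n openshift-storage annotate storagecluster ocs-storagecluster \
    uninstall.ocs.openshift.io/cleanup-policy=delete \
    uninstall.ocs.openshift.io/mode=graceful --overwrite

# Delete the StorageSystem, which owns the StorageCluster
oc -n openshift-storage delete storagesystem ocs-storagecluster-storagesystem
```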
Actual results:
The uninstall is stuck indefinitely.
Expected results:
The uninstall finishes, leaving no deployments, StorageCluster, or StorageSystem resources behind.
Additional info: