Description of problem - Provide a detailed description of the issue encountered, including logs/command-output snippets and screenshots if the issue is observed in the UI: The StorageCluster is stuck in Progressing on a Bare Metal cluster. It appears to wait indefinitely for the ODF StorageClasses to be created, but they never are.
The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc. Please clarify if it is platform agnostic deployment), (IPI/UPI): Bare Metal
The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc): Internal-Attached
The version of all relevant components (OCP, ODF, RHCS, ACM whichever is applicable):
OCP - 4.18.0-ec.3
ODF - 4.18.0-45.stable (quay.io/rhceph-dev/ocs-registry:latest-stable-4.18)
Does this issue impact your ability to continue to work with the product? Yes
Is there any workaround available to the best of your knowledge? No. I tried creating the StorageClasses before the StorageCluster, but that did not seem to help.
Can this issue be reproduced? If so, please provide the hit rate: 99%
Can this issue be reproduced from the UI? I believe so
If this is a regression, please provide more details to justify this:
Steps to Reproduce:
1. Deploy ODF on Bare Metal cluster
2. While waiting for the StorageCluster to become available, notice that the ODF StorageClasses are not created
The exact date and time when the issue was observed, including timezone details: Roughly Oct 31st at 11:03 AM Israel time. It may have happened earlier; that is when the related Slack thread started.
Actual results:
StorageCluster is stuck in Progressing:
% oc get storagecluster -A
NAMESPACE           NAME                 AGE     PHASE         EXTERNAL   CREATED AT             VERSION
openshift-storage   ocs-storagecluster   2d21h   Progressing              2024-11-07T13:36:49Z   4.18.0
with this error in its condition:
Message: Error while reconciling: some StorageClasses were skipped while waiting for pre-requisites to be met: [ocs-storagecluster-cephfs,ocs-storagecluster-ceph-rbd]
and indeed no ODF StorageClass is ever created:
% oc get sc
NAME              PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-block-ocs   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  2d22h
Expected results: ODF's StorageClasses should be created and StorageCluster should become available after ~15 minutes
Logs collected and log location: attached must-gather logs
Additional info:
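For anyone re-checking the state described under "Actual results", the following is a minimal sketch, not an official diagnostic. It assumes the namespace and resource name from the `oc get storagecluster -A` output (`openshift-storage`, `ocs-storagecluster`) and that the condition message keeps the format quoted above. The helper name `skipped_storageclasses` is hypothetical and only illustrates how to pull the skipped StorageClass names out of that message.

```shell
#!/usr/bin/env bash
# Namespace and resource name as reported by `oc get storagecluster -A`.
NS="openshift-storage"
CLUSTER="ocs-storagecluster"

# Hypothetical helper: print the bracketed, comma-separated StorageClass names
# from a "some StorageClasses were skipped ..." condition message, one per line.
# Prints nothing if the message does not have that shape.
skipped_storageclasses() {
  printf '%s\n' "$1" \
    | sed -n 's/.*StorageClasses were skipped[^[]*\[\([^]]*\)].*/\1/p' \
    | tr ',' '\n'
}

# Only query the cluster when the `oc` client is installed.
if command -v oc >/dev/null 2>&1; then
  # Print type, status and message of every StorageCluster condition.
  oc get storagecluster "$CLUSTER" -n "$NS" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}' \
    || echo "could not read StorageCluster conditions" >&2
fi
```

Passing the message from the condition to `skipped_storageclasses` yields the names to look up individually with `oc get sc <name>`.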