  1. Data Foundation Bugs
  2. DFBUGS-818

4.18: StorageCluster is stuck on Progressing on BareMetal cluster


    • Bug
    • Resolution: Unresolved
    • Critical
    • None
    • odf-4.18
    • rook
    • None
    • False
    • False
    • Critical
    • None

      Description of problem - Provide a detailed description of the issue encountered, including logs/command-output snippets and screenshots if the issue is observed in the UI: The StorageCluster is stuck in the Progressing phase on a Bare Metal cluster. It appears to wait indefinitely for the ODF StorageClasses to be created, but they never are.

       

       

      The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc. Please clarify if it is platform agnostic deployment), (IPI/UPI): Bare Metal

       

      The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc): Internal-Attached

       

       

      The version of all relevant components (OCP, ODF, RHCS, ACM whichever is applicable):

      OCP - 4.18.0-ec.3

      ODF - 4.18.0-45.stable (quay.io/rhceph-dev/ocs-registry:latest-stable-4.18)

       

       

      Does this issue impact your ability to continue to work with the product? Yes

       

       

      Is there any workaround available to the best of your knowledge? No. I tried creating the StorageClasses before the StorageCluster, but that did not seem to work.

       

       

      Can this issue be reproduced? If so, please provide the hit rate: 99%

       

       

      Can this issue be reproduced from the UI? I believe so

      If this is a regression, please provide more details to justify this:

      Steps to Reproduce:

      1. Deploy ODF on Bare Metal cluster

      2. While waiting for the StorageCluster to become available, notice that the ODF StorageClasses are not created


      The exact date and time when the issue was observed, including timezone details: Roughly Oct 31st at 11:03 AM Israel time. It may have happened earlier; that is when the related Slack thread started.

       

      Actual results:

      StorageCluster is stuck in Progressing:

      %  oc get storagecluster -A
      NAMESPACE           NAME                 AGE     PHASE         EXTERNAL   CREATED AT             VERSION
      openshift-storage   ocs-storagecluster   2d21h   Progressing              2024-11-07T13:36:49Z   4.18.0 

      with this error in its condition:

          Message:               Error while reconciling: some StorageClasses were skipped while waiting for pre-requisites to be met: [ocs-storagecluster-cephfs,ocs-storagecluster-ceph-rbd] 

      and indeed no ODF StorageClass is ever created:

       % oc get sc
      NAME              PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
      local-block-ocs   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  2d22h
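
      To make the mismatch between the condition message and the `oc get sc` output easy to re-check on other clusters, here is a minimal Python sketch. It only parses text pasted from the two commands above. The function names (`skipped_storageclasses`, `existing_storageclasses`, `missing_storageclasses`) are hypothetical helpers written for this report, not part of ODF or `oc`, and the sketch assumes the condition message keeps the bracketed, comma-separated format shown above.

```python
import re

# Condition message copied from the StorageCluster status in this report.
CONDITION_MESSAGE = (
    "Error while reconciling: some StorageClasses were skipped while waiting "
    "for pre-requisites to be met: "
    "[ocs-storagecluster-cephfs,ocs-storagecluster-ceph-rbd]"
)

# `oc get sc` output copied from this report.
OC_GET_SC_OUTPUT = """\
NAME              PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-block-ocs   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  2d22h
"""


def skipped_storageclasses(message: str) -> list[str]:
    """Extract the bracketed, comma-separated names from the condition message.

    Assumption: the names appear inside the first [...] in the message.
    """
    match = re.search(r"\[([^\]]*)\]", message)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def existing_storageclasses(table: str) -> set[str]:
    """Return the NAME column of `oc get sc` output, skipping the header row."""
    rows = [line.split() for line in table.splitlines() if line.strip()]
    return {row[0] for row in rows[1:]}


def missing_storageclasses(message: str, table: str) -> list[str]:
    """Names reported as skipped that are also absent from the cluster."""
    present = existing_storageclasses(table)
    return [name for name in skipped_storageclasses(message) if name not in present]


if __name__ == "__main__":
    for name in missing_storageclasses(CONDITION_MESSAGE, OC_GET_SC_OUTPUT):
        print(f"missing StorageClass: {name}")
```

      With the values from this report, both skipped StorageClasses are reported as missing, which matches what is observed on the cluster.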
      

       

      Expected results: ODF's StorageClasses should be created and StorageCluster should become available after ~15 minutes

       

      Logs collected and log location: attached must-gather logs

       

      Additional info:

       

              rh-ee-mrudraia Marulasiddaiah Rudraiah
              mperetz@redhat.com Maya Peretz
              Jilju Joy Jilju Joy
              Votes: 0
              Watchers: 25
