Issue type: Bug
Resolution: Done-Errata
Priority: Critical
Fix version: odf-4.16
 
Description of problem:
----------
On a fresh ODF deployment installed in the 'odf-storage' namespace, the nodes are labeled with 'cluster.ocs.openshift.io/odf-storage':
oc get nodes -l cluster.ocs.openshift.io/odf-storage=""
NAME                                       STATUS   ROLES    AGE   VERSION
ip-10-0-0-144.us-west-2.compute.internal   Ready    worker   18h   v1.29.6+aba1e8d
ip-10-0-0-181.us-west-2.compute.internal   Ready    worker   18h   v1.29.6+aba1e8d
ip-10-0-0-45.us-west-2.compute.internal    Ready    worker   21h   v1.29.6+aba1e8d
ip-10-0-0-70.us-west-2.compute.internal    Ready    worker   18h   v1.29.6+aba1e8d
ip-10-0-0-78.us-west-2.compute.internal    Ready    worker   18h   v1.29.6+aba1e8d
ip-10-0-0-95.us-west-2.compute.internal    Ready    worker   21h   v1.29.6+aba1e8d
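For comparison, none of the nodes carry the 'cluster.ocs.openshift.io/openshift-storage' label (the label applied by the workaround below). This check is added for illustration, is not part of the original capture, and is expected to return no nodes on this cluster:
oc get nodes -l cluster.ocs.openshift.io/openshift-storage=""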
That label mismatch triggers a "Not enough nodes found" error on the StorageCluster (screenshot added), and the StorageCluster stays in an Error state:
oc get storagecluster -A
NAMESPACE     NAME                 AGE     PHASE   EXTERNAL   CREATED AT             VERSION
odf-storage   ocs-storagecluster   7m25s   Error              2024-07-31T15:53:39Z   4.16.0
[jenkins@temp-jagent-dosypenk-r217 terraform-vpc-example]$ oc describe storagecluster ocs-storagecluster -n odf-storage
Name:         ocs-storagecluster
Namespace:    odf-storage
Labels:       <none>
Annotations:  uninstall.ocs.openshift.io/cleanup-policy: delete
              uninstall.ocs.openshift.io/mode: graceful
API Version:  ocs.openshift.io/v1
Kind:         StorageCluster
Metadata:
  Creation Timestamp:  2024-07-31T15:53:39Z
  Finalizers:
    storagecluster.ocs.openshift.io
  Generation:  2
  Owner References:
    API Version:     odf.openshift.io/v1alpha1
    Kind:            StorageSystem
    Name:            ocs-storagecluster-storagesystem
    UID:             2dee21a8-8039-4640-8fd1-9e7a669356b6
  Resource Version:  101564
  UID:               1d04f184-50c7-4f6f-9777-0f197a2fc1d1
Spec:
  Arbiter:
  Encryption:
    Key Rotation:
      Schedule:  @weekly
    Kms:
  External Storage:
  Managed Resources:
    Ceph Block Pools:
    Ceph Cluster:
    Ceph Config:
    Ceph Dashboard:
    Ceph Filesystems:
      Data Pool Spec:
        Application:
        Erasure Coded:
          Coding Chunks:  0
          Data Chunks:    0
        Mirroring:
        Quotas:
        Replicated:
          Size:  0
        Status Check:
          Mirror:
    Ceph Non Resilient Pools:
      Count:  1
      Resources:
      Volume Claim Template:
        Metadata:
        Spec:
          Resources:
        Status:
    Ceph Object Store Users:
    Ceph Object Stores:
    Ceph RBD Mirror:
      Daemon Count:  1
    Ceph Toolbox:
  Mirroring:
  Network:
    Connections:
      Encryption:
    Multi Cluster Service:
  Node Topologies:
  Resource Profile:  lean
  Storage Device Sets:
    Config:
    Count:  1
    Data PVC Template:
      Metadata:
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:         2Ti
        Storage Class Name:  gp3-csi
        Volume Mode:         Block
      Status:
    Name:  ocs-deviceset-gp3-csi
    Placement:
    Portable:  true
    Prepare Placement:
    Replica:  3
    Resources:
Status:
  Conditions:
    Last Heartbeat Time:   2024-07-31T15:53:40Z
    Last Transition Time:  2024-07-31T15:53:40Z
    Message:               Version check successful
    Reason:                VersionMatched
    Status:                False
    Type:                  VersionMismatch
    Last Heartbeat Time:   2024-07-31T15:59:08Z
    Last Transition Time:  2024-07-31T15:53:40Z
    Message:               Error while reconciling: Not enough nodes found: Expected 3, found 0
    Reason:                ReconcileFailed
    Status:                False
    Type:                  ReconcileComplete
    Last Heartbeat Time:   2024-07-31T15:53:40Z
    Last Transition Time:  2024-07-31T15:53:40Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                False
    Type:                  Available
    Last Heartbeat Time:   2024-07-31T15:53:40Z
    Last Transition Time:  2024-07-31T15:53:40Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                True
    Type:                  Progressing
    Last Heartbeat Time:   2024-07-31T15:53:40Z
    Last Transition Time:  2024-07-31T15:53:40Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                False
    Type:                  Degraded
    Last Heartbeat Time:   2024-07-31T15:53:40Z
    Last Transition Time:  2024-07-31T15:53:40Z
    Message:               Initializing StorageCluster
    Reason:                Init
    Status:                Unknown
    Type:                  Upgradeable
  Images:
    Ceph:
      Desired Image:  registry.redhat.io/rhceph/rhceph-7-rhel9@sha256:579e5358418e176194812eeab523289a0c65e366250688be3f465f1a633b026d
    Noobaa Core:
      Desired Image:  registry.redhat.io/odf4/mcg-core-rhel9@sha256:5f56419be1582bf7a0ee0b9d99efae7523fbf781a88f8fe603182757a315e871
    Noobaa DB:
      Desired Image:  registry.redhat.io/rhel9/postgresql-15@sha256:5c4cad6de1b8e2537c845ef43b588a11347a3297bfab5ea611c032f866a1cb4e
  Kms Server Connection:
  Phase:    Error
  Version:  4.16.0
Events:     <none>
[jenkins@temp-jagent-dosypenk-r217 terraform-vpc-example]$ oc get nodes -w
NAME                                       STATUS   ROLES    AGE     VERSION
ip-10-0-0-144.us-west-2.compute.internal   Ready    worker   39m     v1.29.6+aba1e8d
ip-10-0-0-181.us-west-2.compute.internal   Ready    worker   39m     v1.29.6+aba1e8d
ip-10-0-0-45.us-west-2.compute.internal    Ready    worker   3h30m   v1.29.6+aba1e8d
ip-10-0-0-70.us-west-2.compute.internal    Ready    worker   41m     v1.29.6+aba1e8d
ip-10-0-0-78.us-west-2.compute.internal    Ready    worker   43m     v1.29.6+aba1e8d
ip-10-0-0-95.us-west-2.compute.internal    Ready    worker   3h37m   v1.29.6+aba1e8d
---------
Workaround:
oc label node -l node-role.kubernetes.io/worker cluster.ocs.openshift.io/openshift-storage=""
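After relabeling, the nodes should match the label query and the StorageCluster should reconcile; verification commands (added for illustration, not captured in the original report):
oc get nodes -l cluster.ocs.openshift.io/openshift-storage=""
oc get storagecluster -n odf-storage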
---------
Version-Release number of selected component (if applicable):
ODF full_version: 4.16.0-137
---------
How reproducible:
Install ODF on a ROSA HCP OCP 4.16 cluster.
Steps to Reproduce:
1. Install ODF 4.16 on a ROSA HCP OCP 4.16 cluster.
---------
Actual results:
The StorageCluster reports a "Not enough nodes found" error. The ODF installation stalls, and no CephFS or RBD storage classes become available.
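A quick way to confirm the missing storage classes on an affected cluster (check added for illustration, not part of the original capture):
oc get storageclass | grep -E 'cephfs|rbd'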
Expected results:
No errors; ODF becomes available, the same as an ODF deployment on a regular AWS cluster.
---------
Additional info:
ODF installation screen recording - https://drive.google.com/file/d/1y84dNkaj68rov9nbJDAlhcnXwc3cJHs_/view?usp=drive_link
Storage System installation screen recording - https://drive.google.com/file/d/12KUnujZmTAAC1H0YqnhXsWjD2PtjRblW/view?usp=sharing
- external trackers
- links to:
  RHBA-2024:138027: Red Hat OpenShift Data Foundation 4.18 security, enhancement & bug fix update