FlightPath / FLPATH-2793

Helm chart storage class auto-detection selects incorrect storage class for OpenShift database workloads


      Description

      The ROS Helm chart's storage class auto-detection logic incorrectly selects the cluster's default storage class instead of the platform-appropriate storage class for database workloads on OpenShift. This results in database pods using CephFS (ocs-storagecluster-cephfs) instead of RBD block storage (ocs-storagecluster-ceph-rbd), which can cause deployment failures and is not optimal for database performance.

      Issue Details

      The Helm chart template helper ros-ocp.storageClass in ros-ocp/templates/_helpers.tpl prioritizes the cluster's default storage class over the platform-specific default. When an OpenShift cluster has ocs-storagecluster-cephfs set as the cluster default, the chart incorrectly uses this shared filesystem storage for database StatefulSets instead of the more appropriate ocs-storagecluster-ceph-rbd block storage.

      Additionally, the template includes an inline warning comment that breaks YAML parsing when a user-specified storage class is not found, causing helm install/upgrade to fail with:

      error converting YAML to JSON: yaml: line 89: could not find expected ':'
      

      Steps to Reproduce

      1. Deploy on an OpenShift cluster with OCS/ODF installed where:

      • ocs-storagecluster-cephfs is marked as the cluster default storage class
      • ocs-storagecluster-ceph-rbd exists but is not the default

      2. Run helm install without explicitly setting global.storageClass:

         export JWT_AUTH_ENABLED=true
         ./install-helm-chart.sh
         

      3. Observe that database StatefulSets attempt to use ocs-storagecluster-cephfs instead of ocs-storagecluster-ceph-rbd
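To confirm the precondition in step 1, it helps to check which storage class the cluster marks as default. The sketch below is an illustration, not part of the chart or the install script: default_storage_class is a hypothetical helper name, and it assumes the usual oc/kubectl table output in which the default class is printed with a "(default)" marker right after its name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every storage class marked
# "(default)" in `oc get storageclass` table output read from stdin.
default_storage_class() {
  # Assumption: the marker is printed as a separate word after the name,
  # so it shows up as the second whitespace-separated field.
  # NR > 1 skips the header row.
  awk 'NR > 1 && $2 == "(default)" { print $1 }'
}

# Usage against a live cluster (not run here):
#   oc get storageclass | default_storage_class
```

On a cluster that reproduces this issue, the helper should print ocs-storagecluster-cephfs.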

      Expected Behavior

      The Helm chart should:
      1. Prioritize platform-specific storage class defaults over cluster defaults
      2. For OpenShift with OCS/ODF: Use ocs-storagecluster-ceph-rbd for database workloads
      3. Fail gracefully with clear error messages if the required storage class doesn't exist
      4. Never output inline warnings that break YAML syntax
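The selection order described above can be sketched in shell to make the intended priority explicit. This is only an illustration of the expected behavior, not the chart's actual template logic: select_db_storage_class and its argument order are hypothetical, and the list of available classes is passed in rather than looked up from the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the expected selection order.
# Usage: select_db_storage_class EXPLICIT PLATFORM_DEFAULT CLUSTER_DEFAULT AVAILABLE...
# Pass "" for EXPLICIT when the user did not set a storage class.
select_db_storage_class() {
  local explicit="$1" platform_default="$2" cluster_default="$3"
  shift 3
  local available=" $* "   # space-padded so names match as whole words

  # 1. An explicit user choice wins, but it must exist; otherwise fail
  #    with a clear message instead of silently substituting a class.
  if [ -n "$explicit" ]; then
    case "$available" in
      *" $explicit "*) printf '%s\n' "$explicit"; return 0 ;;
    esac
    echo "error: storage class '$explicit' not found in cluster" >&2
    return 1
  fi

  # 2. The platform-specific default is preferred when it exists.
  if [ -n "$platform_default" ]; then
    case "$available" in
      *" $platform_default "*) printf '%s\n' "$platform_default"; return 0 ;;
    esac
  fi

  # 3. Only then fall back to the cluster default.
  if [ -n "$cluster_default" ]; then
    printf '%s\n' "$cluster_default"
    return 0
  fi

  echo "error: no usable storage class found" >&2
  return 1
}
```

With no explicit class, ocs-storagecluster-ceph-rbd as the platform default, and ocs-storagecluster-cephfs as the cluster default, the function returns the RBD class whenever it appears in the available list.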

      Actual Behavior

      The chart:
      1. Uses cluster default storage class (ocs-storagecluster-cephfs) even when platform-specific default exists
      2. Outputs inline warning comments that break YAML parsing
      3. Results in suboptimal storage selection for database workloads

      Error Messages

      YAML parse error when user-specified storage class not found:

      Error: YAML parse error on ros-ocp/templates/statefulset-db-kruize.yaml: line 89: could not find expected ':'
      

      This occurs because the template outputs:

      storageClassName: # Warning: Storage class 'xxx' not found, using default 'yyy' instead
      yyy
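A quick way to spot this pattern is to scan rendered manifest text for a storageClassName key whose value starts with a comment. The helper below is a hypothetical sketch: it assumes the rendered output is available as text on stdin (Helm's error message suggests the --debug flag for printing invalid YAML) and it only catches the specific pattern shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (with line numbers) every line in which
# storageClassName is followed directly by a '#' comment rather than a value.
# Exits 0 if at least one such line is found, 1 if none.
find_inline_storage_class_comments() {
  grep -nE '^[[:space:]]*storageClassName:[[:space:]]*#'
}

# Usage (not run here; assumes rendered manifest text is piped in):
#   helm template ros-ocp ./ros-ocp --debug 2>&1 | find_inline_storage_class_comments
```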
      

      Fix Options

      1. Reorder storage class selection logic: Check if platform-specific default exists in the cluster before falling back to cluster default
      2. Remove inline warnings: Use the template fail function instead of inline comments, so a missing storage class stops rendering with a clear error rather than producing invalid YAML
      3. Update documentation: Clarify that database workloads require block storage (RBD) not shared filesystem storage (CephFS)
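For the block versus shared filesystem distinction in option 3, one possible pre-flight check is to classify a storage class by its provisioner. The sketch below is an assumption-heavy illustration: storage_kind_for_provisioner is a hypothetical helper, and the provisioner suffixes (rbd.csi.ceph.com and cephfs.csi.ceph.com) are assumed from typical OCS/ODF installations and should be verified on the target cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a storage class provisioner string.
# Assumption: Ceph CSI provisioners end in ".rbd.csi.ceph.com" for RBD
# block storage and ".cephfs.csi.ceph.com" for CephFS shared filesystems.
storage_kind_for_provisioner() {
  case "$1" in
    *.rbd.csi.ceph.com)    echo block ;;
    *.cephfs.csi.ceph.com) echo filesystem ;;
    *)                     echo unknown ;;
  esac
}

# Usage against a live cluster (not run here):
#   prov="$(oc get storageclass ocs-storagecluster-ceph-rbd -o jsonpath='{.provisioner}')"
#   [ "$(storage_kind_for_provisioner "$prov")" = block ] \
#     || echo "warning: storage class is not backed by block storage" >&2
```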

      Version Information

      • Helm Chart: ros-ocp version 0.1.6
      • Git Commit: 3235f9b2b74b7097097186e7be586c796927a67f
      • Git Branch: fix/storage-class-name
      • Affected Files:
        • ros-ocp/templates/_helpers.tpl (lines 287-324)
        • All database StatefulSet templates that use ros-ocp.databaseStorageClass helper
      • Affected Version: IOP-POC-0.1
      • Platform: OpenShift with OCS/ODF storage

      Additional Context

      The deployment scripts (scripts/deploy-strimzi.sh) correctly hardcode ocs-storagecluster-ceph-rbd for OpenShift, but the Helm chart's auto-detection logic does not follow the same pattern. This creates an inconsistency where Kafka uses the correct storage class but the application databases do not.

      Workaround

      Until this issue is fixed, bypass the auto-detection logic by using helm install directly with explicit storage class configuration:

      Basic workaround (without JWT authentication):

      helm install ros-ocp ./ros-ocp \
        --namespace chadtest \
        --set global.storageClass=ocs-storagecluster-ceph-rbd \
        --wait
      

      Workaround with JWT authentication enabled:

      helm upgrade --install ros-ocp ./ros-ocp \
        --namespace chadtest \
        --set global.storageClass=ocs-storagecluster-ceph-rbd \
        --set ingress.auth.enabled=true \
        --set ingress.upload.requireAuth=true \
        --wait
      

      Notes:

      • This bypasses the install-helm-chart.sh script entirely, avoiding the buggy auto-detection
      • The global.storageClass setting ensures all database StatefulSets use the correct RBD block storage
      • If needed, individual database storage classes can be overridden:
        • --set database.ros.storage.class=ocs-storagecluster-ceph-rbd
        • --set database.kruize.storage.class=ocs-storagecluster-ceph-rbd
        • --set database.sources.storage.class=ocs-storagecluster-ceph-rbd
      • JWT authentication requires Keycloak/RHSSO to be deployed first (./scripts/deploy-rhsso.sh)
      • Ensure Kafka/Strimzi are deployed before installing ROS (./scripts/deploy-strimzi.sh)
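After applying the workaround, it may be worth confirming that the resulting PVCs actually use the RBD class. The helper below is a hypothetical sketch: pvcs_not_using is not part of the repository, and it expects "name class" pairs on stdin, as produced by the custom-columns query in the usage comment. It checks every PVC in the namespace, not only the database ones.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "PVC_NAME STORAGE_CLASS" pairs from stdin and
# report every PVC whose class differs from the expected one.
# Exits 1 if any mismatch is found, 0 otherwise.
pvcs_not_using() {
  local expected="$1"
  awk -v want="$expected" '
    NF >= 2 && $2 != want { print $1 " uses " $2; bad = 1 }
    END { exit bad }
  '
}

# Usage against a live cluster (not run here):
#   oc get pvc -n chadtest --no-headers \
#     -o custom-columns=NAME:.metadata.name,CLASS:.spec.storageClassName \
#     | pvcs_not_using ocs-storagecluster-ceph-rbd
```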

      Assignee: Unassigned
      Reporter: Chad Crum (chadcrum)