Red Hat Internal Developer Platform
RHIDP-5342

[Helm] Cannot run 2 RHDH replicas on different nodes due to Multi-Attach errors on the dynamic plugins root PVC


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Affects Version/s: 1.3, 1.3.1, 1.3.2, 1.3.3, 1.4
    • Component/s: Helm Chart
    • Release Note Text:
      If you are deploying {product-short} using the Helm Chart, it is currently impossible to have 2 replicas running on different cluster nodes. This might also affect the upgrade from 1.3 to 1.4.0 if the new pod is scheduled on a different node.

      A possible workaround for the upgrade is to manually scale the replicas down to 0 before upgrading your Helm release, or to manually delete the old {product-short} pod after upgrading. However, this implies some application downtime.
      You can also leverage a Pod Affinity rule to force the cluster scheduler to run your {product-short} pods on the same node.
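
      As a sketch, such a Pod Affinity rule could be set via the Helm values, assuming the chart passes `upstream.backstage.affinity` through to the Deployment the same way it does for `topologySpreadConstraints` (`my-backstage-1` is the release name used in this report):

      # my-values-pod-affinity.yaml (sketch, key pass-through assumed)
      upstream:
        backstage:
          replicas: 2
          affinity:
            podAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - topologyKey: kubernetes.io/hostname
                  labelSelector:
                    matchLabels:
                      app.kubernetes.io/instance: my-backstage-1

      This forces all pods of the release onto one node, which avoids the Multi-Attach error at the cost of node-level high availability.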
    • Release Note Type: Known Issue
    • Release Note Status: Done

      Description of problem:

      It doesn't seem possible to have 2 replicas of a Helm-based RHDH instance running on different nodes, as one would expect for typical HA deployments.

      Prerequisites (if any, like setup, operators/versions):

      • Helm Chart 1.3.0 and 1.4.0
      • Tested on a ROSA 4.17 cluster with at least 2 nodes

      Steps to Reproduce

      1. Check that you have at least 2 nodes available in the cluster, e.g.:
      $ oc get nodes                                  
      NAME                        STATUS   ROLES    AGE   VERSION
      ip-10-0-1-11.ec2.internal   Ready    worker   72m   v1.30.6
      ip-10-0-1-71.ec2.internal   Ready    worker   72m   v1.30.6 
      2. Create a values file with a topology spread constraint that enforces the scheduling of those replicas on different nodes:
      # my-values-topology-spread-constraints.yaml
      upstream:
        backstage:
          replicas: 2
          topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: kubernetes.io/hostname
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels:
                  app.kubernetes.io/instance: my-backstage-1
      3. Deploy RHDH using Helm, providing the values file above, e.g.:
      $ git clone https://github.com/redhat-developer/rhdh-chart.git && cd rhdh-chart
      $ helm upgrade --install my-backstage-1 \
          charts/backstage \
          --set global.clusterRouterBase=`oc get ingress.config.openshift.io/cluster '-o=jsonpath={.spec.domain}'` \
          --values my-values-topology-spread-constraints.yaml

      Actual results:

      Only 1 replica will be running. The second one will get stuck on a Multi-Attach Error:

      $ oc get deploy my-backstage-1                                                                                   
      NAME             READY   UP-TO-DATE   AVAILABLE   AGE
      my-backstage-1   1/2     2            1           23m 

       

      Topology Spread Constraints:  kubernetes.io/hostname:DoNotSchedule when max skew 1 is exceeded for selector app.kubernetes.io/instance=my-backstage-1
      Events:
        Type     Reason              Age    From                     Message
        ----     ------              ----   ----                     -------
        Warning  FailedScheduling    8m26s  default-scheduler        0/2 nodes are available: 1 node(s) didn't find available persistent volumes to bind, 1 node(s) didn't match pod topology spread constraints. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling.
        Warning  FailedScheduling    8m25s  default-scheduler        0/2 nodes are available: 1 node(s) didn't find available persistent volumes to bind, 1 node(s) didn't match pod topology spread constraints. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling.
        Normal   Scheduled           8m23s  default-scheduler        Successfully assigned my-ns/my-backstage-1-744b9f4bb-8t8h5 to ip-10-0-1-71.ec2.internal
        Warning  FailedAttachVolume  8m19s  attachdetach-controller  Multi-Attach error for volume "pvc-319f0c25-2f15-4320-83db-5f55e7a2c2fb" Volume is already used by pod(s) my-backstage-1-744b9f4bb-bzfdk

      The list of PVCs, to confirm that this is about the dynamic plugins root PVC:

      $ oc get pvc                  
      NAME                                        STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
      data-my-backstage-1-postgresql-0            Bound         pvc-aecafeaf-cc58-45bb-a421-19f5624a4e0a   1Gi        RWO            gp3-csi        <unset>                 30m
      my-backstage-1-dynamic-plugins-root         Bound         pvc-319f0c25-2f15-4320-83db-5f55e7a2c2fb   5Gi        RWO            gp3-csi        <unset>                 30m
       

      Expected results:

      Both replicas should be running.

      Reproducibility (Always/Intermittent/Only Once):

      Always.

      A user reported a similar issue when upgrading a 1.3 Helm release (see RHDHBUGS-135). We also noticed something similar when upgrading a Helm-based instance from 1.3 to 1.4 (see https://redhat-internal.slack.com/archives/C04CUSD4JSG/p1734467111777559 ).

      This might happen if the cluster scheduler assigns the new pod to a different node.
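
      For reference, the scale-down workaround could look like this (a sketch using the release name and chart path from the steps above; the Helm upgrade restores the replica count from the values file, and downtime is expected while the deployment is at 0 replicas):

      $ oc scale deployment/my-backstage-1 --replicas=0
      $ helm upgrade my-backstage-1 charts/backstage \
          --values my-values-topology-spread-constraints.yaml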

      Build Details:

      Helm 1.3 and 1.4

      Additional info (Such as Logs, Screenshots, etc):

      This seems to be caused by mounting the dynamic plugins root PVC as RWO by default (RHIDP-3572).
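
      For context, an RWO volume can only be attached to a single node at a time, so the second replica cannot start on another node while the first one holds the volume. Multi-node attachment would require the PVC to be ReadWriteMany, which in turn needs an RWX-capable storage class (the EBS-backed gp3-csi class seen above does not support RWX). A sketch of what the claim would need to look like:

      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: my-backstage-1-dynamic-plugins-root
      spec:
        accessModes:
          - ReadWriteMany   # instead of ReadWriteOnce
        resources:
          requests:
            storage: 5Gi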

      The Operator is not affected because it does not create a dynamic plugins root PVC out of the box.

              Assignee: Unassigned
              Reporter: rh-ee-asoro Armel Soro
              RHIDP - Install