  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-17410

Define resource requests and limits on Galera pods to avoid an outage when the cluster scales


    • Bug
    • Resolution: Unresolved
    • Critical
    • rhos-18.0.12
    • rhos-18.0 FR 2 (Mar 2025)
    • mariadb-operator
    • None
    • 3
    • False
    • False
    • ?
    • mariadb-operator-container-1.0.14-1
    • None
    • Sprint 2, Sprint 3, Sprint 4, Sprint 5
    • 4
    • Critical

      To Reproduce

      Steps to reproduce the behavior:

      1. Start with a correctly functioning RHOSO test cluster.
      2. Unintentionally run a faulty test that scales a deployment to 300 replicas.
      3. System resources are exhausted.
      4. The Galera cluster becomes non-functional.

      Expected behavior

      • The Galera cluster should keep working, protected by its resource reservations.
      • Spawning 300 pod replicas should fail, at least partially, if the cluster's remaining resources do not allow it. This scale should not impact the
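
The expected behavior above can be illustrated with a rough capacity calculation. This is a minimal sketch, not the Kubernetes scheduler: it pools all capacity into one bucket (the real scheduler places pods node by node), and every number, as well as the names `Resources` and `schedulable_replicas`, is hypothetical and not taken from the affected environment. It only shows that once Galera's requests are reserved, the remaining replicas are placed up to the free capacity and the rest stay pending.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu_millicores: int
    memory_mib: int


def schedulable_replicas(free: Resources, per_pod: Resources, wanted: int) -> int:
    """Return how many of `wanted` replicas fit into `free`, judged by requests only."""
    if per_pod.cpu_millicores <= 0 or per_pod.memory_mib <= 0:
        raise ValueError("per-pod requests must be positive")
    by_cpu = free.cpu_millicores // per_pod.cpu_millicores
    by_memory = free.memory_mib // per_pod.memory_mib
    return max(0, min(wanted, by_cpu, by_memory))


# Hypothetical cluster capacity and Galera reservation (3 pods).
allocatable = Resources(cpu_millicores=64_000, memory_mib=262_144)
galera_reserved = Resources(cpu_millicores=3 * 2_000, memory_mib=3 * 8_192)

# Capacity left for other workloads once Galera's requests are set aside.
free = Resources(
    allocatable.cpu_millicores - galera_reserved.cpu_millicores,
    allocatable.memory_mib - galera_reserved.memory_mib,
)

# Hypothetical requests of each replica in the runaway deployment.
per_replica = Resources(cpu_millicores=500, memory_mib=1_024)

placed = schedulable_replicas(free, per_replica, wanted=300)
pending = 300 - placed
print(f"placed={placed} pending={pending}")
```

With these assumed numbers CPU is the limiting resource, so only part of the 300 replicas is placed and the Galera reservation is left untouched.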

      Device Info (please complete the following information):

      • RHOSO FR2 deployment in a DCN/DZ environment with 3 AZs, and therefore 3 cells (in addition to cell0)

      Bug impact

      • The Galera cluster became non-functional.
      • The Galera operator is unable to recover or rebuild the cluster after a full outage.
      • The customer is seriously concerned about whether they can use RHOSO in their production environment.
         

      Known workaround

      • This was not tested, but may be worth exploring: set resource requests and limits (RAM, CPU, ...) on the Galera pods through infra-operator. This would guarantee resource reservation for these pods and prevent a Galera outage when the cluster scales.
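
As a sketch of what such a reservation could look like, the snippet below builds a standard Kubernetes container `resources` stanza with requests equal to limits, which is the condition for the Guaranteed QoS class. The values `2` CPU and `8Gi`, and the helper names `container_resources` and `qos_class`, are hypothetical. Whether and where the Galera custom resource accepts such a stanza is an assumption this report does not confirm. The classifier is simplified to a single container and compares quantities as plain strings, whereas Kubernetes compares parsed quantities.

```python
import json


def container_resources(cpu: str, memory: str) -> dict:
    """Build a resources stanza whose requests equal its limits."""
    return {
        "requests": {"cpu": cpu, "memory": memory},
        "limits": {"cpu": cpu, "memory": memory},
    }


def qos_class(resources: dict) -> str:
    """Classify a single-container pod by the Kubernetes QoS rules (simplified)."""
    requests = resources.get("requests") or {}
    limits = resources.get("limits") or {}
    if not requests and not limits:
        return "BestEffort"
    for name in ("cpu", "memory"):
        limit = limits.get(name)
        # Kubernetes defaults a missing request to the limit when a limit is set.
        request = requests.get(name, limit)
        if limit is None or request != limit:
            return "Burstable"
    return "Guaranteed"


# Hypothetical sizing for one Galera container.
galera = container_resources(cpu="2", memory="8Gi")
print(json.dumps({"resources": galera}, indent=2))
print(qos_class(galera))
```

Pods in the Guaranteed class are the last candidates for eviction under node resource pressure, which is the property the workaround is after.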


              rhn-engineering-dciabrin Damien Ciabrini
              smsallem@redhat.com Soumaya Msallem
              rhos-dfg-pidone