RHEL-5378

Node stuck in 'Joining' state after network degradation

    • Bug
    • Resolution: Not a Bug
    • rhel-8.5.0
    • galera
    • rhel-sst-cs-databases

      Description of problem:
      The internet connection of one of our Galera nodes was severely degraded for around 30 minutes. The high packet loss (~90%) caused the node to lose synchronization with the rest of the cluster. After the connection was restored, the node remained in the 'Joining: receiving State Transfer' state instead of catching up and resynchronizing.
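
A minimal Python sketch of how this condition could be detected from periodic samples of the wsrep_local_state_comment status variable. The function name, the sample format, and the 15-minute threshold are assumptions made for illustration; they are not part of the original report or of Galera itself.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: choose a value longer than a normal state
# transfer takes in your environment.
STUCK_AFTER = timedelta(minutes=15)


def is_stuck_joining(samples, stuck_after=STUCK_AFTER):
    """Return True if the node has been continuously in a 'Joining' state
    for at least `stuck_after`.

    `samples` is a chronologically ordered iterable of
    (timestamp, wsrep_local_state_comment) pairs, e.g. collected by
    polling SHOW STATUS LIKE 'wsrep_local_state_comment'.
    """
    joining_since = None  # start of the current uninterrupted Joining run
    last_seen = None      # timestamp of the most recent sample

    for ts, state in samples:
        if state.startswith("Joining"):
            if joining_since is None:
                joining_since = ts
        else:
            # Any other state (Synced, Joined, ...) resets the run.
            joining_since = None
        last_seen = ts

    if joining_since is None:
        return False
    return last_seen - joining_since >= stuck_after
```

A monitoring job could call this on recent samples and raise an alert rather than restarting the node automatically, since a long state transfer on a large dataset may be legitimate.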

      Version-Release number of selected component (if applicable):
      galera.x86_64 25.3.32-1.module+el8.3.0+10472+7adc332a

      How reproducible:
      Unknown.

      Actual results:
      I had to 'kill -9' the mysqld process because stopping the daemon gracefully did not work. After that, I could start the mariadb service normally, and the node synced with the cluster again.
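
The manual recovery described above can be sketched in Python as follows. This is an illustration only: the function name, the 60-second stop timeout, and the use of pkill to find the process are assumptions, while the unit name mariadb and the process name mysqld are taken from the report. The `run` parameter exists only so the command runner can be replaced in tests.

```python
import subprocess


def recover_node(run=subprocess.run, stop_timeout=60):
    """Try a graceful stop first, fall back to SIGKILL, then start the
    service again. Returns True if the forced kill was needed.

    stop_timeout is a hypothetical value, in seconds.
    """
    try:
        # Graceful stop; in the reported case this did not work.
        run(["systemctl", "stop", "mariadb"], check=True, timeout=stop_timeout)
        forced = False
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
        # Equivalent of 'kill -9' on the mysqld process (exact name match).
        run(["pkill", "-9", "-x", "mysqld"], check=False)
        forced = True

    # Start the service; the node should then rejoin the cluster.
    run(["systemctl", "start", "mariadb"], check=True)
    return forced
```

Calling `recover_node()` as root on the affected node would run the same sequence as the manual steps. Because SIGKILL skips a clean shutdown, it is kept as the fallback rather than the first step.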

      Expected results:
      The Galera node should successfully resync once connectivity has been restored. No manual intervention should be required.

      Additional info:
      Upstream bug report: https://jira.mariadb.org/browse/MDEV-21002

              Michal Schorm (mschorm@redhat.com)
              Imre Jonk (cmimre, inactive)
              bot rhel-cs-apps-subsystem-qe