  1. JBoss A-MQ
  2. ENTMQ-975

ActiveMQ split-brain after SyncFailedException on NFS filesystem

Details

    • Bug
    • Resolution: Done
    • Major
    • JBoss A-MQ 6.2
    • JBoss A-MQ 6.1
    • broker

    Description

      In a master-slave A-MQ set-up with a shared filesystem, a split-brain (dual master) situation is observed after a network-level failure affecting the connection between the broker host and the storage.

      Looking at the attached master.log and slave.log, we can see that the master receives a SyncFailedException when trying to flush the KahaDB file. The master then tries to shut down, but seems unable to, owing to a series of I/O-related problems. However, it appears to have dropped its filesystem lock, because the slave reports that it has locked the file and is coming up. The master and slave logs show that both A-MQ instances are processing the same corrupt KahaDB file. Both are now "masters" in some sense.

      I surmise that the SyncFailedException is not handled properly here: because the filesystem connection is defective in some way at this point, exceptions are thrown whilst trying to close down, and the master remains a master even though the slave has taken over the master role.
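      The failure mode described above, where the master keeps running after its lock has gone, can be illustrated with a minimal sketch of a lock liveness check. This is not the broker's actual code: the class name LockKeepAliveSketch and the method stillMaster are hypothetical, and the sketch uses only the standard java.nio FileLock API. It assumes that a broker which can no longer prove it holds the lock should stop acting as master immediately rather than wait for a clean shutdown to complete. Note also that on NFS a lost lock may not be reflected by FileLock.isValid(), so a real check would likely need further evidence.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not taken from ActiveMQ or A-MQ sources.
public class LockKeepAliveSketch {

    /**
     * Returns true only if the lock object is still valid and the lock file
     * is still reachable. A periodic caller would treat false as "no longer
     * master" and halt the broker.
     */
    static boolean stillMaster(FileLock lock, Path lockFile) {
        return lock != null && lock.isValid() && Files.exists(lockFile);
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the shared-storage lock file.
        Path lockFile = Files.createTempFile("broker", ".lock");
        try (FileChannel channel = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            System.out.println("holding lock: " + stillMaster(lock, lockFile));

            // Simulate the lock being dropped while the process keeps running.
            lock.release();
            System.out.println("after release: " + stillMaster(lock, lockFile));
        } finally {
            Files.deleteIfExists(lockFile);
        }
    }
}
```

      In a broker, a check of this kind would run on a timer, and a false result would trigger an immediate halt instead of an orderly shutdown that depends on the failing storage.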

      Attachments

        1. master.log
          436 kB
        2. slave.log
          1.00 MB
        3. test-master.log
          11 kB
        4. test-slave.log
          4 kB

            People

              gtully@redhat.com Gary Tully
              rhn-support-kboone Kevin Boone
              Votes: 0
              Watchers: 6
