AMQ Broker / ENTMQBR-3426

Master failback using replication should request a quorum vote before starting as live?

    • Type: Bug
    • Resolution: Obsolete
    • Priority: Major
    • None
    • 7.5.0.CR3
    • None

      Currently, a master with HA replication and check-for-live-server enabled checks whether another node in the cluster has the same node ID by:

      • configuring an HA cluster connection from the configured cluster connector list, with no reconnection attempts
      • registering for the topology on that connection
      • if any member of the topology with the same node ID is found, trying to create a ClientSession to it
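
      The decision described by the steps above can be modelled as a small standalone sketch. It is not the broker's code: LiveServerCheck, Member and anotherLiveExists are hypothetical names, the topology is passed in as plain data, and the ClientSession attempt is replaced by a caller-supplied predicate.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical model of the failback check; none of these names are broker APIs.
class LiveServerCheck {

    // A topology member as seen through the cluster connection (assumed shape).
    static final class Member {
        final String nodeId;
        final String connector;

        Member(String nodeId, String connector) {
            this.nodeId = nodeId;
            this.connector = connector;
        }
    }

    /**
     * Returns true if another live server with our node ID appears to exist.
     *
     * @param myNodeId       node ID of the master that is restarting
     * @param topology       topology received from the contacted node,
     *                       or Optional.empty() if no node could be reached
     * @param canOpenSession stand-in for "try to create a ClientSession"
     */
    static boolean anotherLiveExists(String myNodeId,
                                     Optional<List<Member>> topology,
                                     Predicate<Member> canOpenSession) {
        if (!topology.isPresent()) {
            // No reconnection attempts: an unreachable node means no topology,
            // so the master concludes that it is the only live.
            return false;
        }
        for (Member member : topology.get()) {
            // Only a member with the same node ID that also accepts a session
            // counts as an existing live server.
            if (member.nodeId.equals(myNodeId) && canOpenSession.test(member)) {
                return true;
            }
        }
        return false;
    }
}
```

      In this model the result depends entirely on the single topology that was received: a missing or incomplete topology yields false, and the master would go on to start as live.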

      Split brain could be experienced in different cases:

      • if the node we connect to in order to receive the topology does not have a correct topology (e.g. a slave still becoming live after a previous crash of the master)
      • if the node we connect to in order to receive the topology is not reachable: the master will assume it is the only live

      A possible improvement could be to not rely on the cluster connection topology as the source of truth, but to perform a(nother) check that requests a quorum vote to know whether it is correct to become live.
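
      One way such a vote could be evaluated is sketched below. This is only an illustration of the idea, not an existing broker feature: FailbackQuorumVote, Vote and mayStartAsLive are hypothetical names, and the rule used (a strict majority of all voting peers must report that they see no live with this node ID) is an assumption rather than something the issue specifies.

```java
import java.util.List;

// Hypothetical sketch of a failback quorum decision; not a broker API.
class FailbackQuorumVote {

    // Answer of one peer to "do you see a live server with my node ID?"
    enum Vote { NO_LIVE_SEEN, LIVE_SEEN }

    /**
     * Decides whether the restarting master may start as live.
     *
     * @param votingPeers   number of peers entitled to vote (assumed known
     *                      from configuration, not from the topology)
     * @param receivedVotes votes that actually arrived; unreachable peers
     *                      simply contribute nothing
     */
    static boolean mayStartAsLive(int votingPeers, List<Vote> receivedVotes) {
        if (votingPeers <= 0) {
            throw new IllegalArgumentException("votingPeers must be positive");
        }
        long approvals = receivedVotes.stream()
                .filter(vote -> vote == Vote.NO_LIVE_SEEN)
                .count();
        // Strict majority over ALL voting peers, so missing answers count
        // against starting as live instead of being ignored.
        return approvals * 2 > votingPeers;
    }
}
```

      With this rule a master that cannot reach any peer receives no votes and does not start as live, which is the opposite of the current behaviour in the unreachable-node case. Whether a single LIVE_SEEN answer should veto the start outright is a design choice this sketch leaves open.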

              Assignee: Unassigned
              Reporter: Francesco Nigro (fnigro, Inactive)
              Votes: 0
              Watchers: 2