  1. AMQ Broker
  2. ENTMQBR-2007

HA Shared store with JDBC DB - slave after failback to master is stuck - unusable

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Critical
    • None
    • AMQ 7.2.1.GA
    • high-availability, jdbc
    • None
    • Workaround Exists
    • Workaround: Set restart-broker to true on the slave node.
    • Steps to Reproduce:
      manualtest.steps = [
          step('Make sure HA is running properly.', 'HA is running properly.'),
          step('Subscribe receiver to master broker on queue.',
               'Receiver is ready to receive messages from master broker.'),
          step('Send 100 messages to queue on master.', 'Some of the messages are sent to master broker.'),
          step('Wait a few seconds and kill master broker while sender is sending messages.',
               'Master broker is killed, client reconnect/failover to slave works.'),
          step('Sender and receiver finish successfully.',
               'All messages are sent to & received from master and slave broker.'),
          step('Compare sent and received messages.', 'All sent and received messages are the same.'),
          step('Start master broker again and make sure it is live.', 'Master broker is started and is live.'),
      ]
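
The "compare sent and received messages" step can be sketched as below. This is only a minimal illustration, assuming each message is identified by a hashable value such as an ID or body; `compare_messages` is a hypothetical helper and not part of the test framework used in the steps.

```python
from collections import Counter


def compare_messages(sent, received):
    """Return (lost, unexpected) as Counters keyed by message identifier.

    Hypothetical helper: 'sent' and 'received' are iterables of hashable
    message identifiers collected by the sender and receiver clients.
    """
    sent_counts = Counter(sent)
    received_counts = Counter(received)
    # Counter subtraction keeps only positive counts.
    lost = sent_counts - received_counts        # sent but never received
    unexpected = received_counts - sent_counts  # duplicates or foreign messages
    return lost, unexpected
```

Both counters being empty would correspond to the expected result "all sent and received messages are the same", including across the failover from master to slave.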
      

      When running the HA Shared Store test described in this issue, the slave node becomes active after the master is killed and then goes down once its master becomes live again. After that failback the slave is no longer operable: its process is up and hawtio is supposedly up as well, but the user cannot access it.

      The slave also fails to acquire the backup node lock: the "AMQ221033: ** got backup lock" message is missing.

      This seems to happen only when restart-broker is left at its default value of false.
      Once I set it to true, the slave (which probably restarted itself after failback) was accessible and worked fine.

      I had help from fnigro in debugging this issue.
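
One way to confirm this symptom is to scan the slave's log output for the AMQ221033 code. The sketch below assumes the log is available as plain text; the function name is hypothetical, and nothing about the log format is assumed beyond the message code quoted in this report.

```python
# Message code quoted in this report; the rest of the log format is not assumed.
BACKUP_LOCK_CODE = "AMQ221033"


def backup_lock_lines(log_text: str) -> list[int]:
    """Return the 1-based line numbers on which the backup-lock code appears."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if BACKUP_LOCK_CODE in line
    ]
```

If the code appears only at the slave's initial startup and never again after the master comes back, that would match the stuck state described here.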

              rh-ee-ataylor Andy Taylor
              mtoth@redhat.com Michal Toth