- Bug
- Resolution: Done
- Critical
- None
- Documentation (Ref Guide, User Guide, etc.), Release Notes
- Documented as Resolved Issue
AMQ Broker 1836, AMQ Broker 1839
When testing failover in a scenario with 1 master and 2 slaves, the example scenario in which the master is killed first works correctly: the primary backup becomes the master and the secondary backup becomes the replication node.
If, however, the primary backup is killed first, the secondary backup remains stopped and does not announce itself as the replication slave. Instead, it continues to log:
13:31:44,373 WARN [org.apache.activemq.artemis.core.server] AMQ222040: Server is stopped
When the master is brought down, the secondary slave remains stopped.
The thread dumps of the secondary backup for this scenario (taken when the primary backup is killed) suggest that the secondary is stuck looping in NamedLiveNodeLocatorForReplication::locateNode(...):
"AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=null" #18 prio=5 os_prio=0 tid=0x00007f1920803800 nid=0x642b waiting on condition [0x00007f19028e8000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000c04b7170> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at org.apache.activemq.artemis.core.server.impl.NamedLiveNodeLocatorForReplication.locateNode(NamedLiveNodeLocatorForReplication.java:67)
    at org.apache.activemq.artemis.core.server.impl.NamedLiveNodeLocatorForReplication.locateNode(NamedLiveNodeLocatorForReplication.java:54)
    at org.apache.activemq.artemis.core.server.impl.SharedNothingBackupActivation.run(SharedNothingBackupActivation.java:195)
    at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$ActivationThread.run(ActiveMQServerImpl.java:2793)
   Locked ownable synchronizers:
    - None
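When the broker runs in the same JVM as a test (for example, an embedded broker), the same check can be made without taking a manual thread dump. The following is only a rough sketch using standard JDK calls: ParkedThreadFinder and findWaitingIn are hypothetical names invented for illustration and are not part of Artemis. The class and method names to search for are the ones shown in the dump above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Artemis: lists threads in the current JVM
// that are in the WAITING state and have a given class/method on their stack.
public class ParkedThreadFinder {

    public static List<String> findWaitingIn(String className, String methodName) {
        List<String> matches = new ArrayList<>();
        // Snapshot of the stack traces of all live threads in this JVM.
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (thread.getState() != Thread.State.WAITING) {
                continue; // only interested in parked/waiting threads
            }
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    matches.add(thread.getName());
                    break; // one matching frame is enough for this thread
                }
            }
        }
        return matches;
    }
}
```

Called as ParkedThreadFinder.findWaitingIn("org.apache.activemq.artemis.core.server.impl.NamedLiveNodeLocatorForReplication", "locateNode"), a non-empty result after the primary backup has been killed would be consistent with the hang described here. It only sees threads of the JVM it runs in; for a standalone broker, a regular thread dump is still needed.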
If multiple slaves are configured for a master, the nth slave should become the active slave when the slave(s) ahead of it are offline.
This is tracked upstream as https://issues.apache.org/jira/browse/ARTEMIS-2075.
- is cloned by
- ENTMQBR-1954 Create test for Standby slave does not announce replication to master when primary slave is down
- Closed
- is duplicated by
- ENTMQBR-1021 [HA, MS1S2] When backup slave1 is killed, slave2 can't take role of backup, leaving HA on master only
- Closed
- is related to
- ARTEMIS-1285