Type: Bug
Resolution: Done
Priority: Blocker
Fix Version/s: JBoss A-MQ 6.1
Affects Version/s: JBoss A-MQ 6.0
Component/s: None
Labels: None
Environment: Tested in jboss-a-mq-6.0.0.redhat-024


When a ZooKeeper session expires and the container creates a new session (as in the log below), the container is listed as "active" in container-list, but cluster-list does not list the broker running within that container.

2013-06-22 00:23:27,246 | INFO | .40.26.207:2181) | ClientCnxn | .zookeeper.ClientCnxn$SendThread 1049 | 58 - org.fusesource.fabric.fabric-linkedin-zookeeper - 7.2.0.redhat-024 | Unable to reconnect to ZooKeeper service, session 0x23f655f62400001 has expired, closing socket connection

7.2.0.redhat-024 | Session establishment complete on server <myip_address>:2181, sessionid = 0x23f655f6240001c, negotiated timeout = 30000

I am assuming this is because the ephemeral node for the broker cluster is not recreated when the ZooKeeper session is restarted after expiry. I think this behavior is problematic:

1. Potential loss of slave instances from the cluster group
- ZooKeeper session expires on the slave instance
- the ephemeral znode is removed, as it is associated with that session
- a new ZooKeeper session is created, but the ephemeral node is not recreated in the cluster
- the instance will not be "discovered" as part of the mq-discovery mechanism, as no node is registered in ZooKeeper

2. Potentially two active brokers in the cluster group (two masters)
- ZooKeeper session expires on the master instance
- the ephemeral znode is removed, as it is associated with that session
- a new ZooKeeper session is created, but the ephemeral node is not recreated in the cluster
- the slave broker is promoted to master
- the original master broker is still running (but is not listed in the group cluster)

HOW TO REPLICATE
=============
(Scenario 1)
-----------------
Issue the following karaf/fabric commands:

1. fabric:create
2. fabric:mq-create --group mq_g50 --create-container child_1,child_2 my_mq_profile
3. Assuming child_1 is the master, pause container child_2 for >30 seconds (using "kill -17 PID" to pause and "kill -19 PID" to resume)
4. container-list - will show the child_2 container as active again (as expected)
5. cluster-list - will show no reference to the child_2 broker
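
The pause step above can also be scripted. The sketch below is only an illustration and is not part of the original report: it suspends a process by signal name rather than by number, because signal numbers differ between platforms (17 and 19 are SIGSTOP and SIGCONT on BSD/macOS, whereas Linux x86 uses 19 and 18). The function name pause_process is hypothetical, and the PID of the child container's JVM has to be looked up separately.

```python
import os
import signal
import time


def pause_process(pid: int, seconds: float) -> None:
    """Suspend `pid` with SIGSTOP, wait `seconds`, then resume it with SIGCONT."""
    os.kill(pid, signal.SIGSTOP)
    try:
        # Stay paused for longer than the negotiated ZooKeeper session
        # timeout (30000 ms in the log above) so that the session expires.
        time.sleep(seconds)
    finally:
        # Always resume the process, even if the wait is interrupted.
        os.kill(pid, signal.SIGCONT)


# Example (hypothetical PID): pause_process(12345, 35)
```

Any pause longer than the negotiated timeout should be enough to trigger the expiry; 35 seconds is just a margin above the 30 seconds mentioned in the steps.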

(Scenario 2)
--------------------
Setup is the same as scenario 1, but:
1. Ensure the brokers do not share the same KahaDB master/slave lock.
2. Pause the master container rather than the slave.
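
The suspected cause described above can be illustrated with a small in-memory model. This is a toy sketch under stated assumptions, not Fabric or ZooKeeper code: ToyZooKeeper, Broker, the reregister flag, and the path /toy/mq_g50 are all hypothetical names. The only behavior borrowed from ZooKeeper is that ephemeral nodes are deleted when the session that owns them expires. Re-registering after reconnect is shown as one possible remedy, not necessarily the fix that was shipped.

```python
from typing import Dict, List


class ToyZooKeeper:
    """In-memory stand-in: ephemeral nodes are removed with their session."""

    def __init__(self) -> None:
        self._next_session = 1
        self._ephemerals: Dict[str, int] = {}  # path -> owning session id

    def open_session(self) -> int:
        sid = self._next_session
        self._next_session += 1
        return sid

    def create_ephemeral(self, session: int, path: str) -> None:
        self._ephemerals[path] = session

    def expire(self, session: int) -> None:
        # Server side: drop every ephemeral node owned by the expired session.
        self._ephemerals = {
            p: s for p, s in self._ephemerals.items() if s != session
        }

    def children(self, parent: str) -> List[str]:
        return sorted(p for p in self._ephemerals if p.startswith(parent + "/"))


class Broker:
    """Registers itself in a group through an ephemeral node."""

    def __init__(self, zk: ToyZooKeeper, group: str, name: str,
                 reregister: bool) -> None:
        self.zk = zk
        self.path = f"{group}/{name}"
        self.reregister = reregister
        self.session = zk.open_session()
        zk.create_ephemeral(self.session, self.path)

    def reconnect(self) -> None:
        # A new session is always created after expiry ...
        self.session = self.zk.open_session()
        # ... but the group node only comes back if it is created again.
        if self.reregister:
            self.zk.create_ephemeral(self.session, self.path)


def demo(reregister: bool) -> List[str]:
    zk = ToyZooKeeper()
    group = "/toy/mq_g50"  # hypothetical registry path
    Broker(zk, group, "child_1", reregister)
    child_2 = Broker(zk, group, "child_2", reregister)
    zk.expire(child_2.session)  # child_2 was paused past the timeout
    child_2.reconnect()         # child_2 resumes with a new session
    return zk.children(group)


print(demo(reregister=False))  # ['/toy/mq_g50/child_1']
print(demo(reregister=True))   # ['/toy/mq_g50/child_1', '/toy/mq_g50/child_2']
```

With reregister=False the model reproduces the reported symptom: child_2 holds a live session but is missing from the group listing. If child_2 had been the master, the remaining member would see no master node and could promote itself while child_2 keeps running, which is the two-masters case in scenario 2.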

Is related to: ENTMQ-408 Need a solution for ENTMQ-382 on JBoss Fuse 6.0 (Closed)

Assignee: Dejan Bosanac
Reporter: Patrick Fox (Inactive)
Votes: 0
Watchers: 4
Created: 2013/06/26 9:42 AM
Updated: 2021/01/04 11:24 AM
Resolved: 2013/10/14 2:15 PM
