Bug
Resolution: Done
Major
If num_discovery_runs is greater than 1, startup of a member sometimes blocks.
The stack trace below indicates a problem with the comparison of Task (Future) elements in the ConcurrentSkipListSet.
Solution: replace the set with an ArrayList. There is no need to sort the futures or to avoid duplicates, as the list is only ever added to or cleared.
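The proposed change can be illustrated with a minimal sketch. It is not the actual JGroups code: the class name DiscoveryFutures, its methods, and the use of synchronized for thread safety are assumptions made for illustration only. It shows that a plain list supports the two operations described (add and clear) without ever comparing the futures to each other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical holder for discovery futures; not the actual Discovery class. */
class DiscoveryFutures {
    // A plain list: adding an element never compares it with existing elements,
    // unlike a sorted set, which must find the insertion position by comparison.
    private final List<Future<?>> futures = new ArrayList<>();

    // Assumption: access is guarded by the object's monitor, since ArrayList
    // itself is not thread-safe.
    public synchronized void add(Future<?> f) {
        futures.add(f);
    }

    public synchronized int size() {
        return futures.size();
    }

    /** Cancels every tracked future, clears the list, and returns how many were removed. */
    public synchronized int cancelAll() {
        int n = futures.size();
        for (Future<?> f : futures)
            f.cancel(true);
        futures.clear();
        return n;
    }

    public static void main(String[] args) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        DiscoveryFutures tasks = new DiscoveryFutures();
        int numDiscoveryRuns = 3; // stands in for num_discovery_runs > 1
        for (int i = 0; i < numDiscoveryRuns; i++)
            tasks.add(timer.schedule(() -> {}, 10, TimeUnit.SECONDS));
        System.out.println("scheduled: " + tasks.size());      // prints "scheduled: 3"
        System.out.println("cancelled: " + tasks.cancelAll()); // prints "cancelled: 3"
        timer.shutdownNow();
    }
}
```

Because the list is only appended to and cleared, neither ordering nor duplicate detection is needed, so the comparison step seen in the stack trace does not occur in this arrangement.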
"main" #1 prio=5 os_prio=31 tid=0x00007ff886001800 nid=0x1c03 runnable [0x000070000f96f000] java.lang.Thread.State: RUNNABLE at java.util.concurrent.ConcurrentSkipListMap.findPredecessor(ConcurrentSkipListMap.java:684) at java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:823) at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(ConcurrentSkipListMap.java:1979) at java.util.concurrent.ConcurrentSkipListSet.add(ConcurrentSkipListSet.java:241) at org.jgroups.protocols.Discovery.findMembers(Discovery.java:235) at org.jgroups.protocols.Discovery.down(Discovery.java:380) at org.jgroups.protocols.MERGE3.down(MERGE3.java:278) at org.jgroups.protocols.FD_SOCK.down(FD_SOCK.java:377) at org.jgroups.protocols.FD_ALL.down(FD_ALL.java:235) at org.jgroups.protocols.VERIFY_SUSPECT.down(VERIFY_SUSPECT.java:102) at org.jgroups.protocols.BARRIER.down(BARRIER.java:136) at org.jgroups.protocols.pbcast.NAKACK2.down(NAKACK2.java:553) at org.jgroups.protocols.UNICAST3.down(UNICAST3.java:581) at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:347) at org.jgroups.protocols.pbcast.ClientGmsImpl.joinInternal(ClientGmsImpl.java:72) at org.jgroups.protocols.pbcast.ClientGmsImpl.join(ClientGmsImpl.java:40) at org.jgroups.protocols.pbcast.GMS.down(GMS.java:1044) at org.jgroups.protocols.FlowControl.down(FlowControl.java:295) at org.jgroups.protocols.FlowControl.down(FlowControl.java:295) at org.jgroups.protocols.FRAG2.down(FRAG2.java:141) at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:928) at org.jgroups.JChannel.down(JChannel.java:627) at org.jgroups.JChannel._connect(JChannel.java:855) at org.jgroups.JChannel.connect(JChannel.java:352) - locked <0x000000079e04cfa0> (a org.jgroups.JChannel) at org.jgroups.JChannel.connect(JChannel.java:343) - locked <0x000000079e04cfa0> (a org.jgroups.JChannel) at org.jgroups.tests.bla6.start(bla6.java:41) at org.jgroups.tests.bla6.main(bla6.java:54)