- Bug
- Resolution: Done
- Critical
- None
- None
The attached ZIP file has code that reproduces this.
To test the other stacks, change pf.cluster.transport.protocol=udp in props.props to "tcp" or "tcp-nio".
To reproduce:
- Start a number of instances (e.g. 5) concurrently. This almost never works, even under "udp". The joiners' JOIN requests time out and have to be retried (possibly because the coordinator is busy with the FLUSH protocol). The joiners become singleton members and never merge!
- This works fine without FLUSH
- With "udp", it works almost always; with "tcp", it works 50% of the time; with "tcp-nio", it almost never works
- Randomly kill and restart instances
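
The reproduction steps above could be driven by a small launcher along these lines. This is only a sketch and not part of the attached ZIP: the jar name "repro.jar", the main class "ReproMain", and the 30-second wait are placeholders, so substitute whatever the attached code actually uses.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ConcurrentStarter {

    /** Builds the command line for one instance, using the JVM that runs this launcher. */
    static List<String> buildCommand(String classpath, String mainClass) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        cmd.add(classpath);
        cmd.add(mainClass);
        return cmd;
    }

    /** Starts 'count' instances back to back, so they try to join at nearly the same time. */
    static List<Process> startAll(List<String> command, int count) throws IOException {
        List<Process> processes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            ProcessBuilder pb = new ProcessBuilder(command);
            pb.inheritIO(); // show each instance's output in this console
            processes.add(pb.start());
        }
        return processes;
    }

    public static void main(String[] args) throws Exception {
        // Placeholders: replace with the jar and main class from the attached ZIP.
        List<String> cmd = buildCommand("repro.jar", "ReproMain");
        List<Process> instances = startAll(cmd, 5);

        Thread.sleep(30_000); // arbitrary wait before disturbing the cluster

        // Kill one randomly chosen instance and start a replacement.
        int victim = new Random().nextInt(instances.size());
        instances.get(victim).destroyForcibly().waitFor();
        instances.set(victim, startAll(cmd, 1).get(0));
        // The launcher exits here; the child instances keep running.
    }
}
```

After the launcher has run, compare the views logged by each instance: singleton views that never merge would indicate the problem described here.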