Basic resilience scenario in library mode:
- spawn 4 nodes
- kill one node (kill = insert DISCARD protocol and then call CacheManger.stop())
- start one node
We use RadarGun (with some modifications) for this test.
When the node is started again the system looses some messages as some timeouts are triggered (and the RadarGun stage acknowledgement is missing as well) and later put/get requests cause exceptions.
The trace logs are too big, therefore, these are published on http://dl.dropbox.com/u/103079234/serverlogs_trace.zip
Note that the test was shutdown manually.