-
Task
-
Resolution: Done
-
Critical
-
5.5.2
-
None
-
False
-
-
False
-
-
The BasicTCP.enableSuspectEvents feature currently lacks any test coverage, even for scenarios involving graceful node departures from the cluster. This test gap has allowed critical issues to go undetected until they surfaced in WildFly.
Test Reproducer:
A new test EnableSuspectEventsTest.testRepeatedLeaveAndJoin() has been created to reproduce this scenario. The test:
- Creates a 3-node cluster using TCP transport with enableSuspectEvents(true)
- Repeatedly makes the coordinator leave gracefully (simulating a node shutting down)
- Verifies that the remaining nodes stay together in a single cluster (rather than forming singleton clusters, as seen in the WildFly testsuite)
- Uses TCPPING with numDiscoveryRuns(3) to reliably trigger the race condition (it is not yet clear why this is required)
- Runs for 500 iterations to ensure stability
- The test mimics a typical WildFly cluster configuration
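The verification step in the list above can be sketched in plain Java. This is only an illustration of the invariant being checked, not the code of EnableSuspectEventsTest: the class `ViewCheck` and the method `singleCluster` are hypothetical names, and node views are modelled as plain lists of node names rather than JGroups types.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not part of JGroups: checks the views reported by the surviving nodes. */
public final class ViewCheck {
    private ViewCheck() {}

    /**
     * @param views map from node name to the membership that node currently reports
     * @return true only if every surviving node reports exactly the set of all
     *         surviving nodes, i.e. there are no singleton or partial clusters
     */
    public static boolean singleCluster(Map<String, List<String>> views) {
        if (views.isEmpty()) {
            return false;
        }
        Set<String> expected = views.keySet();
        for (List<String> view : views.values()) {
            // Member order is ignored here; only membership matters for detecting a split.
            if (view.size() != expected.size() || !new HashSet<>(view).equals(expected)) {
                return false;
            }
        }
        return true;
    }
}
```

In a real test the per-node membership would presumably come from each channel's current view after the coordinator has left, and the check would be retried until a timeout before the iteration is failed.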
Expected Behavior when the coordinator leaves gracefully:
- All remaining nodes should stay in the same cluster (no singleton clusters)
- Connection closures should NOT trigger spurious SUSPECT events that cause cluster splits
- informs: JGRP-2968 BasicTCP.enableSuspectEvents(true) can cause partitions after graceful leave (Resolved)