- Bug
- Resolution: Done
- Major
- 7.0.0.Alpha4
InboundInvocationHandlerImpl is supposed to ignore commands sent with a topology id smaller than the first topology id in which the local node was a member. But there is a loophole: a command sent with topology id 0 is not ignored.
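To make the intended rule concrete, here is a minimal sketch of such a filter. It is not the actual InboundInvocationHandlerImpl code: the class and method names are hypothetical, and the assumption that the loophole comes from special-casing topology id 0 is a guess at the mechanism, used only for illustration.

```java
// Hypothetical sketch, not Infinispan code. All names are illustrative.
final class TopologyFilterSketch {
    // First topology id in which the local node was a member.
    private final int firstTopologyAsMember;

    TopologyFilterSketch(int firstTopologyAsMember) {
        this.firstTopologyAsMember = firstTopologyAsMember;
    }

    // ASSUMPTION: the loophole is modelled as treating id 0 as
    // "no topology information" and letting the command through.
    boolean acceptsWithLoophole(int commandTopologyId) {
        if (commandTopologyId == 0) {
            return true;
        }
        return commandTopologyId >= firstTopologyAsMember;
    }

    // Strict variant: every command older than the first topology
    // in which the node was a member is ignored, including id 0.
    boolean acceptsStrict(int commandTopologyId) {
        return commandTopologyId >= firstTopologyAsMember;
    }
}
```

With a node that first became a member in topology 5, the loophole variant accepts a command with topology id 0 but ignores one with topology id 3, while the strict variant ignores both.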
This is visible in StateTransferFunctionalTest, where the writing thread keeps the CPU busy and can delay the second node joining for a long time (especially when run on a single core with taskset -c 0). For some reason, the PrepareCommands are sent only to the local node, while the TxCompletionNotificationCommands are sent to the entire cluster (null target list). When the second node manages to join, it receives a lot of TxCompletionNotificationCommands, and processing them delays the processing of the rebalance commands. Since the writes eventually block waiting for the new topology to be installed on the joiner, the delayed rebalance commands cause the writes to time out.