-
Bug
-
Resolution: Done
-
Major
-
5.0.1.FINAL, 5.1.0.BETA2
-
None
With the blocking state transfer that we have had since 5.0, a deadlock can sometimes occur between the state transfer process and the execution of write/prepare/commit commands.
The commands need to acquire the state transfer lock on the originator and on the key owners, in this order, and the state transfer also needs to acquire the state transfer lock on these nodes but in an undefined order (see ISPN-1106).
This is currently solved by failing fast when acquiring the command's state transfer lock on the remote node, but this means that a write command can fail with a RehashInProgressException, forcing the user to retry.
We can do better and retry the command ourselves after waiting for the state transfer to end.
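
A minimal sketch of the proposed behaviour, not the actual Infinispan implementation: the command is re-invoked after waiting for the state transfer to end, instead of surfacing the exception to the user. The names `RetryOnRehash`, `StateTransferWaiter`, `awaitEnd` and `invokeWithRetry` are hypothetical, and `RehashInProgressException` here is a local stand-in for the real class. The attempt limit and timeout are also assumptions, added so the retry cannot loop forever.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RetryOnRehash {

    // Local stand-in for the exception named in this issue (not the Infinispan class).
    static class RehashInProgressException extends RuntimeException {
        RehashInProgressException(String message) {
            super(message);
        }
    }

    // Hypothetical hook: blocks until the running state transfer has ended,
    // returning false if the timeout expires first.
    interface StateTransferWaiter {
        boolean awaitEnd(long timeout, TimeUnit unit) throws InterruptedException;
    }

    // Invokes the command; if it fails fast because a rehash is in progress,
    // waits for the state transfer to end and tries again, up to maxAttempts times.
    static <T> T invokeWithRetry(Callable<T> command, StateTransferWaiter waiter,
                                 int maxAttempts, long timeoutMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RehashInProgressException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return command.call();
            } catch (RehashInProgressException e) {
                last = e;
                // Do not retry blindly: wait until the state transfer is over.
                if (!waiter.awaitEnd(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    throw new TimeoutException("State transfer did not end in time");
                }
            }
        }
        // All attempts hit a rehash; report the last failure to the caller.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated write command: fails on the first call, succeeds on the second.
        String result = invokeWithRetry(() -> {
            if (calls[0]++ == 0) {
                throw new RehashInProgressException("rehash in progress");
            }
            return "OK";
        }, (timeout, unit) -> true, 3, 1000L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With this shape the user only sees a failure if the state transfer does not finish within the timeout or the command keeps hitting new rehashes on every attempt.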