Status: Closed
Here's the sequence that might cause the leak:
- a tx locking K is prepared and committed on node A
- node B joins and becomes the primary owner of K
- the tx and its lock on K are moved from A to B as part of the state transfer
- the tx completion notification is sent (async+oob) to A and not to B, because the originator still uses the old view
- A has no way to reply to the originator and tell it to re-send the tx completion notification to B, because the call is async+oob
- as a result, the tx and its lock on K leak on B
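The sequence above can be replayed with a toy model. This is only a sketch: `Node`, `onTxCompletion` and `replay` are hypothetical names made up for illustration and are not Infinispan classes or APIs; state transfer is reduced to moving one map entry.

```java
import java.util.HashMap;
import java.util.Map;

public class TxLeakSketch {
    /** Hypothetical stand-in for a cluster node; not an Infinispan class. */
    static final class Node {
        final String name;
        // key -> id of the tx holding the lock on this node
        final Map<String, String> locks = new HashMap<>();

        Node(String name) { this.name = name; }

        // Models the async+oob completion: fire-and-forget, so nothing
        // is ever reported back to the originator.
        void onTxCompletion(String txId) {
            locks.values().removeIf(txId::equals);
        }
    }

    /** Replays the sequence and returns the locks left on B. */
    static Map<String, String> replay() {
        Node a = new Node("A");
        Node b = new Node("B");
        a.locks.put("K", "tx1");                // tx1 prepared and committed on A
        b.locks.put("K", a.locks.remove("K"));  // state transfer moves the lock to B
        a.onTxCompletion("tx1");                // originator still targets A (old view)
        return b.locks;                         // nobody ever tells B to clean up
    }

    public static void main(String[] args) {
        System.out.println("Locks left on B: " + replay());
    }
}
```

In this model B still holds the lock on K for `tx1` after the completion notification has been handled, which is the leak described above.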
After a chat with Dan, the following fix seemed to be the most appropriate:
when a node receives a tx completion notification for a tx that is being migrated away as a result of state transfer, it forwards the notification to the new lock owner. With the current blocking ST, this forward call proceeds only after the tx state has been transferred to the new owner. This would need some more thought for the new NBST code.
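A minimal sketch of the forwarding idea, under the same caveat: `Node`, `migrate`, `migratedTo` and `onTxCompletion` are hypothetical names for illustration, not the actual Infinispan code. The migration here is synchronous, which mimics the blocking ST ordering (forward only after the tx state has arrived at the new owner); it says nothing about how NBST would have to handle this.

```java
import java.util.HashMap;
import java.util.Map;

public class TxForwardSketch {
    /** Hypothetical stand-in for a cluster node; not an Infinispan class. */
    static final class Node {
        // key -> id of the tx holding the lock on this node
        final Map<String, String> locks = new HashMap<>();
        // tx id -> node that took over the tx during state transfer
        final Map<String, Node> migratedTo = new HashMap<>();

        // Moves the lock on key to newOwner and remembers where the tx went.
        void migrate(String key, Node newOwner) {
            String txId = locks.remove(key);
            if (txId == null) {
                return; // nothing locked under this key
            }
            newOwner.locks.put(key, txId);
            migratedTo.put(txId, newOwner);
        }

        // Cleans up locally, then forwards to the new lock owner if the tx
        // was migrated away from this node.
        void onTxCompletion(String txId) {
            locks.values().removeIf(txId::equals);
            Node newOwner = migratedTo.remove(txId);
            if (newOwner != null) {
                newOwner.onTxCompletion(txId);
            }
        }
    }

    /** Returns true if B holds no locks after A handles the notification. */
    static boolean lockReleasedOnNewOwner() {
        Node a = new Node();
        Node b = new Node();
        a.locks.put("K", "tx1");   // tx1 prepared and committed on A
        a.migrate("K", b);         // state transfer: B becomes the owner of K
        a.onTxCompletion("tx1");   // notification still arrives at A (old view)
        return b.locks.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("Lock released on B: " + lockReleasedOnNewOwner());
    }
}
```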
Note: there's no issue when nodes leave the cluster, as the current backup-node logic ensures a proper cleanup.