-
Bug
-
Resolution: Done
-
Major
-
7.1.1.Final
-
None
Transaction gets recommited after a node joins a cluster during merge and requests retransmission of previous messages, overwriting updates which happened till that point.
4 nodes - edg-perf01-04
Scenario:
1. edg-perf01 begins a new transaction & updates value of K
06:50:50,421 TRACE [org.infinispan.transaction.xa.TransactionXaAdapter] (DefaultStressor-7) start called on tx GlobalTransaction:<edg-perf01-19349>:3516:local
The latest view on edg-perf01 is
06:50:35,103 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,edg-perf01-19349) ISPN000093: Received new, MERGED cluster view: MergeView::[edg-perf01-19349|5] (3) [edg-perf01-19349, edg-perf04-20753, edg-perf02-19191], 1 subgroups: [edg-perf01-19349|4] (2) [edg-perf01-19349, edg-perf02-19191]
2. Prepare command is sent with nodes edg-perf02 & edg-perf04 responding successfully
06:50:51,186 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (DefaultStressor-7) edg-perf01-19349 invoking PrepareCommand ... topologyId=12} to recipient list [edg-perf02-19191, edg-perf04-20753, edg-perf01-19349] ...
check message id & seqno
(06:50:51,195 TRACE [org.jgroups.protocols.TCP] (DefaultStressor-7) edg-perf01-19349: sending msg to null, src=edg-perf01-19349, headers are RequestCorrelator: id=200, type=REQ, id=17134, rsp_expected=true, exclusion_list=[edg-perf01-19349], NAKACK2: [MSG, seqno=1020], 20903: slaveIndex=0, TCP: [cluster_name=default])
3. Transaction gets commited & value of K gets updated multiple times on edg-perf01 from this point on
4. edg-perf03 receives a new view containing all nodes
06:50:53,963 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,edg-perf03-14171) ISPN000093: Received new, MERGED cluster view: MergeView::[edg-perf01-19349|6] (4) [edg-perf01-19349, edg-perf03-14171, edg-perf04-20753, edg-perf02-19191], 3 subgroups: [edg-perf03-14171|4] (2) [edg-perf03-14171, edg-perf04-20753], [edg-perf01-19349|5] (3) [edg-perf01-19349, edg-perf04-20753, edg-perf02-19191], [edg-perf01-19349|4] (2) [edg-perf01-19349, edg-perf02-19191]
5. edg-perf03 apparently requests retransmission of previous messages (containing prepare from step#2) & edg-perf01 sends the response
Request
06:51:35,241 TRACE [org.jgroups.protocols.pbcast.NAKACK2] (Timer-3,edg-perf03-14171) edg-perf03-14171: sending XMIT_REQ ((79): {1015-1093}) to edg-perf01-19349
Response
06:51:35,299 TRACE [org.jgroups.protocols.TCP] (INT-14,edg-perf03-14171) edg-perf03-14171: received [dst: edg-perf03-14171, src: edg-perf01-19349 (4 headers), size=975 bytes, flags=OOB|DONT_BUNDLE|NO_TOTAL_ORDER|INTERNAL], headers are RequestCorrelator: id=200, type=REQ, id=17134, rsp_expected=true, exclusion_list=[edg-perf01-19349], NAKACK2: [XMIT_RSP, seqno=1020], 20903: slaveIndex=0, TCP: [cluster_name=default]
6. edg-perf03 prepares the old transaction & eventually commits it again
06:51:35,304 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] (INT-14,edg-perf03-14171) Attempting to execute command: PrepareCommand.....gtx=GlobalTransaction:<edg-perf01-19349>:3516:local, cacheName='testCache', topologyId=12} [sender=edg-perf01-19349] 06:51:35,304 TRACE [org.infinispan.statetransfer.StateTransferLockImpl] (INT-14,edg-perf03-14171) Checking if transaction data was received for topology 12, current topology is 14 06:51:35,305 TRACE [org.infinispan.transaction.TransactionTable] (remote-thread-40) Created and registered remote transaction ... tx=GlobalTransaction:<edg-perf01-19349>:3516:remote, ... 06:51:35,308 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (remote-thread-40) edg-perf03-14171 invoking PrepareCommand ... gtx=GlobalTransaction:<edg-perf01-19349>:3516:remote, cacheName='testCache', topologyId=15} to recipient list [edg-perf04-20753, edg-perf02-19191] ...
Successful commit responses
06:51:36,750 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (remote-thread-44) Response(s) to CommitCommand {gtx=GlobalTransaction:<edg-perf01-19349>:3516:remote, cacheName='testCache', topologyId=15} is {edg-perf02-19191=SuccessfulResponse{responseValue=null} , edg-perf04-20753=SuccessfulResponse{responseValue=null} }
7. When get(K) is invoked on edg-perf01, stale value is returned (the one from step #6, ignoring updates which have happened since step #3)
- is incorporated by
-
ISPN-4546 Possible stale lock when the primary owner leaves during rebalance
- Closed