  1. Infinispan
  2. ISPN-1093

If an exception occurs during the one-phase commit optimization, the local cache changes are still applied

Details

    • I have a test case that almost always reproduces the error, but it depends on our own code to compile. I have condensed it down to an Infinispan-only test, but with that test the error occurs much less frequently, since it is a timing-related issue. I don't see a way to attach the file when submitting, but I can attach it later.
    • Workaround Exists
    • If Infinispan is always enlisted together with another resource, which forces a 2PC, this problem does not occur.
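      One way the workaround might be applied is to enlist a second, do-nothing XA resource in every transaction so that the transaction manager cannot take the one-phase path. The class below is a minimal sketch, not part of Infinispan or of the reporter's code: the name NoOpXAResource is hypothetical, and it assumes the transaction manager only skips 2PC when a single resource is enlisted or when the other resources vote read-only, which is why prepare returns XA_OK rather than XA_RDONLY.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/**
 * Hypothetical do-nothing XA resource, enlisted only so that the
 * transaction has more than one participant and a 2PC is used.
 */
class NoOpXAResource implements XAResource {

    private int timeoutSeconds = 0;

    @Override public void start(Xid xid, int flags) throws XAException { }

    @Override public void end(Xid xid, int flags) throws XAException { }

    // Vote XA_OK (not XA_RDONLY) so this participant stays in the commit phase.
    @Override public int prepare(Xid xid) throws XAException { return XA_OK; }

    @Override public void commit(Xid xid, boolean onePhase) throws XAException { }

    @Override public void rollback(Xid xid) throws XAException { }

    @Override public void forget(Xid xid) throws XAException { }

    // Nothing is ever left in doubt, so there is nothing to recover.
    @Override public Xid[] recover(int flag) throws XAException { return new Xid[0]; }

    // Only equal to itself, so it is never merged with another resource manager.
    @Override public boolean isSameRM(XAResource other) throws XAException {
        return other == this;
    }

    @Override public int getTransactionTimeout() throws XAException { return timeoutSeconds; }

    @Override public boolean setTransactionTimeout(int seconds) throws XAException {
        timeoutSeconds = seconds;
        return true;
    }
}
```

      With a JTA transaction manager available, the resource would be enlisted after the transaction begins, for example through Transaction.enlistResource(new NoOpXAResource()). Whether this actually forces a 2PC depends on the transaction manager's own optimizations, so it should be checked against the trace output.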

    Description

      I have been implementing an Infinispan clustered cache with transactions. When a transaction is rolled back, its changes are not pushed to the local cache.

      However, I found that when the one-phase commit optimization is used, with Infinispan as the only enlisted resource, and a node in the cluster is found in a Suspect state during prepare, a RollbackException is thrown, which is correct. The local node nevertheless pushes the changes to its own cache. This leaves the local cache inconsistent with the rest of the nodes in the cluster, since the updates were never pushed to them.

      Here is a snippet of the trace when this occurs:
      1 ider T 10:18:52,698 (Transactio) [0] Committing transaction GlobalTransaction:<wburns-53118>:6:local
      1 ider T 10:18:52,698 (Transactio) [0] Doing an 1PC prepare call on the interceptor chain
      1 ider T 10:18:52,698 (Invocation) [0] Invoked with command PrepareCommand {gtx=GlobalTransaction:<wburns-53118>:6:local, modifications=[RemoveCommand{key=key, value=null, flags=null}], onePhaseCommit=true, gtx=GlobalTransaction:<wburns-53118>:6:local, cacheName='test'} and InvocationContext [LocalTxInvocationContext{flags=null}]
      1 ider T 10:18:52,698 (CallInterc) [0] Suppressing invocation of method handlePrepareCommand.
      1 ider T 10:18:52,698 (Distributi) [0] Not performing invalidation! isL1CacheEnabled? true numCallRecipients=2
      1 ider T 10:18:52,699 (RpcManager) [0] wburns-53118 broadcasting call PrepareCommand {gtx=GlobalTransaction:<wburns-53118>:6:local, modifications=[RemoveCommand{key=key, value=null, flags=null}], onePhaseCommit=true, gtx=GlobalTransaction:<wburns-53118>:6:local, cacheName='test'} to recipient list [wburns-38125, wburns-53118]
      1 ider T 10:18:52,699 (JGroupsTra) [0] dests=[wburns-38125, wburns-53118], command=PrepareCommand {gtx=GlobalTransaction:<wburns-53118>:6:local, modifications=[RemoveCommand{key=key, value=null, flags=null}], onePhaseCommit=true, gtx=GlobalTransaction:<wburns-53118>:6:local, cacheName='test'}, mode=SYNCHRONOUS, timeout=15000
      1 ider T 10:18:52,701 (CommandAwa) [0] Replication task sending PrepareCommand {gtx=GlobalTransaction:<wburns-53118>:6:local, modifications=[RemoveCommand{key=key, value=null, flags=null}], onePhaseCommit=true, gtx=GlobalTransaction:<wburns-53118>:6:local, cacheName='test'} to addresses [wburns-38125, wburns-53118]
      1 ider T 10:18:52,703 (JBossMarsh) [0] Start marshaller after retrieving marshaller from thread local
      1 ider T 10:18:52,703 (VersionAwa) [0] Wrote version 420
      1 ider T 10:18:52,704 (JBossMarsh) [0] Stop marshaller
      1 ider T 10:18:52,704 (CommandAwa) [0] real_dests=[wburns-38125]
      37 ider T 10:18:52,720 (JGroupsDis) Closing joinInProgress gate
      37 ider I 10:18:53,028 (JGroupsTra) Received new cluster view: [wburns-53118|2] [wburns-53118]
      1 ider T 10:18:53,032 (CommandAwa) [0] Responses: [sender=wburns-38125, retval=null, received=false, suspected=true]

      37 ider T 10:18:53,028 (Transactio) Saw 1 leavers - kicking off a lock breaking task
      1 ider T 10:18:53,034 (RpcManager) [0] replication exception:
      org.infinispan.remoting.transport.jgroups.SuspectException: Suspected member: wburns-38125
      .... full stack trace removed for clarity ....
      1 ider T 10:18:53,037 (DistLockin) [0] Number of entries in context: 1
      41 ider T 10:18:53,036 (Transactio) No global transactions pertain to originator(s) [wburns-38125] who have left the cluster.
      37 ider I 10:18:53,036 (Distributi) Detected a view change. Member list changed from [wburns-53118, wburns-38125] to [wburns-53118]
      41 ider T 10:18:53,038 (Transactio) Completed cleaning stale locks.
      37 ider I 10:18:53,038 (Distributi) This is a LEAVE event! Node wburns-38125 has just left
      1 ider T 10:18:53,038 (ReadCommit) [0] Updating entry (key=key removed=true valid=false changed=true created=false value=com.redprairie.moca.cache.Maybe@5bd6c768]
      1 ider T 10:18:53,039 (DistLockin) [0] Releasing lock on [key] for owner GlobalTransaction:<wburns-53118>:6:local
      37 ider T 10:18:53,039 (Distributi) Added new leaver wburns-38125, leavers list is [wburns-38125]
      1 ider T 10:18:53,039 (LockManage) [0] Attempting to unlock key
      1 ider E 10:18:53,039 (Invocation) [0] Execution error:
      org.infinispan.remoting.transport.jgroups.SuspectException: Suspected member: wburns-38125
      .... full stack trace removed for clarity ....
      1 ider T 10:18:53,041 (Invocation) [0] Transaction marked for rollback as exception was received.
      1 ider T 10:18:53,041 (jta ) [0] TransactionImple.getStatus
      1 ider E 10:18:53,041 (Transactio) [0] Error while processing 1PC PrepareCommand
      org.infinispan.remoting.transport.jgroups.SuspectException: Suspected member: wburns-38125
      .... full stack trace removed for clarity ....
      1 ider T 10:18:53,043 (jta ) [0] TransactionImple.equals
      37 ider T 10:18:53,040 (Distributi) wburns-53118 is looking for a new backup to replace leaver wburns-38125
      37 ider T 10:18:53,044 (Distributi) Leaver wburns-38125 main backup wburns-53118 is looking for another backup as well.
      37 ider T 10:18:53,044 (Distributi) Nodes that need new backups are: [wburns-53118, wburns-53118]
      37 ider T 10:18:53,044 (Distributi) This node won't receive state
      37 ider I 10:18:53,044 (Distributi) I wburns-53118 am participating in rehash, state providers [wburns-53118, wburns-53118], state receivers [wburns-53118, wburns-53118]
      1 ider W 10:18:53,045 (jta ) [0] ARJUNA-16039 onePhaseCommit on < formatId=131076, gtrid_length=29, bqual_length=28, tx_uid=0:ffff0a0302c6:ba1e:4dc805d1:16, node_name=1, branch_uid=0:ffff0a0302c6:ba1e:4dc805d1:17, eis_name=unknown eis name > (TransactionXaAdapter{localTransaction=LocalTransaction{remoteLockedNodes=null, isMarkedForRollback=false, transaction=TransactionImple < ac, BasicAction: 0:ffff0a0302c6:ba1e:4dc805d1:16 status: ActionStatus.COMMITTING >, xid=< formatId=131076, gtrid_length=29, bqual_length=28, tx_uid=0:ffff0a0302c6:ba1e:4dc805d1:16, node_name=1, branch_uid=0:ffff0a0302c6:ba1e:4dc805d1:17, eis_name=unknown eis name >} org.infinispan.transaction.xa.LocalTransaction@9c35300f}) failed with exception XAException.XAER_RMERR
      javax.transaction.xa.XAException
      .... full stack trace removed for clarity ....
      1 ider T 10:18:53,046 (arjuna ) [0] BasicAction::removeChildThread () action 0:ffff0a0302c6:ba1e:4dc805d1:16 removing TSThread:1
      1 ider T 10:18:53,046 (arjuna ) [0] BasicAction::removeChildThread () action 0:ffff0a0302c6:ba1e:4dc805d1:16 removing TSThread:1 result = true
      1 ider T 10:18:53,047 (arjuna ) [0] TransactionReaper::remove ( BasicAction: 0:ffff0a0302c6:ba1e:4dc805d1:16 status: ActionStatus.ABORTED )
      RollbackException encountered - ARJUNA-16053 Could not commit transaction.
      1 ider T 10:18:53,047 (jta ) [0] BaseTransaction.begin

      The significant line is:

      1 ider T 10:18:53,038 (ReadCommit) [0] Updating entry (key=key removed=true valid=false changed=true created=false value=com.redprairie.moca.cache.Maybe@5bd6c768]

      It shows that, under the ReadCommitted isolation level, the remove call was still pushed to the local cache even though it should have been rolled back.
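      The behaviour I would expect can be illustrated with a small toy model. This is only a sketch and is not Infinispan's interceptor code: OnePhaseCommitSketch, Replicator, and commitOnePhase are hypothetical names, and a plain HashMap stands in for the local cache. The point is the ordering: the local cache is touched only after the one-phase prepare has been replicated successfully.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; none of these types are Infinispan classes. */
class OnePhaseCommitSketch {

    /** Hypothetical stand-in for the synchronous PrepareCommand broadcast. */
    interface Replicator {
        void replicate(Map<String, String> modifications) throws Exception;
    }

    private final Map<String, String> localCache = new HashMap<>();

    String get(String key) {
        return localCache.get(key);
    }

    /** Seeds the local cache outside of any transaction. */
    void putDirect(String key, String value) {
        localCache.put(key, value);
    }

    /**
     * One-phase commit: replicate first, then apply locally only on success.
     * A null value in the modifications map models a remove.
     * Returns true if committed, false if rolled back.
     */
    boolean commitOnePhase(Map<String, String> modifications, Replicator replicator) {
        try {
            replicator.replicate(modifications);
        } catch (Exception e) {
            // Replication failed (for example, a suspected member):
            // discard the modifications and leave localCache untouched.
            return false;
        }
        for (Map.Entry<String, String> entry : modifications.entrySet()) {
            if (entry.getValue() == null) {
                localCache.remove(entry.getKey());
            } else {
                localCache.put(entry.getKey(), entry.getValue());
            }
        }
        return true;
    }
}
```

      In this model, a remove of "key" whose replication throws leaves "key" in the local cache, which is the outcome the trace above fails to show.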


          People

            mircea.markus Mircea Markus (Inactive)
            rpwburns William Burns (Inactive)