Details
-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
5.11.4.Final
-
None
Description
We observed the following errors in our logs.
2023-03-14T07:00:46,556Z [/] WARN [com.arjuna.ats.arjuna] (Periodic Recovery) ARJUNA012087: Activate of atomic action with id 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede and type /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/ SubordinateAtomicAction/JCA unexpectedly failed, could not load state. 2023-03-14T07:00:46,556Z [/] FATAL [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016131: Subordinate transaction was not recovered successfully 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede: java.lang.IllegalArgumentException: ARJUNA016147: Cannot recover imported transaction of UID '0:ffffac18fa20:16e3c88c:64071ef9:371f5ede' of transaction 'TransactionImple < jca-subordinate, BasicAction: 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede status: ActionStatus.CREATED >' as transaction base Xid is null. at org.jboss.jts//com.arjuna.ats.internal.jta.transaction.arjunacore.jca.TransactionImporterImple.recoverTransaction(TransactionImporterImple.java:129) at org.jboss.jts//com.arjuna.ats.internal.jta.transaction.arjunacore.jca.TransactionImporterImple.recoverTransaction(TransactionImporterImple.java:52) at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.SubordinateAtomicActionRecoveryModule.periodicWorkFirstPass(SubordinateAtomicActionRecoveryModule.java:74) at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:770) at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:382)
To make long story short, it happens when Periodic Recovery attempts to recover a transaction which is commited during the loop in Periodic Recovery. We've noticed it only when using HornetqJournalStore (jounral store enabled in TM settings in JBoss EAP7)
More details below.
The broader context of logs contains:
2023-03-14T07:00:46,493Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::doCommit() result for action-id (0:ffffac18fa20:16e3c88c:64071ef9:371f5ede) on record id: (0:ffffac18fa20:16e3c88c:64071ef9:371f5ee5) is (TwoPha seOutcome.FINISH_OK) node id: (c56dfce3359fe9c3a1eef4) 2023-03-14T07:00:46,493Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::updateState() for action-id 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede 2023-03-14T07:00:46,493Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) HornetqObjectStore.remove_committed(0:ffffac18fa20:16e3c88c:64071ef9:371f5ede, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/SubordinateAtomicAc tion/JCA) 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] result = false 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 result = true 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] result = false 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 2023-03-14T07:00:46,494Z [/] INFO [org.apache.activemq.transport.failover.FailoverTransport] (ActiveMQ Task-1) Successfully connected to ssl://cert1-amq1.iaws.s4a.aero:61617?soTimeout=10000&soWriteTimeout=10000 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 result = true 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] result = false 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 result = true 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] result = false 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 result = true 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] result = false 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 result = true 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] result = false 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 result = true 2023-03-14T07:00:46,494Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] 2023-03-14T07:00:46,495Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::addChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede adding Thread[default task-21831,5,main] result = false 2023-03-14T07:00:46,495Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 2023-03-14T07:00:46,495Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) BasicAction::removeChildThread () action 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede removing 13996 result = true 2023-03-14T07:00:46,495Z [/] TRACE [com.arjuna.ats.arjuna] (default task-21831) TransactionReaper::remove ( BasicAction: 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede status: ActionStatus.COMMITTED ) 2023-03-14T07:00:46,556Z [/] TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) HornetqObjectStore.read_committed(0:ffffac18fa20:16e3c88c:64071ef9:371f5ede, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/SubordinateAtomicActio n/JCA) 2023-03-14T07:00:46,556Z [/] WARN [com.arjuna.ats.arjuna] (Periodic Recovery) ARJUNA012087: Activate of atomic action with id 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede and type /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/ SubordinateAtomicAction/JCA unexpectedly failed, could not load state. 2023-03-14T07:00:46,556Z [/] FATAL [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016131: Subordinate transaction was not recovered successfully 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede: java.lang.IllegalArgumentException: ARJUNA016147: Cannot recover imported transaction of UID '0:ffffac18fa20:16e3c88c:64071ef9:371f5ede' of transaction 'TransactionImple < jca-subordinate, BasicAction: 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede status: ActionStatus.CREATED >' as transaction base Xid is null. at org.jboss.jts//com.arjuna.ats.internal.jta.transaction.arjunacore.jca.TransactionImporterImple.recoverTransaction(TransactionImporterImple.java:129) at org.jboss.jts//com.arjuna.ats.internal.jta.transaction.arjunacore.jca.TransactionImporterImple.recoverTransaction(TransactionImporterImple.java:52) at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.SubordinateAtomicActionRecoveryModule.periodicWorkFirstPass(SubordinateAtomicActionRecoveryModule.java:74) at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:770) at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:382)
In my opinion the reason is as follows
com.arjuna.ats.internal.jta.recovery.arjunacore.SubordinateAtomicActionRecoveryModule#periodicWorkFirstPass
calls
recoveryStore.allObjUids
In our case HornetqObjectStoreAdaptor implementation is used. So it fetches all UIDs from the store with type: StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/ SubordinateAtomicAction/JCA
(when you dive deeper data is taken from the map com.arjuna.ats.internal.arjuna.objectstore.hornetq.HornetqJournalStore#content)
After all UIDs are fetched the while loop starts iterating over all retrieved uids.
if (recoveryStore.allObjUids(SubordinateAtomicAction.getType(), states) && (states.notempty())) { while (true) { uid = UidHelper.unpackFrom(states); if (uid.notEquals(Uid.nullUid())) { SubordinationManager .getTransactionImporter().recoverTransaction( uid); } else { break; } } }
Then HornetqObjectStore.remove_committed is executed for one of the UID because transaction ends in another thread (as shown in logs above).
And of course remove_committed method removes UID from store (HornetqJournalStore:156) (uid is removed from the map com.arjuna.ats.internal.arjuna.objectstore.hornetq.HornetqJournalStore#content )
RecordInfo record = getContentForType(typeName).remove(uid);
As the result
com.arjuna.ats.internal.jta.transaction.arjunacore.jca.TransactionImporterImple#recoverTransaction
is invoked for actId which is removed from the store and
IllegalArgumentException is thrown in line 129 of com.arjuna.ats.internal.jta.transaction.arjunacore.jca.TransactionImporterImple
because the constructor of
TransactionImple calls
com.arjuna.ats.internal.arjuna.objectstore.hornetq.HornetqJournalStore#read_committed
which return null for given uid:
RecordInfo record = getContentForType(typeName).get(uid); if(record == null) { return null; }
This is also confirmed by another log which can be observed:
Activate of atomic action with id 0:ffffac18fa20:16e3c88c:64071ef9:371f5ede and type /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/ SubordinateAtomicAction/JCA unexpectedly failed, could not load state.
So in another words
1) Periodic Recovery fails in the loop which calls 'recoverTransaction' for each UIDs (taken from the store/map)
2) this fails becasue
recovered.baseXid() is null
3) baseXid is null because it is set in
activate method
4) active method fails because
read_committed retuns null
5) read_commtted returns null beacue
getContentForType(typeName).get(uid) return null
5)
getContentForType(typeName).get(uid) returns null because the store/map doesn't contain the uid because it was removed after transaction commit in another thread.