-
Bug
-
Resolution: Done
-
Major
-
None
-
None
-
None
-
False
-
None
-
False
-
-
This Jira aims to improve the fix provided with https://issues.redhat.com/browse/AG-135
While testing a corner case (XA Transaction Recovery) an issue during the Recovery of an XA Transaction led to the following error:
2023-11-22 15:21:17,102 DEBUG [org.jbo.jca.cor.tx.jbo.XAResourceRecoveryImpl] (Periodic Recovery) Recovery XAResource=XAResourceWrapperImpl@6248ffa3[xaResource=com.unisys.dtp.connector.DtpXAResource@5688ffd2 pad=false overrideRmValue=null productName=DTP Connector productVersion=HMP 15.2 jndiName=null] for null 2023-11-22 15:21:17,103 DEBUG [com.arj.ats.jta] (Periodic Recovery) xarecovery of NameScopedXAResource{xaResource=io.agroal.narayana.RecoveryXAResource@52e3b2a8, jndiName='null'} 2023-11-22 15:21:17,103 FINE [com.mic.sql.jdb.int.SQLServerStatement] (Periodic Recovery) SQLServerCallableStatement:4 created by (ConnectionID:6 ClientConnectionId: dd631c3a-2c81-4d7a-b2d9-b6a11c401ba0) 2023-11-22 15:21:17,104 FINE [com.mic.sql.jdb.int.SQLServerStatement] (Periodic Recovery) SQLServerCallableStatement:4: calling sp_executesql: SQL:EXEC master..xp_sqljdbc_xa_recover @P0 OUT, @P1 OUT, @P2, @P3 OUT 2023-11-22 15:21:17,174 DEBUG [com.arj.ats.jta] (Periodic Recovery) Found 1 xids in doubt 2023-11-22 15:21:17,175 DEBUG [com.arj.ats.jta] (Periodic Recovery) Recovered: < 131077, 35, 36, 0000000000-1-1-64-88295500-532910193-50-94000911311797114107117115, 0000000000-1-1-64-88295500-532910193-50-940001500000000 > 2023-11-22 15:21:17,179 DEBUG [com.arj.ats.jta] (Periodic Recovery) xarecovery of NameScopedXAResource{xaResource=XAResourceWrapperImpl@6248ffa3[xaResource=com.unisys.dtp.connector.DtpXAResource@5688ffd2 pad=false overrideRmValue=null productName=DTP Connector productVersion=HMP 15.2 jndiName=null], jndiName='null'} 2023-11-22 15:21:17,190 DEBUG [com.arj.ats.jta] (Periodic Recovery) Found 1 xids in doubt 2023-11-22 15:21:17,191 DEBUG [com.arj.ats.jta] (Periodic Recovery) Recovered: < 131077, 35, 36, 0000000000-1-1-64-88295500-532910193-50-94000911311797114107117115, 0000000000-1-1-64-88295500-532910193-50-940001100000000 > 2023-11-22 15:21:17,192 FINE [com.mic.sql.jdb.int.SQLServerStatement] (Periodic Recovery) SQLServerCallableStatement:4: calling sp_prepexec: PreparedHandle:0, SQL:EXEC master..xp_sqljdbc_xa_recover @P0 OUT, @P1 OUT, @P2, @P3 OUT 2023-11-22 15:21:17,414 DEBUG [com.arj.ats.jta] (Periodic Recovery) XARecoveryModule state change FIRST_PASS->IDLE 2023-11-22 15:21:17,418 DEBUG [com.arj.ats.arjuna] (Periodic Recovery) RecoverAtomicAction.replayPhase2 recovering 0:ffffc0a81d37:cb1d:655dcea2:9 ActionStatus is ActionStatus.COMMITTED 2023-11-22 15:21:19,566 FINE [com.mic.sql.jdb.int.SQLServerException] (Periodic Recovery) *** SQLException: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed. The connection is closed. 2023-11-22 15:21:19,568 WARN [com.arj.ats.jta] (Periodic Recovery) ARJUNA016036: commit on < formatId=131077, gtrid_length=35, bqual_length=36, tx_uid=0:ffffc0a81d37:cb1d:655dcea2:9, node_name=quarkus, branch_uid=0:ffffc0a81d37:cb1d:655dcea2:f, subordinatenodename=null, eis_name=0 > (io.agroal.narayana.RecoveryXAResource@52e3b2a8) failed with exception $XAException.XAER_RMERR: javax.transaction.xa.XAException: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed. at com.microsoft.sqlserver.jdbc.SQLServerXAResource.DTC_XA_Interface(SQLServerXAResource.java:757) at com.microsoft.sqlserver.jdbc.SQLServerXAResource.commit(SQLServerXAResource.java:826) at io.agroal.narayana.RecoveryXAResource.commit(RecoveryXAResource.java:40)
A detailed analysis has been provided by Jonathan Halliday which I'm copying here
"Agroal appears to be aiming at: "TMENDSCAN is the last call in the scan
sequence. If there is nothing to recover, we'll never be called again,
so we can clean up". But that's subltly flawed: recovery scans work
like an iterator or cursor, because they were designed by db people.
TMSTARTSCAN may return results, then TMENDSCAN may return none, not
because there are none overall, but because they have already been
returned by the TMSTARTSCAN. The behaviour it actually needs is: for
each recover call, set a marker if the result set is non-empty. When
TMENDSCAN is used, close if the marker has not previously been set. That
is, it needs memory of the whole cycle, not just the most recent call.
Interesting trivia: this is not exposed when testing on mysql, because
mysql ignores the flags and returns the whole result set regardless.
mssqlserver, on the other hand...
Anyhow, the intended behaviour of Agroal would still leak, because where
there are things to recover it can't reason about when the XAResource
will be disposed, so it can't close the connection. There is no
guarantee made that all recovery activity will take place before
TMENDSCAN. But having things to recover is rare, whilst scanning for
them is common (every few minutes) so at least this slows the leaking to
a tolerable level."
- account is impacted by
-
AG-238 XA recovery connection leaks
- Closed