Type: Bug
Resolution: Duplicate
Priority: Blocker
Fix Version/s: None
Affects Version/s: 8.1.0.Beta
Component/s: Server, Transactions
Labels: None

Blocked: False
Blocked Reason: None
Ready: False
Target Release: 8.1.0.GA
In the messaging CI pipeline, there are rare failures in which the server does not stop within the 3-minute timeout. The thread dumps (see attached files) suggest that ArjunaRecoveryManagerService is preventing the server from shutting down:

"MSC service thread 1-3" #21 prio=5 os_prio=0 cpu=424.05ms elapsed=365.49s tid=0x00007fd720002a50 nid=0x19995 in Object.wait()  [0x00007fd7dcb84000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@17.0.12/Native Method)
        - waiting on <no object reference available>
        at java.lang.Object.wait(java.base@17.0.12/Object.java:338)
        at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.shutdown(PeriodicRecovery.java:163)
        - locked <0x00000000d19998c0> (a java.lang.Object)
        at com.arjuna.ats.internal.arjuna.recovery.RecoveryManagerImple.stop(RecoveryManagerImple.java:152)
        at com.arjuna.ats.arjuna.recovery.RecoveryManager.terminate(RecoveryManager.java:186)
        - locked <0x00000000d1b45848> (a com.arjuna.ats.arjuna.recovery.RecoveryManager)
        at com.arjuna.ats.arjuna.recovery.RecoveryManager.terminate(RecoveryManager.java:167)
        at com.arjuna.ats.jbossatx.jta.RecoveryManagerService.stop(RecoveryManagerService.java:58)
        at org.jboss.as.txn.service.ArjunaRecoveryManagerService.stop(ArjunaRecoveryManagerService.java:174)
        - locked <0x00000000d15f2ea0> (a org.jboss.as.txn.service.ArjunaRecoveryManagerService)
        at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:1671)
        at org.jboss.msc.service.ServiceControllerImpl$StopTask.execute(ServiceControllerImpl.java:1641)
        at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1438)
        at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
        at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1990)
        at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
        at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
        at java.lang.Thread.run(java.base@17.0.12/Thread.java:842)
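To check whether other dumps from the CI pipeline show the same blocked shutdown, the attached files can be scanned for threads whose stack contains the PeriodicRecovery.shutdown frame. The following is a minimal sketch, assuming HotSpot-style text dumps like the one above; the class ThreadDumpScan and the method threadsWithFrame are hypothetical names used for illustration and are not part of WildFly, Narayana, or the test suite.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for triaging text thread dumps; not part of any product code.
final class ThreadDumpScan {

    /**
     * Returns the names of threads whose stack contains the given frame
     * fragment, for example "PeriodicRecovery.shutdown".
     * Assumes each thread section starts with a header line of the form
     * "thread name" #id ... and that frames are lines starting with "at ".
     */
    static List<String> threadsWithFrame(String dump, String frameFragment) {
        List<String> matches = new ArrayList<>();
        String currentThread = null;
        boolean recorded = false;
        for (String line : dump.split("\\R")) {
            if (line.startsWith("\"")) {
                // Header line: the thread name is the text between the first two quotes.
                int closingQuote = line.indexOf('"', 1);
                currentThread = closingQuote > 0 ? line.substring(1, closingQuote) : null;
                recorded = false;
            } else if (currentThread != null && !recorded
                    && line.trim().startsWith("at ")
                    && line.contains(frameFragment)) {
                // Record each thread at most once, even if the frame repeats.
                matches.add(currentThread);
                recorded = true;
            }
        }
        return matches;
    }
}
```

Calling ThreadDumpScan.threadsWithFrame(dumpText, "PeriodicRecovery.shutdown") on the contents of a dump file lists the threads stuck in that frame, which makes it easier to compare occurrences across CI runs.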

This usually happens in scenarios with crashing XA transactions, such as:

   /**
    * 
    * @tpTestDetails Start two servers. Deploy InQueue and OutQueue to the first.
    * Configure HornetQ RA on the second server to connect to the first server. Send
    * messages to InQueue. Deploy MDB to the 2nd server which reads messages
    * from InQueue and sends them to OutQueue.
    * @tpProcedure <ul>
    * <li>start first server with deployed InQueue and OutQueue</li>
    * <li>start second server which has configured HornetQ RA to connect to first server</li>
    * <li>start producer which sends messages to InQueue</li>
    * <li>deploy MDB to 2nd server which reads messages from InQueue and sends to OutQueue</li>
    * <li>stop first server</li>
    * <li>undeploy mdb</li>
    * <li>stop second server</li>
    * </ul>
    * @tpPassCrit servers shut down within the 3 min timeout

Unfortunately, the failure is highly intermittent and there is no usable reproducer. (The CI test that hit it is org.jboss.qa.hornetq.test.remote.jca.RemoteJcaTestCase.testRemoteJcaShutdownJmsUndeployMdbAndThenMdbServer from the EAP QE messaging test suite.)
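The 3-minute pass criterion amounts to a bounded wait around the stop call. The following is a minimal sketch of that idea, assuming a hypothetical BoundedStop helper; it is not how the EAP QE test suite or the server implements the timeout, and the stop action passed in stands for whatever the harness uses to stop a server.

```java
import java.time.Duration;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical harness helper; the name and shape are assumptions for illustration.
final class BoundedStop {

    /**
     * Runs stopAction on a helper thread and reports whether it completed
     * within the timeout. Returns false on timeout or if the caller is interrupted.
     */
    static boolean stopsWithin(Runnable stopAction, Duration timeout) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-stop");
            t.setDaemon(true); // a stuck stop action must not keep the JVM alive
            return t;
        });
        Future<?> result = executor.submit(stopAction);
        try {
            result.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // A harness could capture a thread dump here, before killing the server.
            result.cancel(true);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("stop action failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With a timeout of Duration.ofMinutes(3), a false return value corresponds to the failure described in this issue: the stop call is still blocked when the timeout expires.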


Attachments:
node-2-thread-dump-before-kill-shutdown-sequence.txt (2025/02/11 9:30 AM, 51 kB, Miroslav Novak)
node-2-thread-dump-when-killed-shutdown-sequence.txt (2025/02/11 9:30 AM, 51 kB, Miroslav Novak)
server.log (2025/02/11 9:30 AM, 643 kB, Miroslav Novak)

is duplicated by

JBEAP-28830 NPE in race condition between a thread committing a transaction and another thread performing recovery

Resolved

Assignee: Manuel Finelli
Reporter: Miroslav Novak
Votes: 0
Watchers: 2
Created: 2025/02/11 9:29 AM
Updated: 2025/02/18 5:22 PM
Resolved: 2025/02/18 5:22 PM
