  1. JBoss Enterprise Application Platform
  2. JBEAP-29288

Arjuna can prevent the server from shutting down

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • None
    • 8.1.0.Beta
    • Server, Transactions

      In the messaging CI pipeline there are rare failures in which the server does not stop within the 3-minute timeout. The thread dumps (see attached files) suggest that ArjunaRecoveryManagerService is preventing the server from shutting down:

      "MSC service thread 1-3" #21 prio=5 os_prio=0 cpu=424.05ms elapsed=365.49s tid=0x00007fd720002a50 nid=0x19995 in Object.wait()  [0x00007fd7dcb84000]
         java.lang.Thread.State: WAITING (on object monitor)
              at java.lang.Object.wait(java.base@17.0.12/Native Method)
              - waiting on <no object reference available>
              at java.lang.Object.wait(java.base@17.0.12/Object.java:338)
              at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.shutdown(PeriodicRecovery.java:163)
              - locked <0x00000000d19998c0> (a java.lang.Object)
              at com.arjuna.ats.internal.arjuna.recovery.RecoveryManagerImple.stop(RecoveryManagerImple.java:152)
              at com.arjuna.ats.arjuna.recovery.RecoveryManager.terminate(RecoveryManager.java:186)
              - locked <0x00000000d1b45848> (a com.arjuna.ats.arjuna.recovery.RecoveryManager)
              at com.arjuna.ats.arjuna.recovery.RecoveryManager.terminate(RecoveryManager.java:167)
              at com.arjuna.ats.jbossatx.jta.RecoveryManagerService.stop(RecoveryManagerService.java:58)
              at org.jboss.as.txn.service.ArjunaRecoveryManagerService.stop(ArjunaRecoveryManagerService.java:174)
              - locked <0x00000000d15f2ea0> (a org.jboss.as.txn.service.ArjunaRecoveryManagerService)
              at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:1671)
              at org.jboss.msc.service.ServiceControllerImpl$StopTask.execute(ServiceControllerImpl.java:1641)
              at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1438)
              at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
              at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1990)
              at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
              at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
              at java.lang.Thread.run(java.base@17.0.12/Thread.java:842)
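
      The thread state above is WAITING rather than TIMED_WAITING, so the stop path is parked in an untimed Object.wait() and never returns if the expected notification does not arrive. The following is a minimal sketch of that general pattern with a bounded wait added. It is not the Narayana PeriodicRecovery code; the class ShutdownWaitSketch and its methods are hypothetical names used only for illustration.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical illustration of a shutdown path that waits on a monitor
 * for a worker to finish. Not taken from Narayana or WildFly sources.
 */
final class ShutdownWaitSketch {
    private final Object stateLock = new Object();
    private boolean workerFinished = false; // guarded by stateLock

    /** Called by the worker thread once its current pass has completed. */
    void markWorkerFinished() {
        synchronized (stateLock) {
            workerFinished = true;
            stateLock.notifyAll();
        }
    }

    /**
     * Waits for the worker for at most timeoutMillis.
     * Returns true if the worker finished, false on timeout or interrupt.
     */
    boolean awaitWorker(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (stateLock) {
            while (!workerFinished) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // give up instead of blocking the caller forever
                }
                try {
                    // wait(0) means "wait forever", so never pass less than 1 ms
                    stateLock.wait(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    return false;
                }
            }
            return true;
        }
    }
}
```

      With an untimed wait, a missed or never-sent notification blocks the calling thread indefinitely, which would be consistent with the symptom in the trace. Whether a timeout is an appropriate fix for the recovery manager is a separate question that this sketch does not answer.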
       

      This usually happens in scenarios with crashing XA transactions, such as the following test:

         /**
          * 
          * @tpTestDetails Start two servers. Deploy InQueue and OutQueue to first.
          * Configure HornetQ RA on second server to connect to first server. Send
          * messages to InQueue. Deploy MDB to 2nd server which reads messages
          * from InQueue and sends them to OutQueue.
          * @tpProcedure <ul>
          * <li>start first server with deployed InQueue and OutQueue</li>
          * <li>start second server which has configured HornetQ RA to connect to first server</li>
          * <li>start producer which sends messages to InQueue</li>
          * <li>deploy MDB to 2nd server which reads messages from InQueue and sends to OutQueue</li>
          * <li>stop first server</li>
          * <li>undeploy mdb</li>
          * <li>stop second server</li>
          * </ul>
          * @tpPassCrit servers shut down within the 3 min timeout
          */
      

      Unfortunately, the failure is highly intermittent and there is no usable reproducer. The CI test that hit it is org.jboss.qa.hornetq.test.remote.jca.RemoteJcaTestCase.testRemoteJcaShutdownJmsUndeployMdbAndThenMdbServer from the EAP QE messaging test suite.
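
      Because the failure is intermittent, it may help to scan collected thread dumps automatically for the blocking frame. Below is a rough sketch of such a scan, assuming dumps in the usual HotSpot text layout (a quoted thread name on the header line, followed by "at ..." frames). The class ThreadDumpScan and the method threadsWithFrame are hypothetical helpers, not part of any test suite mentioned here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for triaging thread dump text; an illustration only. */
final class ThreadDumpScan {

    /** Returns the names of threads whose stack contains the given frame fragment. */
    static List<String> threadsWithFrame(String dump, String frameFragment) {
        List<String> hits = new ArrayList<>();
        String currentThread = null;
        boolean alreadyMatched = false;
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // Header line: the thread name is the text between the first two quotes
                int end = line.indexOf('"', 1);
                currentThread = end > 0 ? line.substring(1, end) : line;
                alreadyMatched = false;
            } else if (currentThread != null && !alreadyMatched
                    && line.startsWith("at ") && line.contains(frameFragment)) {
                hits.add(currentThread);
                alreadyMatched = true; // report each thread at most once
            }
        }
        return hits;
    }
}
```

      For the dumps attached to this issue, a call such as threadsWithFrame(dumpText, "PeriodicRecovery.shutdown") would be expected to report the MSC service thread shown in the trace above.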

        1. node-2-thread-dump-before-kill-shutdown-sequence.txt
          51 kB
          Miroslav Novak
        2. node-2-thread-dump-when-killed-shutdown-sequence.txt
          51 kB
          Miroslav Novak
        3. server.log
          643 kB
          Miroslav Novak

              jfinelli@redhat.com Manuel Finelli
              mnovak1@redhat.com Miroslav Novak