Uploaded image for project: 'WildFly'
  1. WildFly
  2. WFLY-8492

Cannot cleanly shutdown server disconnected inbound/outbound connections

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: Blocker Blocker
    • 11.0.0.Beta1
    • None
    • JMS
    • None
    • Hide

      Steps to reproduce:

      git clone git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git
      cd eap-tests-hornetq/scripts/
      git checkout master
      
      groovy -DEAP_VERSION=7.1.0.DR15 PrepareServers7.groovy
      export WORKSPACE=$PWD
      export JBOSS_HOME_1=$WORKSPACE/server1/jboss-eap
      export JBOSS_HOME_2=$WORKSPACE/server2/jboss-eap
      export JBOSS_HOME_3=$WORKSPACE/server3/jboss-eap
      export JBOSS_HOME_4=$WORKSPACE/server4/jboss-eap
      
      cd ../jboss-hornetq-testsuite/
      
      mvn clean test -Dtest=RemoteJcaTestCase#testRemoteJcaShutdownJmsAndThenMdbServer -DfailIfNoTests=false -Deap=7x | tee log
      
      
      Show
      Steps to reproduce: git clone git: //git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git cd eap-tests-hornetq/scripts/ git checkout master groovy -DEAP_VERSION=7.1.0.DR15 PrepareServers7.groovy export WORKSPACE=$PWD export JBOSS_HOME_1=$WORKSPACE/server1/jboss-eap export JBOSS_HOME_2=$WORKSPACE/server2/jboss-eap export JBOSS_HOME_3=$WORKSPACE/server3/jboss-eap export JBOSS_HOME_4=$WORKSPACE/server4/jboss-eap cd ../jboss-hornetq-testsuite/ mvn clean test -Dtest=RemoteJcaTestCase#testRemoteJcaShutdownJmsAndThenMdbServer -DfailIfNoTests= false -Deap=7x | tee log

      Customer impact: Server will not be possible to cleanly shutdown and must be killed. No simple management operation command which EAP provides can be used. Similar issue was reported by customers/users in the past and had to be solved by support.

      Test Scenario:

      • Start server 1 with queues InQueue and OutQueue
      • Start server 2 with MDB consuming messages from InQueue and for each message sending 1 new message to OutQueue in XA transaction
      • Send messages to InQueue to server 1
      • When MDB is processing messages then cleanly shutdown server 1
      • Cleanly shutdown server 2 with MDB

      Result:
      Server 2 does not shutdown.

      Expected Result:
      Server 2 with MDB will shutdown fast ( < 1 min).

      Investigation:
      Thread dump (attached) on server 2 shows types of waiting threads:

      "Thread-6 (ActiveMQ-client-global-threads-304296137)" #193 daemon prio=5 os_prio=0 tid=0x00007fc63c6d4000 nid=0x924 waiting on condition [0x00007fc5d6e0c000]
         java.lang.Thread.State: WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x00000000fda5f5f0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
              at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.waitForFailOver(ChannelImpl.java:250)
              at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:356)
              - locked <0x00000000fda5f618> (a java.lang.Object)
              at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:318)
              at org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext.xaEnd(ActiveMQSessionContext.java:383)
              at org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.end(ClientSessionImpl.java:1173)
              at org.apache.activemq.artemis.ra.ActiveMQRAXAResource.end(ActiveMQRAXAResource.java:112)
              at org.apache.activemq.artemis.service.extensions.xa.ActiveMQXAResourceWrapperImpl.end(ActiveMQXAResourceWrapperImpl.java:81)
              at org.jboss.jca.core.tx.jbossts.XAResourceWrapperImpl.end(XAResourceWrapperImpl.java:118)
              at org.jboss.jca.core.tx.jbossts.XAResourceWrapperStatImpl.end(XAResourceWrapperStatImpl.java:96)
              at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelAbort(XAResourceRecord.java:337)
              at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:3023)
              at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:3002)
              at com.arjuna.ats.arjuna.coordinator.BasicAction.Abort(BasicAction.java:1674)
              - locked <0x00000000fdb54128> (a com.arjuna.ats.internal.jta.transaction.arjunacore.AtomicAction)
              at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.cancel(TwoPhaseCoordinator.java:124)
              at com.arjuna.ats.arjuna.AtomicAction.abort(AtomicAction.java:186)
              at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.rollbackAndDisassociate(TransactionImple.java:1298)
              at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.rollback(BaseTransaction.java:143)
              at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.rollback(BaseTransactionManagerDelegate.java:134)
              at org.wildfly.transaction.client.LocalTransaction.rollbackAndDissociate(LocalTransaction.java:99)
              at org.wildfly.transaction.client.ContextTransactionManager.rollback(ContextTransactionManager.java:73)
              at org.jboss.as.ejb3.inflow.MessageEndpointInvocationHandler.afterDelivery(MessageEndpointInvocationHandler.java:69)
              at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:497)
              at org.jboss.as.ejb3.inflow.AbstractInvocationHandler.handle(AbstractInvocationHandler.java:60)
              at org.jboss.as.ejb3.inflow.MessageEndpointInvocationHandler.doInvoke(MessageEndpointInvocationHandler.java:135)
              at org.jboss.as.ejb3.inflow.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:73)
              at org.jboss.qa.hornetq.apps.mdb.MdbWithRemoteOutQueueToContaniner1$$$endpoint1.afterDelivery(Unknown Source)
              at org.apache.activemq.artemis.ra.inflow.ActiveMQMessageHandler.onMessage(ActiveMQMessageHandler.java:314)
              at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1001)
              at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:49)
              at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1124)
              at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:101)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
      

      and:

      "Thread-21 (ActiveMQ-client-global-threads-304296137)" #196 daemon prio=5 os_prio=0 tid=0x00007fc62c040800 nid=0x928 waiting on condition [0x00007fc5d6a08000]
         java.lang.Thread.State: TIMED_WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x00000000fd4bc8a8> (a java.util.concurrent.CountDownLatch$Sync)
              at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
              at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
              at org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQClientProtocolManager.waitOnLatch(ActiveMQClientProtocolManager.java:136)
              at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.getConnectionWithRetry(ClientSessionFactoryImpl.java:819)
              at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.reconnectSessions(ClientSessionFactoryImpl.java:744)
              at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.failoverOrReconnect(ClientSessionFactoryImpl.java:614)
              at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.handleConnectionFailure(ClientSessionFactoryImpl.java:504)
              at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.access$500(ClientSessionFactoryImpl.java:72)
              at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl$DelegatingFailureListener.connectionFailed(ClientSessionFactoryImpl.java:1165)
              at org.apache.activemq.artemis.spi.core.protocol.AbstractRemotingConnection.callFailureListeners(AbstractRemotingConnection.java:67)
              at org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl.fail(RemotingConnectionImpl.java:207)
              at org.apache.activemq.artemis.spi.core.protocol.AbstractRemotingConnection.fail(AbstractRemotingConnection.java:215)
              at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl$CloseRunnable.run(ClientSessionFactoryImpl.java:996)
              at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:101)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
      

      Those threads should be notified of server shutdown and "gracefully" stop do what they're doing. In case 1st thread:
      at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.waitForFailOver(ChannelImpl.java:250)
      there is problem in code:

      failoverCondition.await();
      

      This should be silently interrupted when component is stopping.

      2nd thread has problem that while() cycle does not finish when component is stopping:

            while (clientProtocolManager.isAlive()) {
      ...
                     try {
                        if (clientProtocolManager.waitOnLatch(interval)) { 
      ...
               }
      

              jmesnil1@redhat.com Jeff Mesnil
              jmesnil1@redhat.com Jeff Mesnil
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Created:
                Updated:
                Resolved: