  1. JBoss Enterprise Application Platform
  2. JBEAP-11082

Deadlock in replicated HA test case

    • Blocks Testing
    • Steps to reproduce:
      git clone http://github.com/rh-messaging/jboss-activemq-artemis
      cd jboss-activemq-artemis
      git checkout 1.5.5.jbossorg-002
      
      mvn clean install -Ptests -DfailIfNoTests=false -Drat.ignoreErrors=true -Dtest=ReplicatedLargeMessageWithDelayFailoverTest | tee log
      

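      Scenario: Two Artemis servers are configured as a replicated live-backup pair. Both servers are stopped while the initial synchronization is still in progress.

      Expectation: Both servers stop successfully.

      Reality: The backup hangs while stopping because of a deadlock [1]. The activation thread holds the ReplicationEndpoint monitor (in ReplicationEndpoint.stop) and waits for the journal's ReentrantReadWriteLock write lock (in JournalImpl.stop), while the InVMConnector thread holds that lock and waits for the ReplicationEndpoint monitor (in ReplicationEndpoint.registerJournal).

      Customer impact: Two Java threads can deadlock against each other. The affected server then hangs and cannot be stopped.

      The full thread dump is attached.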
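
      The lock-order inversion reported in [1] can be illustrated with a minimal, self-contained sketch. This is not Artemis code: the class name, the `endpoint` object and the `journalLock` field are hypothetical stand-ins for the ReplicationEndpoint monitor and the journal's ReentrantReadWriteLock. The stopping thread uses a timed `tryLock` only so that the sketch terminates instead of hanging as the real server does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockOrderInversionSketch {
    // Hypothetical stand-in for the ReplicationEndpoint monitor.
    private final Object endpoint = new Object();
    // Hypothetical stand-in for the journal's read/write lock.
    private final ReentrantReadWriteLock journalLock = new ReentrantReadWriteLock();

    /** Returns true when the stopping thread cannot obtain the write lock, i.e. an untimed lock() would deadlock. */
    boolean wouldDeadlock() throws InterruptedException {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        AtomicBoolean timedOut = new AtomicBoolean(false);

        // Mirrors the activation thread: monitor first, then the write lock.
        Thread stopper = new Thread(() -> {
            synchronized (endpoint) {
                try {
                    bothHoldFirstLock.countDown();
                    bothHoldFirstLock.await(); // wait until the handler holds the write lock
                    if (journalLock.writeLock().tryLock(200, TimeUnit.MILLISECONDS)) {
                        journalLock.writeLock().unlock();
                    } else {
                        timedOut.set(true); // a plain lock() would block forever here
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "stopper");

        // Mirrors the packet-handling thread: write lock first, then the monitor.
        Thread handler = new Thread(() -> {
            journalLock.writeLock().lock();
            try {
                bothHoldFirstLock.countDown();
                bothHoldFirstLock.await(); // wait until the stopper holds the monitor
                synchronized (endpoint) {
                    // Reached only after the stopper gives up and releases the monitor.
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                journalLock.writeLock().unlock();
            }
        }, "handler");

        stopper.start();
        handler.start();
        stopper.join();
        handler.join();
        return timedOut.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("would deadlock: " + new LockOrderInversionSketch().wouldDeadlock());
    }
}
```

      In the sketch the two threads take the same pair of locks in opposite order, which is the pattern visible in the dump. Acquiring both locks in one consistent order on every code path is the usual way to rule this out; whether that is the appropriate fix for Artemis is not established here.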

      [1]

      Found one Java-level deadlock:
      =============================
      "Thread-0 (org.apache.activemq.artemis.core.remoting.impl.invm.InVMConnector-3817308)":
        waiting to lock monitor 0x00877690 (object 0x404a3ea8, a org.apache.activemq.artemis.core.replication.ReplicationEndpoint),
        which is held by "AMQ119000: Activation for server ActiveMQServerImpl::ReplicatedLargeMessageWithDelayFailoverTest/backupServers"
      "AMQ119000: Activation for server ActiveMQServerImpl::ReplicatedLargeMessageWithDelayFailoverTest/backupServers":
        waiting for ownable synchronizer 0x404ca8b0, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
        which is held by "Thread-0 (org.apache.activemq.artemis.core.remoting.impl.invm.InVMConnector-3817308)"
      
      Java stack information for the threads listed above:
      ===================================================
      "Thread-0 (org.apache.activemq.artemis.core.remoting.impl.invm.InVMConnector-3817308)":
              at org.apache.activemq.artemis.core.replication.ReplicationEndpoint.registerJournal(ReplicationEndpoint.java:143)
              - waiting to lock <0x404a3ea8> (a org.apache.activemq.artemis.core.replication.ReplicationEndpoint)
              at org.apache.activemq.artemis.core.replication.ReplicationEndpoint.finishSynchronization(ReplicationEndpoint.java:342)
              at org.apache.activemq.artemis.core.replication.ReplicationEndpoint.handleStartReplicationSynchronization(ReplicationEndpoint.java:435)
              at org.apache.activemq.artemis.core.replication.ReplicationEndpoint.handlePacket(ReplicationEndpoint.java:195)
              at org.apache.activemq.artemis.tests.integration.cluster.util.BackupSyncDelay$ReplicationChannelHandler.handlePacket(BackupSyncDelay.java:183)
              - locked <0x404a3f18> (a org.apache.activemq.artemis.tests.integration.cluster.util.BackupSyncDelay$ReplicationChannelHandler)
              at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.handlePacket(ChannelImpl.java:623)
              at org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl.doBufferReceived(RemotingConnectionImpl.java:379)
              - locked <0x404a3f38> (a java.lang.Object)
              at org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl.bufferReceived(RemotingConnectionImpl.java:362)
              at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl$DelegatingBufferHandler.bufferReceived(ClientSessionFactoryImpl.java:1143)
              at org.apache.activemq.artemis.core.remoting.impl.invm.InVMConnection$1.run(InVMConnection.java:196)
              at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:101)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
      "AMQ119000: Activation for server ActiveMQServerImpl::ReplicatedLargeMessageWithDelayFailoverTest/backupServers":
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x404ca8b0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
              at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
              at org.apache.activemq.artemis.core.journal.impl.JournalImpl.stop(JournalImpl.java:2332)
              - locked <0x404ca060> (a org.apache.activemq.artemis.core.journal.impl.JournalImpl)
              at org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager.stop(JournalStorageManager.java:244)
              - locked <0x404c9c90> (a org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager)
              at org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager.stop(JournalStorageManager.java:181)
              at org.apache.activemq.artemis.core.replication.ReplicationEndpoint.stop(ReplicationEndpoint.java:319)
              - locked <0x404a3ea8> (a org.apache.activemq.artemis.core.replication.ReplicationEndpoint)
              at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stopComponent(ActiveMQServerImpl.java:1110)
              at org.apache.activemq.artemis.core.server.impl.SharedNothingBackupActivation.run(SharedNothingBackupActivation.java:252)
              at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$ActivationThread.run(ActiveMQServerImpl.java:2491)
      
      Found 1 deadlock.
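
      A report like the one above can also be obtained from inside the JVM, which may help a test harness fail fast instead of hanging. The following is a sketch using the standard java.lang.management API; the class and method names (`DeadlockProbe`, `describeDeadlocks`) are hypothetical and not part of Artemis or its test suite. `ThreadMXBean.findDeadlockedThreads()` covers both object monitors and ownable synchronizers such as ReentrantReadWriteLock, so it would see a mixed cycle like this one.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class DeadlockProbe {
    /** Returns one line per deadlocked thread, or an empty string when no deadlock is found. */
    static String describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Returns null when no threads are deadlocked.
        long[] ids = mx.findDeadlockedThreads();
        if (ids == null) {
            return "";
        }
        StringBuilder report = new StringBuilder();
        // Request locked monitors and locked synchronizers for each thread.
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // the thread terminated in the meantime
            }
            report.append('"').append(info.getThreadName())
                  .append("\" waits for ").append(info.getLockName())
                  .append(" held by \"").append(info.getLockOwnerName())
                  .append("\"\n");
        }
        return report.toString();
    }

    public static void main(String[] args) {
        String report = describeDeadlocks();
        System.out.println(report.isEmpty() ? "No deadlock found." : report);
    }
}
```

      A test could call `describeDeadlocks()` from a watchdog thread after a timeout and fail with the returned text when it is not empty.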
      

        1. threaddump
          35 kB
          Erich Duda
        2. threaddump-jbossorg-004
          41 kB
          Erich Duda

              mtaylor1@redhat.com Martyn Taylor (Inactive)
              eduda_jira Erich Duda (Inactive)