  1. JBoss Enterprise Application Platform
  2. JBEAP-14175

Core bridge does not reconnect when target node is restarted


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • None
    • 7.2.0.GA
    • JMS
    • Steps to reproduce:
      git clone git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git
      cd eap-tests-hornetq/scripts/
      groovy -DEAP_ZIP_URL=https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/early-testing-messaging-prepare/136//artifact/jboss-eap.zip PrepareServers7.groovy
      export WORKSPACE=$PWD
      export JBOSS_HOME_1=$WORKSPACE/server1/jboss-eap
      export JBOSS_HOME_2=$WORKSPACE/server2/jboss-eap
      export JBOSS_HOME_3=$WORKSPACE/server3/jboss-eap
      export JBOSS_HOME_4=$WORKSPACE/server4/jboss-eap
      
      cd ../jboss-hornetq-testsuite/
      
      mvn clean test -Dtest=Lodh4TestCase#testFailOfOneServer -Deap7.org.jboss.qa.hornetq.apps.clients.version=7.1517906210-SNAPSHOT -DfailIfNoTests=false -Deap=7x | tee log
      

      There is a regression in the core bridge reconnection scenario. It was introduced by the addition of line [1] to BridgeImpl.

      Test scenario:

      • Start server1 and server2, with a core bridge deployed on server1
      • Cleanly shut down server2
      • Start server2 again; the core bridge is expected to reconnect

      Result: The core bridge does not reconnect. After server2 is cleanly shut down, a NullPointerException (NPE) is logged on server1:

      12:05:37,314 WARN  [org.apache.activemq.artemis.core.server] (Thread-0 (ActiveMQ-client-global-threads)) AMQ222095: Connection failed with failedOver=false
      12:05:37,316 ERROR [org.apache.activemq.artemis.core.client] (Thread-0 (ActiveMQ-client-global-threads)) AMQ214002: Failed to execute failure listener: java.lang.NullPointerException
      	at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) [rt.jar:1.8.0_131]
      	at org.apache.activemq.artemis.core.client.impl.Topology.removeMember(Topology.java:296) [artemis-core-client-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.notifyNodeDown(ServerLocatorImpl.java:1426) [artemis-core-client-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.fail(BridgeImpl.java:782) [artemis-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.connectionFailed(BridgeImpl.java:660) [artemis-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.callSessionFailureListeners(ClientSessionFactoryImpl.java:701) [artemis-core-client-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.failoverOrReconnect(ClientSessionFactoryImpl.java:637) [artemis-core-client-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.handleConnectionFailure(ClientSessionFactoryImpl.java:504) [artemis-core-client-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.handleConnectionFailure(ClientSessionFactoryImpl.java:497) [artemis-core-client-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.access$100(ClientSessionFactoryImpl.java:72) [artemis-core-client-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl$1.run(ClientSessionFactoryImpl.java:360) [artemis-core-client-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66) [artemis-commons-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
      	at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
      

      Investigation showed that line [1] is called with targetNodeID=null: the variable is not initialized after the bridge connects to the target node (server2). The resulting NPE kills the reconnection logic. When I removed line [1], the core bridge reconnected successfully.

      [1] https://github.com/apache/activemq-artemis/blob/b66d0f7ac40001cce14ca7146e74720504ff9eb1/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/cluster/impl/BridgeImpl.java#L782
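
      The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the Artemis code: the class, the `members` map, and the `fail`, `removeMember`, and `callFailureListener` methods are hypothetical stand-ins that only mirror the names in the stack trace. It relies on one certain fact, that `ConcurrentHashMap.get(null)` throws a NullPointerException, and assumes that the reconnect is scheduled after the topology notification inside the same listener. The null guard is one possible mitigation shown for comparison, not necessarily the fix applied upstream.

```java
import java.util.concurrent.ConcurrentHashMap;

// Simplified model of the failure path; all names are hypothetical stand-ins.
class BridgeNullNodeIdSketch {

    // Stand-in for the topology member map. ConcurrentHashMap rejects null keys.
    static final ConcurrentHashMap<String, String> members = new ConcurrentHashMap<>();

    // Set when the (modelled) reconnect step is reached.
    static boolean reconnectScheduled = false;

    // Mirrors the lookup seen in the stack trace: get() with a null key throws NPE.
    static void removeMember(String nodeId) {
        if (members.get(nodeId) != null) {
            members.remove(nodeId);
        }
    }

    // Modelled failure handler: notify the topology first, then schedule a reconnect.
    static void fail(String targetNodeID, boolean guardNull) {
        if (!guardNull || targetNodeID != null) {
            removeMember(targetNodeID); // throws NPE when targetNodeID is null
        }
        reconnectScheduled = true; // never reached if the line above throws
    }

    // Modelled caller: it catches and logs listener exceptions, so the failure
    // is visible only as a log line while the reconnect silently never happens.
    static boolean callFailureListener(String targetNodeID, boolean guardNull) {
        reconnectScheduled = false;
        try {
            fail(targetNodeID, guardNull);
        } catch (Throwable t) {
            System.out.println("Failed to execute failure listener: " + t);
        }
        return reconnectScheduled;
    }

    public static void main(String[] args) {
        System.out.println("unguarded, null node ID, reconnect scheduled: "
                + callFailureListener(null, false));
        System.out.println("guarded, null node ID, reconnect scheduled: "
                + callFailureListener(null, true));
    }
}
```

      In this model the unguarded call with a null node ID never reaches the reconnect step, which matches the observation that removing line [1] restores reconnection.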

              mtaylor1@redhat.com Martyn Taylor (Inactive)
              mnovak1@redhat.com Miroslav Novak