  1. JBoss Enterprise Application Platform
  2. JBEAP-3089

Failback fails in colocated HA topology with shared store and netty connectors

    • Bug
    • Resolution: Done
    • Blocker
    • 7.0.0.ER7
    • 7.0.0.ER4
    • ActiveMQ
    • Regression

      git clone git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git

      cd eap-tests-hornetq/scripts/

      git checkout refactoring_modules

      groovy -DEAP_VERSION=7.0.0.ER4 PrepareServers7.groovy

      export WORKSPACE=$PWD

      export JBOSS_HOME_1=$WORKSPACE/server1/jboss-eap

      export JBOSS_HOME_2=$WORKSPACE/server2/jboss-eap

      export JBOSS_HOME_3=$WORKSPACE/server3/jboss-eap

      export JBOSS_HOME_4=$WORKSPACE/server4/jboss-eap

      mkdir journal_A
      mkdir journal_B
      export JOURNAL_DIRECTORY_A=$WORKSPACE/journal_A
      export JOURNAL_DIRECTORY_B=$WORKSPACE/journal_B

      cd ../jboss-hornetq-testsuite/

      mvn clean test -Dtest=ColocatedClusterFailoverTestCase#testFailbackClientAckQueueNIO -DfailIfNoTests=false -Deap=7x | tee log
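
Before launching the Maven run, it may help to confirm that the directories exported in the steps above actually exist. The following is only a minimal sketch; the function name `check_repro_env` is hypothetical and is not part of the test suite.

```shell
# Hypothetical helper (not part of eap-tests-hornetq): check that each
# variable exported in the steps above points at an existing directory.
check_repro_env() {
  cre_status=0
  for cre_name in JBOSS_HOME_1 JBOSS_HOME_2 JBOSS_HOME_3 JBOSS_HOME_4 \
                  JOURNAL_DIRECTORY_A JOURNAL_DIRECTORY_B; do
    # Look up the value of the variable whose name is in cre_name.
    eval "cre_dir=\${$cre_name:-}"
    if [ -z "$cre_dir" ] || [ ! -d "$cre_dir" ]; then
      echo "missing or not a directory: $cre_name=$cre_dir" >&2
      cre_status=1
    fi
  done
  return "$cre_status"
}
```

Calling `check_repro_env` right before the `mvn` command returns a non-zero status and names the offending variable if any server or journal directory is absent.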

      We manually configured a colocated topology with a shared journal, as in EAP 6.
      The client is connected to server1 and sends messages to a queue. While it is sending, server1 is killed when the "processRoute" method is invoked in PostOfficeImpl on server1. The client successfully fails over to the backup. Then server1 is started again.

      The client then gets stuck, probably in the send() method.
      Server1: The live server obtains the live lock successfully, the backup obtains the backup lock successfully and is announced, and the cluster is connected. The log then starts filling with messages such as:
      13:53:33,730 WARN [org.apache.activemq.artemis.core.client] (Thread-0 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$2@70f35326-1682410692)) AMQ212006: Waiting 2,000 milliseconds before next retry. RetryInterval=2,000 and multiplier=1

      Server2: The live and backup servers are announced and obtain their locks. In some cases the log then starts filling with:
      13:47:40,718 WARN [org.apache.activemq.artemis.core.client] (Thread-20 (ActiveMQ-client-global-threads-785057483)) AMQ212006: Waiting 2,000 milliseconds before next retry. RetryInterval=2,000 and multiplier=1
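
To gauge how heavily a server log is being flooded by these warnings, the AMQ212006 lines can simply be counted. This is only a minimal sketch; the function name `count_retry_warnings` is hypothetical, and the log path in the usage note is an assumption about a default standalone layout.

```shell
# Hypothetical helper: print the number of lines in the given log file
# that contain the AMQ212006 retry warning quoted above.
count_retry_warnings() {
  grep -c 'AMQ212006' "$1"
}
```

For example, `count_retry_warnings "$JBOSS_HOME_1/standalone/log/server.log"` (path assumed) run twice a few seconds apart shows whether the count is still growing.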

      Here are thread dumps from client: [dump-client.txt] and from server1: [dump-server.txt]

      The configurations of both servers are attached.

        1. standalone-full-ha-2.xml
          31 kB
        2. standalone-full-ha.xml
          31 kB
        3. dump-client.txt
          33 kB
        4. dump-server.txt
          222 kB
        5. traces.zip
          2.36 MB
        6. logs_3089.zip
          4.20 MB

              mtaylor1@redhat.com Martyn Taylor (Inactive)
              okalman@redhat.com Ondřej Kalman (Inactive)