AMQ Broker / ENTMQBR-639

Lost large messages in cluster with JDBC storage


    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • Fix Version: AMQ 7.0.2.GA
    • Affects Version: A-MQ 7.0.0.ER18
    • Steps to reproduce:
      git clone git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git
      cd eap-tests-hornetq/scripts/
      git checkout 5725308a2c13e7653ee06339da64bb2054c6759b
      groovy -DEAP_VERSION=7.1.0.DR16 PrepareServers7.groovy
      export WORKSPACE=$PWD
      export JBOSS_HOME_1=$WORKSPACE/server1/jboss-eap
      export JBOSS_HOME_2=$WORKSPACE/server2/jboss-eap
      export JBOSS_HOME_3=$WORKSPACE/server3/jboss-eap
      export JBOSS_HOME_4=$WORKSPACE/server4/jboss-eap
      
      cd ../jboss-hornetq-testsuite/
      
      mvn clean test -Dtest=JGroupsClusterTestCase#testStopStartCluster -DfailIfNoTests=false -Deap=7x -Dprepare.param.DATABASE=oracle12c -Dprepare.param.JDBC_STORE=true  | tee log
      
    • Sprint: AMQ Sprint 3

      Scenario:

      • Two EAP servers are configured in a cluster. Both use JDBC persistence storage.
      • Two producers send a mix of normal and large messages; each producer sends to a different EAP server.
      • Two receivers receive messages, each from a different EAP server.
      • In the middle of the test, both EAP servers are stopped and started again.
      • After the test, it is checked whether all sent messages were received by the receivers.
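
      For context, the clustered JDBC setup in the scenario above corresponds to an Artemis database-store configuration roughly like the following. This is a minimal sketch for a standalone broker.xml (EAP configures the same store through its own subsystem); the connection URL, credentials, and table names are illustrative placeholders, not values from this report:

      ```xml
      <!-- Sketch of a broker.xml JDBC persistence section; all values are placeholders. -->
      <store>
        <database-store>
          <jdbc-driver-class-name>oracle.jdbc.driver.OracleDriver</jdbc-driver-class-name>
          <jdbc-connection-url>jdbc:oracle:thin:@db-host:1521:ORCL</jdbc-connection-url>
          <jdbc-user>artemis</jdbc-user>
          <jdbc-password>artemis</jdbc-password>
          <bindings-table-name>BINDINGS</bindings-table-name>
          <message-table-name>MESSAGES</message-table-name>
          <large-message-table-name>LARGE_MESSAGES</large-message-table-name>
        </database-store>
      </store>
      ```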

      Expectation: All sent messages are received by receivers.
      Reality: It sometimes happens that a few large messages are lost.

      Customer impact: If large messages are used in a cluster topology where the EAP servers use JDBC persistence storage, some messages may be lost.

      I found that the lost messages are ones being redistributed or redelivered to the second server. Server-A hands the message over to a cluster bridge, which is then responsible for delivering it to Server-B. When I tracked such a message in the server's trace logs, I saw that only the first packet of the large message is sent; I did not see any continuation packet. So I added an additional log statement [1] to the ClientProducerImpl.largeMessageSendServer method. The trace logs then showed that the body of the message has size 0. See [2].

      The same scenario passed 50 times with the file-based journal.

      [1]

      if (logger.isTraceEnabled()) {
        logger.tracef("Body size of the message %s is %d", msgI, bodySize);
      }
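
      To illustrate why a body size of 0 explains the missing continuation packets: a large message is sent as an initial packet followed by one continuation packet per body chunk, so a zero-size body yields no continuation packets at all. The sketch below is hypothetical (it is not the actual Artemis code); the chunk size is assumed to be 100 KiB for illustration:

      ```java
      // Hypothetical sketch of large-message chunking (not the real Artemis code).
      public class LargeMessageChunking {
          // Assumed chunk size for illustration (100 KiB).
          static final int CHUNK_SIZE = 100 * 1024;

          // Number of continuation packets a body of the given size would produce.
          static int continuationPackets(int bodySize) {
              if (bodySize <= 0) {
                  return 0; // zero-size body: no continuation packets are sent
              }
              return (bodySize + CHUNK_SIZE - 1) / CHUNK_SIZE; // ceiling division
          }

          public static void main(String[] args) {
              // Trace [2] reports _AMQ_LARGE_SIZE=204800 (200 KiB), so two
              // continuation packets would be expected for the real body size...
              System.out.println(continuationPackets(204800)); // 2
              // ...but the observed body size was 0, so none are sent:
              System.out.println(continuationPackets(0)); // 0
          }
      }
      ```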
      

      [2]

      09:35:44,314 TRACE [org.apache.activemq.artemis.core.client.impl.ClientProducerImpl] (Thread-5 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$3@1c3fdeb9-1117657917)) Body size of the message LargeServerMessage[messageID=1798,durable=true,userID=bd5ad501-1f84-11e7-9bd0-001b217d6db3,priority=4, timestamp=Wed Apr 12 09:34:23 EDT 2017,expiration=0, durable=true, address=jms.queue.testQueue0,properties=TypedProperties[__AMQ_CID=aa07c740-1f84-11e7-9bd0-001b217d6db3,count=285,color=RED,_AMQ_BRIDGE_DUP=[A80B 5B82 1F84 11E7 BB14 0015 178E 6E72 0000 0000 0000 0706),counter=616,_AMQ_DUPL_ID=4ecca8f2-c0dc-4adb-a1f0-9703d5a3d50d1492004063108,_AMQ_LARGE_SIZE=204800,_AMQ_ROUTE_TO=[0000 0000 0000 0016),bytesAsLongs(22]]]@2049207583 is 0
      

              Assignee: Martyn Taylor (mtaylor1@redhat.com) (Inactive)
              Reporter: Martyn Taylor (mtaylor1@redhat.com) (Inactive)
              Votes: 0
              Watchers: 4
