  1. JBoss Enterprise Application Platform
  2. JBEAP-10462

OOM on client connected to divertedQueue


    • Type: Bug
    • Resolution: Done
    • Priority: Blocker
    • 7.1.0.ER3
    • 7.1.0.DR16, 7.1.0.DR17, 7.1.0.DR18, 7.1.0.ER1
    • ActiveMQ
    • None
    • Regression
    • Steps to Reproduce:

      git clone git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git

      cd eap-tests-hornetq/scripts/

      git checkout master

      groovy -DEAP_VERSION=7.1.0.DR16 PrepareServers7.groovy

      export WORKSPACE=$PWD

      export JBOSS_HOME_1=$WORKSPACE/server1/jboss-eap

      export JBOSS_HOME_2=$WORKSPACE/server2/jboss-eap

      export JBOSS_HOME_3=$WORKSPACE/server3/jboss-eap

      export JBOSS_HOME_4=$WORKSPACE/server4/jboss-eap

      cd ../jboss-hornetq-testsuite/

      bash ./repeat_test_until_fail.sh ReplicatedDedicatedFailoverTestCase#testFailoverWithDivertsTransAckQueueOnlyMapMessagesShutdown

    • AMQ Sprint 3

      Customer impact: Customers won't be able to use diverts in an HA topology, because clients may intermittently fail with OOM.

      We have the following scenario:

      • start two nodes in a dedicated cluster topology, with a divert from testTopic to divertQueue
      • start sending mixed messages (large, regular, and all the different message types) to testTopic on node-1 while receiving them from testTopic on node-1
      • kill node-1 while sending and receiving are in progress
      • the clients fail over to the backup and continue sending and receiving messages
      • stop the producer and the consumer
      • start a receiver on divertQueue (also on the backup) and wait for it to finish
      • verify the messages

      Our problem is with the receiver on divertQueue. Sometimes the client fails with an OOM error.

      Exception in thread "Thread-37" java.lang.OutOfMemoryError: Java heap space
      	at org.apache.activemq.artemis.core.buffers.impl.ChannelBufferWrapper.readSimpleStringInternal(ChannelBufferWrapper.java:92)
      	at org.apache.activemq.artemis.core.buffers.impl.ChannelBufferWrapper.readSimpleString(ChannelBufferWrapper.java:87)
      	at org.apache.activemq.artemis.utils.TypedProperties$StringValue.<init>(TypedProperties.java:858)
      	at org.apache.activemq.artemis.utils.TypedProperties$StringValue.<init>(TypedProperties.java:849)
      	at org.apache.activemq.artemis.utils.TypedProperties.decode(TypedProperties.java:397)
      	at org.apache.activemq.artemis.reader.MapMessageUtil.readBodyMap(MapMessageUtil.java:46)
      	at org.apache.activemq.artemis.jms.client.ActiveMQMapMessage.doBeforeReceive(ActiveMQMapMessage.java:344)
      	at org.apache.activemq.artemis.jms.client.ActiveMQMessageConsumer.getMessage(ActiveMQMessageConsumer.java:224)
      	at org.apache.activemq.artemis.jms.client.ActiveMQMessageConsumer.receive(ActiveMQMessageConsumer.java:132)
      	at org.jboss.qa.hornetq.apps.clients.Receiver11.receiveMessage(Receiver11.java:140)
      	at org.jboss.qa.hornetq.apps.clients.ReceiverTransAck.run(ReceiverTransAck.java:83)
      

      From what I found, buffer.readInt() in ChannelBufferWrapper returns a huge value, so the client tries to create a byte array bigger than the max heap size allows. In my case buffer.readInt() returned 2046851584.
      To get this value I used the following Byteman rule:

      RULE print integer
      CLASS org.apache.activemq.artemis.core.buffers.impl.ChannelBufferWrapper
      METHOD readSimpleStringInternal
      AFTER WRITE $len
      BIND myLen:int=$len
      IF myLen > 2048
      DO System.out.println("Len size is: " + myLen);
      ENDRULE
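
      A note on the value itself: 2046851584 is 0x7A007A00, i.e. the bytes 7A 00 7A 00. If string characters are laid out as two bytes each with the low byte first (an assumption about the wire format, not something verified here), those four bytes would be the characters "zz". That would be consistent with the length being read from the middle of string data rather than from a real length prefix, but it does not prove it. A minimal sketch of the arithmetic, using only java.nio (the class and method names are hypothetical and not part of Artemis):

      ```java
      import java.nio.ByteBuffer;
      import java.nio.charset.StandardCharsets;

      public class LengthValueInspector {

          /** Splits a 32-bit value into its four bytes, most significant byte first. */
          public static byte[] toBigEndianBytes(int value) {
              // ByteBuffer uses big-endian byte order by default.
              return ByteBuffer.allocate(4).putInt(value).array();
          }

          /** Formats bytes as space-separated uppercase hex. */
          public static String toHex(byte[] bytes) {
              StringBuilder sb = new StringBuilder();
              for (byte b : bytes) {
                  if (sb.length() > 0) {
                      sb.append(' ');
                  }
                  sb.append(String.format("%02X", b & 0xFF));
              }
              return sb.toString();
          }

          /** Interprets the bytes as 2-byte characters, low byte first (assumed layout). */
          public static String asLowByteFirstChars(byte[] bytes) {
              return new String(bytes, StandardCharsets.UTF_16LE);
          }

          public static void main(String[] args) {
              // The value that buffer.readInt() returned in the failing run.
              byte[] raw = toBigEndianBytes(2046851584);
              System.out.println(toHex(raw));               // prints "7A 00 7A 00"
              System.out.println(asLowByteFirstChars(raw)); // prints "zz"
          }
      }
      ```

      If the test messages contain long runs of a repeated character, checking whether that character is 'z' would be a cheap way to confirm or rule out this reading.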
      

      In this test the maximum body size of a large message is 200KB. All clients use transacted sessions. Also, all messages that are supposed to be consumed from divertedQueue were already consumed from the original topic, where there is no problem with OOM.
      In the attached zip file, node-1 is the live server and node-2 is the backup.
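
      To make the failure mode concrete: with a 200KB body limit, a length of 2046851584 is about 10,000 times larger than anything the test can legitimately produce, so it could be rejected before any array is allocated. The following is a minimal sketch of that idea on a plain java.nio.ByteBuffer. It is not the Artemis ChannelBufferWrapper code and not a proposed fix; the class and method names are hypothetical.

      ```java
      import java.nio.ByteBuffer;

      public class BoundedLengthReader {

          /**
           * Reads a 4-byte length prefix followed by that many bytes.
           * A negative length, or one larger than the bytes still available,
           * is rejected before the array is allocated.
           */
          public static byte[] readLengthPrefixed(ByteBuffer buffer) {
              int len = buffer.getInt();
              if (len < 0 || len > buffer.remaining()) {
                  throw new IllegalStateException(
                      "Implausible length " + len + ", only " + buffer.remaining() + " bytes remain");
              }
              byte[] data = new byte[len];
              buffer.get(data);
              return data;
          }

          public static void main(String[] args) {
              // Well-formed frame: length 2, then two payload bytes.
              ByteBuffer good = ByteBuffer.wrap(new byte[] {0, 0, 0, 2, 0x7A, 0x00});
              System.out.println(readLengthPrefixed(good).length);

              // Misaligned frame: the first four bytes are 0x7A007A00 == 2046851584.
              ByteBuffer bad = ByteBuffer.wrap(new byte[] {0x7A, 0x00, 0x7A, 0x00, 0x01});
              try {
                  readLengthPrefixed(bad);
              } catch (IllegalStateException e) {
                  System.out.println(e.getMessage());
              }
          }
      }
      ```

      A check like this would turn the OOM into an immediate, descriptive error on the client, which makes the corrupted or misread message easier to spot in the logs.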

      This seems to be a regression against the DR15 build, but we do not have a 100% reliable reproducer.

            gaohoward Howard Gao
            okalman@redhat.com Ondřej Kalman (Inactive)