AMQ Broker / ENTMQBR-2148

Producer-window-size causes queue to become unresponsive


    • Type: Bug
    • Resolution: Won't Do
    • Priority: Major

      git clone git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git
      cd eap-tests-hornetq/scripts/
      git checkout master
      groovy -DEAP_VERSION=7.1.0.CR3 PrepareServers7.groovy
      export WORKSPACE=$PWD
      export JBOSS_HOME_1=$WORKSPACE/server1/jboss-eap
      export JBOSS_HOME_2=$WORKSPACE/server2/jboss-eap
      export JBOSS_HOME_3=$WORKSPACE/server3/jboss-eap
      export JBOSS_HOME_4=$WORKSPACE/server4/jboss-eap

      cd ../jboss-hornetq-testsuite/

      mvn clean test -Dtest=ClusterTestRedistributionToExhaustedServerTestCase#testOOMWhenTargetServerRespondsSlowly -DfailIfNoTests=false -Deap=7x | tee log


      EAP issue https://issues.jboss.org/browse/JBEAP-13599 describes an OutOfMemory exception related to cluster communication and the resend cache.
      The problem can be solved by setting the producer-window-size property, which was introduced in PR https://github.com/rh-messaging/jboss-activemq-artemis/pull/247.

      After upgrading Artemis to version 2.6.3, the fix no longer works.

      Scenario

      There are two Artemis servers in a cluster.
      Server 1 has messages in a queue but no consumer (source server).
      Server 2 has a consumer (target server).
      Because the target server has a consumer, messages are redistributed from the source server to the target server.
      The target server responds more slowly than the source server sends messages. This may be caused by slow I/O operations or by an exhausted CPU.

      Expectation: All messages are redistributed from the source server to the target server and consumed by the consumer.

      Reality: Some messages are lost during redistribution.

      Without producer-window-size, the scenario ends with an OutOfMemory exception on Server 1.
      With producer-window-size set to 50000 (a patch that modifies the test to use this property is attached), the test ends with undelivered messages.
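
For readers who want to try the same setting outside the attached test patch, the following is a minimal sketch of how the value might be applied through the EAP management CLI. It assumes that the cluster connection exposes a producer-window-size attribute, that the messaging server is named "default", and that the cluster connection is named "my-cluster". None of these names come from this report, and the helper pws_cli_command is hypothetical.

```shell
#!/bin/sh
# Build the management CLI operation that sets producer-window-size.
# Assumptions (not taken from the report): messaging server "default",
# cluster connection "my-cluster" unless a second argument is given.
pws_cli_command() {
    window_size=$1
    cluster=${2:-my-cluster}
    printf '/subsystem=messaging-activemq/server=default/cluster-connection=%s:write-attribute(name=producer-window-size,value=%s)\n' \
        "$cluster" "$window_size"
}

# Possible usage against a running source server (not executed here):
#   "$JBOSS_HOME_1/bin/jboss-cli.sh" --connect --command="$(pws_cli_command 50000)"
```

The function only prints the operation, so it can be inspected before being passed to jboss-cli.sh. A server reload may be needed for the change to take effect.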

      After investigation, it seems that producer-window-size (p-w-s) exhausts the queue:

      • I can find the following message in the test log:

        WARN [org.apache.activemq.artemis.core.server] (Thread-0 (-scheduled-threads)) AMQ224081: The component QueueImpl[name=jms.queue.testQueue, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=6fd86bc3-ecb5-11e8-8048-e4b3187cd715], temp=false]@5112cd78 is not responsive

      • the thread dump shows that almost all threads are waiting (the thread dump is also attached)
      • p-w-s itself seems to be working, because I can log block requests (using some additional logging)
      • sometimes the test scenario ends with an OutOfMemory exception
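
The two checks above (the AMQ224081 warning and the waiting threads) can be repeated with a couple of small shell helpers. This is only a sketch: the function names are hypothetical, and the thread dump is assumed to be in jstack-style format with `java.lang.Thread.State:` lines, which this report does not confirm.

```shell
#!/bin/sh
# Count "component is not responsive" warnings (AMQ224081) in a log file.
count_unresponsive() {
    grep -c 'AMQ224081' "$1"
}

# Tally thread states in a jstack-style thread dump, most frequent first.
# Assumes each thread carries a "java.lang.Thread.State: <STATE>" line.
thread_state_summary() {
    grep 'java.lang.Thread.State:' "$1" \
        | sed 's/.*java.lang.Thread.State: *//' \
        | awk '{ print $1 }' \
        | sort | uniq -c | sort -rn
}
```

For example, `count_unresponsive log` reads the file written by `tee log` in the reproduction steps, and `thread_state_summary <dump-file>` prints one line per state with its count.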

      Another observation, which may be related:
      If the Byteman rule (simulating a slower response) is removed and the number of messages is set to 30000 (instead of 120000):

      • the test scenario without p-w-s ends successfully
      • the test scenario with p-w-s ends with an OutOfMemory exception, which suggests that p-w-s is blocking a lot of parallel threads

              rh-ee-ataylor Andy Taylor
              jondruse@redhat.com Jiri Ondrusek