Type: Bug
Resolution: Done
Priority: Blocker
Component/s: Artemis
Labels:
- feature-branch-blocker

Steps to Reproduce:
git clone git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git
cd eap-tests-hornetq/scripts/
groovy -DEAP_ZIP_URL=https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/EAP7/view/EAP7-JMS/view/early-testing/view/tooling/job/early-testing-messaging-prepare/257/artifact/jboss-eap.zip PrepareServers7.groovy
export WORKSPACE=$PWD
export JBOSS_HOME_1=$WORKSPACE/server1/jboss-eap
export JBOSS_HOME_2=$WORKSPACE/server2/jboss-eap
export JBOSS_HOME_3=$WORKSPACE/server3/jboss-eap
export JBOSS_HOME_4=$WORKSPACE/server4/jboss-eap
cd ../jboss-hornetq-testsuite/
mvn clean test -Dtest=NetworkFailuresHornetQCoreBridges#testNetworkFailureSmallMessages -DfailIfNoTests=false -Deap=7x -Deap7.org.jboss.qa.hornetq.apps.clients.version=7.1521531853-SNAPSHOT | tee log

or

mvn clean test -Dtest=Lodh4TestCase#testFailOfOneServer -Deap7.org.jboss.qa.hornetq.apps.clients.version=7.1521531853-SNAPSHOT -DfailIfNoTests=false -Deap=7x | tee log

or

mvn clean test -Dtest=DedicatedFailoverCoreBridges#testFailbackKillWithBridgeWithStaticNIOConnectors -Deap7.org.jboss.qa.hornetq.apps.clients.version=7.1521531853-SNAPSHOT -DfailIfNoTests=false -Deap=7x | tee log

Target Release: 7.2.0.GA
CDW blocker:
CDW devel_ack:

Scenario

Two Artemis brokers are configured to form a cluster.
A producer sends messages to broker 1 and a receiver receives messages from broker 2.
Between the brokers there is a proxy which simulates network failure.
The proxy is stopped and restarted several times to simulate network failures.
The test expects that all messages sent to broker 1 will be received by the receiver from broker 2 (despite the network failures).

Reality: After the proxy is stopped and restarted, the cluster is not able to form again. Each broker tries to reconnect to the other, but neither succeeds.

Customer scenario: A messaging cluster is not able to recover after network failures.

Investigation of issue

I investigated why the brokers are not able to reconnect and found that every time they try to reconnect, they give up because there is no topology record for the nodeId they are trying to connect to. So the reconnection attempt ends here [1].

I compared the behavior with Artemis 1.x and found that Artemis 2.x removes the topology member when a connection failure is detected, but Artemis 1.x doesn't. Commenting out the line [2] fixed the issue. This line is not present in 1.x.

[1] https://github.com/apache/activemq-artemis/blob/b66d0f7ac40001cce14ca7146e74720504ff9eb1/artemis-core-client/src/main/java/org/apache/activemq/artemis/core/client/impl/ServerLocatorImpl.java#L659
[2] https://github.com/apache/activemq-artemis/blob/b66d0f7ac40001cce14ca7146e74720504ff9eb1/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/cluster/impl/BridgeImpl.java#L782
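The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Artemis implementation: `Topology`, `Bridge`, `removeMemberOnFailure`, the node id and the connector address are all hypothetical names chosen for illustration, and the network is assumed to be available again by the time the reconnect is attempted.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these names are hypothetical and are not Artemis classes.
public class TopologyReconnectSketch {

    /** Minimal stand-in for a cluster topology: nodeId -> connector address. */
    static final class Topology {
        private final Map<String, String> members = new HashMap<>();

        void addMember(String nodeId, String connector) {
            members.put(nodeId, connector);
        }

        void removeMember(String nodeId) {
            members.remove(nodeId);
        }

        String connectorFor(String nodeId) {
            return members.get(nodeId);
        }
    }

    /** Minimal stand-in for a bridge that targets a single cluster node. */
    static final class Bridge {
        private final Topology topology;
        private final String targetNodeId;
        private final boolean removeMemberOnFailure;

        Bridge(Topology topology, String targetNodeId, boolean removeMemberOnFailure) {
            this.topology = topology;
            this.targetNodeId = targetNodeId;
            this.removeMemberOnFailure = removeMemberOnFailure;
        }

        /** Called when the connection to the target node is lost. */
        void onConnectionFailure() {
            if (removeMemberOnFailure) {
                // Behavior the report attributes to 2.x: the member is dropped.
                topology.removeMember(targetNodeId);
            }
        }

        /** A reconnect is only attempted if the topology still knows the node. */
        boolean tryReconnect() {
            return topology.connectorFor(targetNodeId) != null;
        }
    }

    /** Runs one failure followed by one reconnect attempt. */
    static boolean reconnectsAfterFailure(boolean removeMemberOnFailure) {
        Topology topology = new Topology();
        topology.addMember("node-2", "tcp://broker2:61616"); // hypothetical address
        Bridge bridge = new Bridge(topology, "node-2", removeMemberOnFailure);
        bridge.onConnectionFailure();
        return bridge.tryReconnect();
    }

    public static void main(String[] args) {
        System.out.println("member removed on failure: reconnects = "
                + reconnectsAfterFailure(true));
        System.out.println("member kept on failure:    reconnects = "
                + reconnectsAfterFailure(false));
    }
}
```

In this model the bridge that removes the member on failure can never reconnect, because the lookup it depends on has nothing left to return, while the bridge that keeps the member reconnects as soon as the target is reachable again.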

blocks

WFLY-10320 Upgrade artemis from 1.5.x to 2.x.x

Closed

is cloned by

ENTMQBR-1078 Regression in cluster tests with network failures

Closed

is duplicated by

JBEAP-14175 Core bridge does not reconnect when target node is restarted

Closed

is related to

JBEAP-14175 Core bridge does not reconnect when target node is restarted

Closed

is caused by

ARTEMIS-1790

Assignee: Howard Gao

Reporter: Erich Duda (Inactive)

Votes: 0

Watchers: 5

Created: 2018/01/29 7:42 AM

Updated: 2021/10/24 5:50 AM

Resolved: 2018/07/10 9:42 AM
