Type: Bug
Resolution: Done
Priority: Blocker
Scenario
- Two Artemis brokers are configured to form a cluster.
- A producer sends messages to broker 1 and a receiver receives messages from broker 2.
- A proxy sits between the brokers to simulate network failure.
- The proxy is stopped and restarted several times to simulate the network failure.
- The test expects that all messages sent to broker 1 are received by the receiver from broker 2, despite the network failures.
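The pass criterion in the last step can be sketched as a comparison of sent and received message IDs. This is only an illustration: the class `DeliveryCheck`, its methods, and the use of string IDs are hypothetical and are not taken from the actual test suite. The report does not say whether duplicates count as a failure, so the duplicate check is an added assumption.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Artemis or the real test suite.
class DeliveryCheck {

    /** IDs that were sent to broker 1 but never received from broker 2. */
    static Set<String> missing(List<String> sent, List<String> received) {
        Set<String> result = new LinkedHashSet<>(sent);
        result.removeAll(new HashSet<>(received));
        return result;
    }

    /** IDs that were received more than once (assumption: the test may care about these). */
    static Set<String> duplicates(List<String> received) {
        Set<String> seen = new HashSet<>();
        Set<String> dups = new LinkedHashSet<>();
        for (String id : received) {
            if (!seen.add(id)) {
                dups.add(id);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        List<String> sent = List.of("m1", "m2", "m3", "m4");
        List<String> received = List.of("m1", "m3", "m3");
        System.out.println("missing=" + missing(sent, received));   // prints missing=[m2, m4]
        System.out.println("duplicates=" + duplicates(received));   // prints duplicates=[m3]
    }
}
```

In the failing scenario described here, the cluster never re-forms, so every message sent after the first proxy restart would be expected to show up in `missing`.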
Reality: After the proxy is stopped and restarted, the cluster cannot re-form. Each broker tries to reconnect to the other, but the attempts never succeed.
Customer scenario: A messaging cluster is not able to recover after network failures.
Investigation of issue
I investigated why the brokers are not able to reconnect and found that every reconnection attempt is abandoned because there is no topology record for the nodeId they are trying to connect to. The reconnection attempt ends at [1].
I compared the behavior with Artemis 1.x and found that Artemis 2.x removes the topology member when a connection failure is detected, whereas Artemis 1.x does not. Commenting out the line at [2] fixed the issue. This line is not present in 1.x.
[1] https://github.com/apache/activemq-artemis/blob/b66d0f7ac40001cce14ca7146e74720504ff9eb1/artemis-core-client/src/main/java/org/apache/activemq/artemis/core/client/impl/ServerLocatorImpl.java#L659
[2] https://github.com/apache/activemq-artemis/blob/b66d0f7ac40001cce14ca7146e74720504ff9eb1/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/cluster/impl/BridgeImpl.java#L782
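The difference between the two versions can be pictured with a toy model. This is not Artemis code: `TopologyModel` and all of its methods are hypothetical, and the flag `removeMemberOnFailure` only stands in for the presence (2.x) or absence (1.x) of the line at [2]. The lookup in `canReconnect` plays the role of the check at [1].

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the behavior described above; hypothetical names, not Artemis classes.
class TopologyModel {
    // nodeId -> connector address; stands in for the cluster topology
    private final Map<String, String> members = new ConcurrentHashMap<>();
    private final boolean removeMemberOnFailure;

    TopologyModel(boolean removeMemberOnFailure) {
        this.removeMemberOnFailure = removeMemberOnFailure;
    }

    /** A node announces itself and is recorded in the topology. */
    void memberUp(String nodeId, String address) {
        members.put(nodeId, address);
    }

    /** Called when the connection to a node fails. */
    void connectionFailed(String nodeId) {
        if (removeMemberOnFailure) {
            members.remove(nodeId); // models the 2.x behavior: the record is dropped
        }
    }

    /** A reconnect is only attempted if a topology record still exists. */
    boolean canReconnect(String nodeId) {
        return members.containsKey(nodeId);
    }

    public static void main(String[] args) {
        TopologyModel like2x = new TopologyModel(true);
        TopologyModel like1x = new TopologyModel(false);
        for (TopologyModel m : new TopologyModel[] {like2x, like1x}) {
            m.memberUp("node-2", "tcp://broker2:61616");
            m.connectionFailed("node-2");
        }
        System.out.println("2.x-like canReconnect=" + like2x.canReconnect("node-2")); // false
        System.out.println("1.x-like canReconnect=" + like1x.canReconnect("node-2")); // true
    }
}
```

In this model, once the record is removed nothing puts it back while the link is down, which matches the observation that the brokers keep giving up on reconnection.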
Issue links
- blocks: WFLY-10320 Upgrade artemis from 1.5.x to 2.x.x (Closed)
- clones: WFWIP-13 Regression in messaging cluster tests with network failures (Resolved)
- is duplicated by: JBEAP-14175 Core bridge does not reconnect when target node is restarted (Closed)
- is related to: JBEAP-14175 Core bridge does not reconnect when target node is restarted (Closed)
- is related to: ARTEMIS-1752