-
Bug
-
Resolution: Done
-
Major
-
None
-
AMQ 7.8.0.GA
-
None
- [Problem] When many core bridges are defined against a two-way TLS acceptor that uses client-certificate-based authentication, only some of them can connect.
- For example:
- If 30 core bridges are defined, only 10 core bridges can connect.
- If 10 core bridges are defined, all the core bridges can connect.
- [Cause]
- Creating a connection to a two-way TLS acceptor with client-certificate-based authentication can take about one second.
- AMQ Broker executes a synchronized method to create a TLS connection[2], so TLS connections cannot be created in parallel; they are created sequentially.
- You can confirm in the thread dump[2] that many threads are blocked in org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptor.getSslHandler.
- A core bridge waits at most 10 seconds for a TLS connection to be created.
- If creating the connection takes longer than 10 seconds:
- The following error occurs on the core bridge on the source broker.
- AMQ214016: Failed to create netty connection: io.netty.handler.ssl.SslHandshakeTimeoutException: handshake timed out after 10000ms[1]
- and CLOSE_WAIT connections occur on the destination broker[3].
- [For example]:
- Say that it will take one second to create a TLS connection and 30 core bridges are defined.
==> - Only 10 core bridges can connect; the other 20 core bridges cannot.
- The first 10 core bridges can be connected because the broker will respond within 10 seconds.
- FYI, the reconnect feature of the core bridge works as expected, but it does not help here.
- The remaining 20 core bridges that could not connect repeatedly try to reconnect,
- but they are always queued on the destination broker, and each time the response takes more than 20 seconds.
- It may seem somewhat counterintuitive, but if 20 core bridges are defined, all the core bridges can connect,
- because the remaining 10 core bridges, after the first 10, can receive a response within 10 seconds by reconnecting after the timeout exception.
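The first-attempt arithmetic in the example above can be sketched as follows. This is only a minimal model under the stated assumptions (handshakes served strictly one at a time, about one second each, with a 10-second handshake timeout on the bridge side). The function name and its parameters are hypothetical and are not part of AMQ Broker. Reconnect attempts and stale requests left on the destination broker are deliberately not modelled.

```python
def first_round_connected(n_bridges, handshake_s=1.0, timeout_s=10.0):
    """Count bridges whose first handshake finishes within the timeout.

    Assumption: the destination broker serves handshakes sequentially,
    so the bridge at queue position k waits for k handshakes in total.
    """
    connected = 0
    for position in range(1, n_bridges + 1):
        # Time at which this bridge's handshake completes on the broker.
        finished_at = position * handshake_s
        if finished_at <= timeout_s:
            connected += 1
    return connected


if __name__ == "__main__":
    for n in (10, 30):
        print(n, "bridges defined ->", first_round_connected(n), "connect on the first attempt")
```

With the default values, 10 defined bridges all connect on the first attempt, while with 30 defined bridges only the first 10 do, which matches the figures reported above.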
- [1] io.netty.handler.ssl.SslHandshakeTimeoutException
2021-04-07 08:47:16,912 ERROR [org.apache.activemq.artemis.core.client] AMQ214016: Failed to create netty connection: io.netty.handler.ssl.SslHandshakeTimeoutException: handshake timed out after 10000ms
	at io.netty.handler.ssl.SslHandler$5.run(SslHandler.java:2062) [netty-all-4.1.51.Final-redhat-00001.jar:4.1.51.Final-redhat-00001]
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) [netty-all-4.1.51.Final-redhat-00001.jar:4.1.51.Final-redhat-00001]
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) [netty-all-4.1.51.Final-redhat-00001.jar:4.1.51.Final-redhat-00001]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [netty-all-4.1.51.Final-redhat-00001.jar:4.1.51.Final-redhat-00001]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [netty-all-4.1.51.Final-redhat-00001.jar:4.1.51.Final-redhat-00001]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) [netty-all-4.1.51.Final-redhat-00001.jar:4.1.51.Final-redhat-00001]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-all-4.1.51.Final-redhat-00001.jar:4.1.51.Final-redhat-00001]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-all-4.1.51.Final-redhat-00001.jar:4.1.51.Final-redhat-00001]
	at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.16.0.redhat-00007.jar:2.16.0.redhat-00007]
- [2] Thread dump: taken for 60 seconds while the problem was occurring
** https://privatebin-it-iso.int.open.paas.redhat.com/?6dd0b767583d15a4#9DWGtDkRBjYUYmY3vXZGVF21kQEEzm9W7oZ17vbKiNaP
- [3] CLOSE_WAIT connections on the destination broker
$ netstat -aon -p|grep java|sort
tcp6 0 0 172.31.32.201:61617 18.179.5.82:43429 ESTABLISHED 21596/java keepalive (7191.27/0/0)
tcp6 0 0 172.31.32.201:61617 18.179.5.82:43430 ESTABLISHED 21596/java keepalive (7191.27/0/0)
tcp6 0 0 172.31.32.201:61617 18.179.5.82:43431 ESTABLISHED 21596/java keepalive (7191.27/0/0)
tcp6 0 0 172.31.32.201:61617 18.179.5.82:43435 ESTABLISHED 21596/java keepalive (7191.27/0/0)
tcp6 0 0 172.31.32.201:61617 18.179.5.82:43436 ESTABLISHED 21596/java keepalive (7191.27/0/0)
tcp6 0 0 172.31.32.201:61617 18.179.5.82:43437 ESTABLISHED 21596/java keepalive (7191.27/0/0)
tcp6 0 0 :::61616 :::* LISTEN 21596/java off (0.00/0/0)
tcp6 0 0 :::61618 :::* LISTEN 21596/java off (0.00/0/0)
tcp6 0 0 :::61619 :::* LISTEN 21596/java off (0.00/0/0)
tcp6 0 0 :::8161 :::* LISTEN 21596/java off (0.00/0/0)
tcp6 35 0 :::61617 :::* LISTEN 21596/java off (0.00/0/0)
tcp6 191 0 172.31.32.201:61617 18.179.5.82:43428 CLOSE_WAIT 21596/java keepalive (7191.27/0/0)
tcp6 191 0 172.31.32.201:61617 18.179.5.82:43433 CLOSE_WAIT 21596/java keepalive (7191.27/0/0)
tcp6 191 0 172.31.32.201:61617 18.179.5.82:43434 CLOSE_WAIT 21596/java keepalive (7191.27/0/0)
tcp6 191 0 172.31.32.201:61617 18.179.5.82:43439 CLOSE_WAIT 21596/java keepalive (7224.04/0/0)
tcp6 191 0 172.31.32.201:61617 18.179.5.82:43440 CLOSE_WAIT 21596/java keepalive (7224.04/0/0)
tcp6 191 0 172.31.32.201:61617 18.179.5.82:43441 CLOSE_WAIT 21596/java keepalive (7224.04/0/0)
tcp6 191 0 172.31.32.201:61617 18.179.5.82:43442 CLOSE_WAIT 21596/java keepalive (7224.04/0/0)
tcp6 191 0 172.31.32.201:61617 18.179.5.82:43445 CLOSE_WAIT 21596/java keepalive (7224.04/0/0)
unix 2 [ ] STREAM CONNECTED 5799590 21596/java
unix 2 [ ] STREAM CONNECTED 5799592 21596/java
- clones
-
ENTMQBR-4879 Cannot connect if many core bridges for two-way TLS acceptor are defined
-
- Closed
-