Bug
Resolution: Done
Major
1.0.17.Final, 2.0.1.Final
None
We use scoped EJB contexts to connect between WildFly servers and ran into the following issue.
Connections to some servers were blocked by a firewall, so every connection attempt to them failed with a timeout, which was set to 1 minute.
As a result, each attempt hung for 1 minute in org.jboss.ejb.client.remoting.ConnectionPool#getConnection, at the IOFutureHelper#get call.
Since this method is synchronized, other threads could not connect to other servers either, which caused a major performance slowdown across the entire platform.
The firewall configuration was incorrect, but we still need to make sure that adding servers and forgetting to adjust the firewall configuration does not slow down the entire platform.
To avoid this issue in the future, we decreased the timeout to 15 seconds and applied a patch (attached) to the jboss-ejb-client 2.0.1 distribution inside WildFly.
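The patch removes the synchronized methods and instead creates a separate synchronization mutex per set of connection properties, so a connection attempt that hangs on one server no longer blocks attempts to other servers.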
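To illustrate the idea, here is a minimal sketch of per-connection locking. It is not the attached patch: the class name PerKeyConnectionPool, the string key (for example "host:port") and the connector callback are all hypothetical, and the real ConnectionPool keys on its own connection properties.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/** Hypothetical illustration of per-key locking; not the code from the attached patch. */
public class PerKeyConnectionPool<C> {
    // One lock object per connection key, instead of one lock for the whole pool.
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
    // Connections that were established successfully, by key.
    private final ConcurrentMap<String, C> connections = new ConcurrentHashMap<>();

    /**
     * Returns the cached connection for the key, or creates it with the connector.
     * The connector may block (e.g. until a timeout); it must not return null.
     */
    public C getConnection(String key, Supplier<C> connector) {
        C existing = connections.get(key);
        if (existing != null) {
            return existing;
        }
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        // Only threads asking for the SAME key wait here; other keys use other locks.
        synchronized (lock) {
            existing = connections.get(key);
            if (existing == null) {
                existing = connector.get(); // may hang, but only for this key
                connections.put(key, existing);
            }
            return existing;
        }
    }
}
```

With this layout, a thread stuck connecting to a firewalled server holds only that server's lock, so a call such as pool.getConnection("other-host:4447", ...) proceeds independently.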
We also cancel the IOFuture when a connection attempt fails with a timeout.
The reason is that we had an issue with many blocked threads hanging at AbstractHandleableCloseable#close. There were a few "Operation failed with status WAITING" exceptions just before the hang happened.
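The cancel-on-timeout step can be sketched as follows. This is only an analogy: it uses the JDK's java.util.concurrent.Future as a stand-in for the IOFuture used by jboss-ejb-client, and the class and method names (TimedConnect, getOrCancel) are hypothetical rather than taken from the patch.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; java.util.concurrent.Future stands in for the real IOFuture. */
public class TimedConnect {
    /**
     * Waits for the pending operation up to the timeout. If the wait times out,
     * the pending operation is cancelled before the exception is rethrown, so it
     * is not left running in the background.
     */
    public static <T> T getOrCancel(Future<T> future, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // give up on the attempt instead of leaving it pending
            throw e;
        }
    }
}
```

The caller still sees the TimeoutException; the only difference from a plain timed get is that the pending attempt is explicitly cancelled first.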
Stack trace from JBoss 7.2.0 and ejbclient 1.0.17:
"EJB executorServiceThreadPool - 41" prio=10 tid=0x000000001d9e6000 nid=0x7754 in Object.wait() [0x00000000435fb000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000e0ef2118> (a java.lang.Object)
    at java.lang.Object.wait(Object.java:503)
    at org.jboss.remoting3.spi.AbstractHandleableCloseable.close(AbstractHandleableCloseable.java:177)
    - locked <0x00000000e0ef2118> (a java.lang.Object)
    at org.jboss.ejb.client.remoting.EndpointPool.safeClose(EndpointPool.java:281)
    at org.jboss.ejb.client.remoting.EndpointPool.release(EndpointPool.java:101)
    - locked <0x00000000ddf868f8> (a org.jboss.ejb.client.remoting.EndpointPool)
    at org.jboss.ejb.client.remoting.EndpointPool.access$400(EndpointPool.java:61)
    at org.jboss.ejb.client.remoting.EndpointPool$PooledEndpoint.close(EndpointPool.java:218)
    at org.jboss.ejb.client.remoting.RemotingEndpointManager.safeClose(RemotingEndpointManager.java:63)
    - locked <0x00000000e0ef21d8> (a java.util.Collections$SynchronizedRandomAccessList)
    at org.jboss.ejb.client.remoting.ConfigBasedEJBClientContextSelector$ContextCloseListener.contextClosed(ConfigBasedEJBClientContextSelector.java:175)
    at org.jboss.ejb.client.EJBClientContext.close(EJBClientContext.java:1047)
    - locked <0x00000000e0ef2200> (a org.jboss.ejb.client.EJBClientContext)
    at org.jboss.ejb.client.naming.ejb.EjbNamingContext.close(EjbNamingContext.java:424)
    at com.kyriba.technical.shared.internal.util.naming.ContextHelper.closeQuietly(ContextHelper.java:169)
I don't know for sure that cancelling the IOFuture is effective, because we also applied the Service Locator pattern in our code to avoid closing naming contexts at all, but it seems like the logical thing to do.