-
Bug
-
Resolution: Done
-
Major
-
7.0.0.CR1
-
None
StateConsumerImpl doesn't check if the cache is stopped while fetching transaction data, it only stops when it's no longer able to find providers for transactions.
However, JGroupsTransport throws a generic CacheException when the channel is stopped. The state transfer thread can enter a busy-wait loop, retrying to get the transaction data and immediately getting the CacheException, filling the log with messages like this:
19:32:28,237 WARN (remote-thread-NodeN-p42592-t1:) [StateConsumerImpl] ISPN000209: Failed to retrieve transactions for segments [10, 11, 12, 13, 14, 15, 17, 16, 19, 18, 21, 20, 23, 22, 25, 24, 27, 26, 29, 28, 42, 43, 40, 41, 46, 47, 44, 45, 51, 50, 49, 48, 55, 54, 53, 52, 59, 58, 57, 56] of cache testCache from node NodeM-53416 org.infinispan.commons.CacheException: java.lang.IllegalStateException: channel is not connected at org.infinispan.commons.util.Util.rewrapAsCacheException(Util.java:655) at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:176) at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:536) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:290) at org.infinispan.statetransfer.StateConsumerImpl.getTransactions(StateConsumerImpl.java:766) at org.infinispan.statetransfer.StateConsumerImpl.requestTransactions(StateConsumerImpl.java:685) at org.infinispan.statetransfer.StateConsumerImpl.addTransfers(StateConsumerImpl.java:629) at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:331) at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:195) at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:43) at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:116)
We should check is the cache is stopped before retrying in StateConsumerImpl.requestTransactions. I also think we should change the stop order - it would make sense to stop the remote executor threads and the RpcDispatcher before we stop the channel.
- is related to
-
ISPN-7977 Clean up NullPointerExceptions in the core test logs
- Closed