Uploaded image for project: 'Infinispan'
  1. Infinispan
  2. ISPN-4846

State transfer keeps trying to fetch transaction data after the cache was stopped

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Done
    • Icon: Major Major
    • 9.0.0.Final
    • 7.0.0.CR1
    • Core, State Transfer
    • None

      StateConsumerImpl doesn't check if the cache is stopped while fetching transaction data, it only stops when it's no longer able to find providers for transactions.

      However, JGroupsTransport throws a generic CacheException when the channel is stopped. The state transfer thread can enter a busy-wait loop, retrying to get the transaction data and immediately getting the CacheException, filling the log with messages like this:

      19:32:28,237 WARN  (remote-thread-NodeN-p42592-t1:) [StateConsumerImpl] ISPN000209: Failed to retrieve transactions for segments [10, 11, 12, 13, 14, 15, 17, 16, 19, 18, 21, 20, 23, 22, 25, 24, 27, 26, 29, 28, 42, 43, 40, 41, 46, 47, 44, 45, 51, 50, 49, 48, 55, 54, 53, 52, 59, 58, 57, 56] of cache testCache from node NodeM-53416
      org.infinispan.commons.CacheException: java.lang.IllegalStateException: channel is not connected
      	at org.infinispan.commons.util.Util.rewrapAsCacheException(Util.java:655)
      	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:176)
      	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:536)
      	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:290)
      	at org.infinispan.statetransfer.StateConsumerImpl.getTransactions(StateConsumerImpl.java:766)
      	at org.infinispan.statetransfer.StateConsumerImpl.requestTransactions(StateConsumerImpl.java:685)
      	at org.infinispan.statetransfer.StateConsumerImpl.addTransfers(StateConsumerImpl.java:629)
      	at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:331)
      	at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:195)
      	at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:43)
      	at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:116)
      

      We should check is the cache is stopped before retrying in StateConsumerImpl.requestTransactions. I also think we should change the stop order - it would make sense to stop the remote executor threads and the RpcDispatcher before we stop the channel.

              Unassigned Unassigned
              dberinde@redhat.com Dan Berindei (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Created:
                Updated:
                Resolved: