Uploaded image for project: 'Infinispan'
  1. Infinispan
  2. ISPN-4846

State transfer keeps trying to fetch transaction data after the cache was stopped

    XMLWordPrintable

    Details

      Description

      StateConsumerImpl doesn't check if the cache is stopped while fetching transaction data, it only stops when it's no longer able to find providers for transactions.

      However, JGroupsTransport throws a generic CacheException when the channel is stopped. The state transfer thread can enter a busy-wait loop, retrying to get the transaction data and immediately getting the CacheException, filling the log with messages like this:

      19:32:28,237 WARN  (remote-thread-NodeN-p42592-t1:) [StateConsumerImpl] ISPN000209: Failed to retrieve transactions for segments [10, 11, 12, 13, 14, 15, 17, 16, 19, 18, 21, 20, 23, 22, 25, 24, 27, 26, 29, 28, 42, 43, 40, 41, 46, 47, 44, 45, 51, 50, 49, 48, 55, 54, 53, 52, 59, 58, 57, 56] of cache testCache from node NodeM-53416
      org.infinispan.commons.CacheException: java.lang.IllegalStateException: channel is not connected
      	at org.infinispan.commons.util.Util.rewrapAsCacheException(Util.java:655)
      	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:176)
      	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:536)
      	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:290)
      	at org.infinispan.statetransfer.StateConsumerImpl.getTransactions(StateConsumerImpl.java:766)
      	at org.infinispan.statetransfer.StateConsumerImpl.requestTransactions(StateConsumerImpl.java:685)
      	at org.infinispan.statetransfer.StateConsumerImpl.addTransfers(StateConsumerImpl.java:629)
      	at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:331)
      	at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:195)
      	at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:43)
      	at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:116)
      

      We should check is the cache is stopped before retrying in StateConsumerImpl.requestTransactions. I also think we should change the stop order - it would make sense to stop the remote executor threads and the RpcDispatcher before we stop the channel.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              Unassigned Unassigned
              Reporter:
              dan.berindei Dan Berindei
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: