JDG-3848 (Red Hat Data Grid)

Another node leaving may break SingleFileStore

    Details

      Description

      When an OutboundTransferTask on node A fails to send state to node B, it cancels itself. However, it does not immediately stop iterating over the entries. Instead, it sets the interrupted flag on the current thread and keeps trying to send chunks:

      2020-05-27 04:59:51,298 DEBUG [org.infinispan.statetransfer.OutboundTransferTask] (infinispan 67656) Cancelling outbound transfer to node jboss90434-22044, segments {0-255}
      ...
      2020-05-27 04:59:54,032 DEBUG [org.infinispan.statetransfer.OutboundTransferTask] (infinispan 67248) Node jboss90434-22044 left cache assignedproductcache while we were sending state to it, cancelling transfer.
      2020-05-27 04:59:54,032 DEBUG [org.infinispan.statetransfer.OutboundTransferTask] (infinispan 67248) Node jboss90434-22044 left cache assignedproductcache while we were sending state to it, cancelling transfer.
      2020-05-27 04:59:54,033 DEBUG [org.infinispan.statetransfer.OutboundTransferTask] (infinispan 67248) Node jboss90434-22044 left cache assignedproductcache while we were sending state to it, cancelling transfer.
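
For contrast, here is a minimal sketch of cooperative cancellation, where the loop checks a flag before each entry instead of relying on the thread's interrupted status. It is not the fix applied in Infinispan. The class CancellableTransferSketch, its run() method, and the failAt parameter are hypothetical names used only for illustration, and the send failure is simulated.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not Infinispan code: a transfer loop that stops
// as soon as it is cancelled and never touches the interrupted flag.
class CancellableTransferSketch {
    private final AtomicBoolean cancelled = new AtomicBoolean();
    private final List<Integer> sent = new ArrayList<>();

    void cancel() {
        cancelled.set(true);
    }

    /**
     * Sends entries until the iterator is exhausted or the task is cancelled.
     * Sending the entry equal to failAt is treated as a simulated failure.
     * Returns the number of entries sent successfully.
     */
    int run(Iterator<Integer> entries, int failAt) {
        while (entries.hasNext()) {
            if (cancelled.get()) {
                break; // stop iterating immediately, no further store reads
            }
            int entry = entries.next();
            if (entry == failAt) {
                cancel(); // simulated send failure: cancel, do not interrupt
                continue;
            }
            sent.add(entry);
        }
        return sent.size();
    }
}
```

Because the calling thread is never interrupted in this sketch, later blocking I/O on the same thread is not affected by the cancellation.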
      

      If the cache has a SingleFileStore, the iterator will eventually try to obtain new entries from the store, and the FileChannel.read() operation will fail with a ClosedByInterruptException:

      2020-05-27 05:00:30,803 WARN  [org.infinispan.statetransfer.OutboundTransferTask] (infinispan 67248) ISPN000194: Failed loading keys from cache store: org.infinispan.persistence.spi.PersistenceException: java.nio.channels.ClosedByInterruptException
      	at org.infinispan.persistence.file.SingleFileStore._load(SingleFileStore.java:486) [infinispan-core-9.4.18.Final-redhat-00001.jar:9.4.18.Final-redhat-00001]
      	at org.infinispan.persistence.file.SingleFileStore.lambda$publishEntries$3(SingleFileStore.java:545) [infinispan-core-9.4.18.Final-redhat-00001.jar:9.4.18.Final-redhat-00001]
      	at io.reactivex.internal.operators.flowable.FlowableMap$MapConditionalSubscriber.tryOnNext(FlowableMap.java:123) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.operators.flowable.FlowableFromIterable$IteratorConditionalSubscription.slowPath(FlowableFromIterable.java:373) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.operators.flowable.FlowableFromIterable$BaseRangeSubscription.request(FlowableFromIterable.java:124) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.subscribers.BasicFuseableConditionalSubscriber.request(BasicFuseableConditionalSubscriber.java:152) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.subscribers.BasicFuseableSubscriber.request(BasicFuseableSubscriber.java:153) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.subscriptions.SubscriptionHelper.setOnce(SubscriptionHelper.java:249) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.operators.flowable.BlockingFlowableIterable$BlockingFlowableIterator.onSubscribe(BlockingFlowableIterable.java:128) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.subscribers.BasicFuseableSubscriber.onSubscribe(BasicFuseableSubscriber.java:67) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.subscribers.BasicFuseableConditionalSubscriber.onSubscribe(BasicFuseableConditionalSubscriber.java:66) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.operators.flowable.FlowableFromIterable.subscribe(FlowableFromIterable.java:66) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.operators.flowable.FlowableFromIterable.subscribeActual(FlowableFromIterable.java:47) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.Flowable.subscribe(Flowable.java:14636) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.operators.flowable.FlowableMap.subscribeActual(FlowableMap.java:35) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.Flowable.subscribe(Flowable.java:14636) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.operators.flowable.FlowableFilter.subscribeActual(FlowableFilter.java:37) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.Flowable.subscribe(Flowable.java:14636) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.internal.operators.flowable.BlockingFlowableIterable.iterator(BlockingFlowableIterable.java:42) [rxjava-2.2.4.redhat-00005.jar:]
      	at io.reactivex.Flowable.blockingForEach(Flowable.java:5606) [rxjava-2.2.4.redhat-00005.jar:]
      	at org.infinispan.statetransfer.OutboundTransferTask.run(OutboundTransferTask.java:198) [infinispan-core-9.4.18.Final-redhat-00001.jar:9.4.18.Final-redhat-00001]
      Caused by: java.nio.channels.ClosedByInterruptException
      	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) [rt.jar:1.8.0_242]
      	at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:740) [rt.jar:1.8.0_242]
      	at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:721) [rt.jar:1.8.0_242]
      	at org.infinispan.persistence.file.SingleFileStore._load(SingleFileStore.java:484) [infinispan-core-9.4.18.Final-redhat-00001.jar:9.4.18.Final-redhat-00001]
      	... 27 more
      

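      Because the thread's interrupted flag is set when it calls FileChannel.read(), the JDK closes SingleFileStore's only FileChannel and reports this with the ClosedByInterruptException. From then on, every store operation fails with a ClosedChannelException: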

      Caused by: java.nio.channels.ClosedChannelException
      	at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110) [rt.jar:1.8.0_242]
      	at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:715) [rt.jar:1.8.0_242]
      	at org.infinispan.persistence.file.SingleFileStore._load(SingleFileStore.java:484) [infinispan-core-9.4.18.Final-redhat-00001.jar:9.4.18.Final-redhat-00001]
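
The sequence above can be reproduced outside Infinispan with plain JDK classes. The following is a minimal sketch, assuming a temporary file stands in for the store file. The class name InterruptedReadDemo and its method are made up for this example. It sets the interrupted flag, then reads twice from the same FileChannel.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Infinispan.
class InterruptedReadDemo {

    /** Returns the simple class names of the exceptions thrown by two consecutive reads. */
    static String[] readTwiceWhileInterrupted() throws IOException {
        Path file = Files.createTempFile("store", ".dat");
        Files.write(file, new byte[] {1, 2, 3, 4});
        String[] outcomes = new String[2];
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Same state as the transfer thread after the task cancels itself.
            Thread.currentThread().interrupt();
            for (int i = 0; i < 2; i++) {
                try {
                    channel.read(ByteBuffer.allocate(4), 0);
                    outcomes[i] = "ok";
                } catch (ClosedChannelException e) {
                    // ClosedByInterruptException is a subclass of ClosedChannelException,
                    // so this catch handles both; record which one was thrown.
                    outcomes[i] = e.getClass().getSimpleName();
                }
            }
        } finally {
            Thread.interrupted(); // clear the flag so the caller is unaffected
            Files.deleteIfExists(file);
        }
        return outcomes;
    }

    public static void main(String[] args) throws IOException {
        for (String outcome : readTwiceWhileInterrupted()) {
            System.out.println(outcome);
        }
    }
}
```

The first read fails with ClosedByInterruptException and closes the channel. The second read fails with ClosedChannelException because the channel is never reopened, which matches the two stack traces in this report.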
      

            People

            Assignee:
            dberinde@redhat.com Dan Berindei
            Reporter:
            dberinde@redhat.com Dan Berindei