Infinispan / ISPN-6254

State transfer hangs after CacheNotFoundResponse


      When StateConsumerImpl.addTransfer() sends a StateRequestCommand(START_STATE_TRANSFER), it doesn't block the thread until it receives all the state (ISPN-5019). Instead, it calls SemaphoreCompletionService.continueTaskInBackground(), and then it calls SemaphoreCompletionService.backgroundTaskFinished() when it receives the response.

      However, if the request wasn't successful, e.g. because the node providing the state is shutting down and replies with a CacheNotFoundResponse, SemaphoreCompletionService.backgroundTaskFinished() isn't called, and state transfer cannot make any progress.
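The failure mode described above can be reproduced with a small stand-alone sketch. It does not use Infinispan's classes: `BackgroundPermitSketch`, `startTransferLeaky`, `startTransfer` and `RequestFailed` are hypothetical names, and a plain `java.util.concurrent.Semaphore` stands in for the permit that SemaphoreCompletionService holds while a task continues in the background. The only point illustrated is that a permit released solely on the success path is lost when the response is a failure, whereas releasing it on any completion keeps the slot available.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;

// Illustrative only: a toy model of "task continues in background, permit is
// released when the response arrives". None of these names exist in Infinispan.
class BackgroundPermitSketch {

    // Hypothetical stand-in for an unsuccessful response (e.g. the provider is shutting down).
    static final class RequestFailed extends RuntimeException {
        RequestFailed(String message) {
            super(message);
        }
    }

    private final Semaphore permits;

    BackgroundPermitSketch(int maxConcurrentTransfers) {
        this.permits = new Semaphore(maxConcurrentTransfers);
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    // Leaky variant: the permit is released only if the response completes normally.
    // thenAccept() does not run its action when the future completes exceptionally.
    boolean startTransferLeaky(CompletableFuture<String> response) {
        if (!permits.tryAcquire()) {
            return false; // no free slot, the transfer cannot start
        }
        response.thenAccept(state -> permits.release());
        return true;
    }

    // Safer variant: whenComplete() runs for both normal and exceptional completion,
    // so the permit is always handed back.
    boolean startTransfer(CompletableFuture<String> response) {
        if (!permits.tryAcquire()) {
            return false;
        }
        response.whenComplete((state, failure) -> permits.release());
        return true;
    }

    public static void main(String[] args) {
        BackgroundPermitSketch leaky = new BackgroundPermitSketch(1);
        CompletableFuture<String> failed = new CompletableFuture<>();
        leaky.startTransferLeaky(failed);
        failed.completeExceptionally(new RequestFailed("cache not found"));
        // The only permit is gone, so no further transfer can start.
        System.out.println("leaky: permits=" + leaky.availablePermits()
                + ", next transfer starts=" + leaky.startTransferLeaky(new CompletableFuture<>()));

        BackgroundPermitSketch safe = new BackgroundPermitSketch(1);
        CompletableFuture<String> failedAgain = new CompletableFuture<>();
        safe.startTransfer(failedAgain);
        failedAgain.completeExceptionally(new RequestFailed("cache not found"));
        // The permit was released despite the failure.
        System.out.println("safe: permits=" + safe.availablePermits()
                + ", next transfer starts=" + safe.startTransfer(new CompletableFuture<>()));
    }
}
```

In the leaky variant the single permit stays acquired after the failed response, which mirrors state transfer being unable to make progress. How the actual fix in StateConsumerImpl is structured is not shown here.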

      This shows up in the test suite as random failures in NumOwnersNodeCrashInSequenceTest and NumOwnersNodeStopInSequenceTest:

      17:49:16,960 ERROR (testng-NumOwnersNodeCrashInSequenceTest:) [UnitTestTestNGListener] Test testNodeCrashedBeforeStFinished0(org.infinispan.partitionhandling.NumOwnersNodeCrashInSequenceTest) failed.
      org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.Exception on object of type StateTransferManagerImpl
      	at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:172) ~[infinispan-commons-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
      	at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:887) ~[classes/:?]
      	at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:656) ~[classes/:?]
      	at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:645) ~[classes/:?]
      	at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:548) ~[classes/:?]
      	at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:222) ~[classes/:?]
      	at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:849) ~[classes/:?]
      	at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:624) ~[classes/:?]
      	at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:577) ~[classes/:?]
      	at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:445) ~[classes/:?]
      	at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:431) ~[classes/:?]
      	at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:413) ~[classes/:?]
      	at org.infinispan.test.MultipleCacheManagersTest.getCaches(MultipleCacheManagersTest.java:211) ~[test-classes/:?]
      	at org.infinispan.test.MultipleCacheManagersTest.waitForClusterToForm(MultipleCacheManagersTest.java:220) ~[test-classes/:?]
      	at org.infinispan.test.MultipleCacheManagersTest.waitForClusterToForm(MultipleCacheManagersTest.java:229) ~[test-classes/:?]
      	at org.infinispan.partitionhandling.NumOwnersNodeCrashInSequenceTest.testNodeCrashedBeforeStFinished(NumOwnersNodeCrashInSequenceTest.java:105) ~[test-classes/:?]
      	at org.infinispan.partitionhandling.NumOwnersNodeCrashInSequenceTest.testNodeCrashedBeforeStFinished0(NumOwnersNodeCrashInSequenceTest.java:67) ~[test-classes/:?]
      

              dberinde@redhat.com Dan Berindei (Inactive)