Uploaded image for project: 'Infinispan'
  1. Infinispan
  2. ISPN-10269

ProtobufMetadataManagerImpl.registerProtofiles always times out

    XMLWordPrintable

Details

    Description

      ProtobufMetadataManagerImpl.registerProtofiles() uses protobufSchemaCache.putAll(files) under the covers, and because of a bug ProtobufMetadataManagerInterceptor.visitPutMapCommand() it tries to lock the .errors key both on the originator and on the remote nodes (during remote prepare). The LockControlCommand goes through TransactionSynchronizerInterceptor and tries to "lock" the transaction with enterSynchronizationAsync(), but it hangs because the PrepareCommand invocation can't continue and complete its releaseFuture.

      00:59:56,177 ERROR [org.infinispan.transaction.impl.TransactionCoordinator] (pool-8-thread-1) ISPN000097: Error while processing a prepare in a single-phase transaction: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3411 from node1
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:263)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:159)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.impl.TransactionTable.afterCompletion(TransactionTable.java:870)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.synchronization.SynchronizationAdapter.afterCompletion(SynchronizationAdapter.java:33)
      	at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.AbstractTransaction.performConsumer(AbstractTransaction.java:223)
      	at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.AbstractTransaction$AssociatingSynchronization.afterCompletion(AbstractTransaction.java:306)
      	at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.afterCompletion(SynchronizationImple.java:96)
      	at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.afterCompletion(TwoPhaseCoordinator.java:545)
      	at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:101)
      	at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:162)
      	at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1288)
      	at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:126)
      	at org.jboss.jts.integration@5.9.0.Final//com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:89)
      	at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.LocalTransaction.commitAndDissociate(LocalTransaction.java:77)
      	at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.ContextTransactionManager.commit(ContextTransactionManager.java:71)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.tryCommit(CacheImpl.java:1981)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.executeCommandWithInjectedTx(CacheImpl.java:1901)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1874)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:1458)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:2005)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:469)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.AbstractDelegatingCache.putAll(AbstractDelegatingCache.java:459)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.EncoderCache.putAll(EncoderCache.java:693)
      	at org.infinispan.remote-query.server:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.query.remote.impl.ProtobufMetadataManagerImpl.registerProtofiles(ProtobufMetadataManagerImpl.java:166)
      Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3411 from node1
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
      	at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
      	... 3 more
      

      enterSynchronizationAsync() doesn't have any timeout. When the prepare times out on the originator and it sends a RollbackCommand to all the affected nodes, the RollbackCommand will hang as well. However, because ProtobufMetadataManagerImpl configures the protobuf metadata cache with invocationBatching().enable(), it uses Synchronization, and all the exceptions are hidden from the user.

      The failure can still causes failures later in RemoteQueryDescriptorIT:

      00:57:59.423 [ERROR] testDescriptorPropagation(org.infinispan.server.test.query.RemoteQueryDescriptorIT)  Time elapsed: 38.848 s  <<< FAILURE!
      java.lang.AssertionError
      	at org.infinispan.server.test.query.RemoteQueryDescriptorIT.registerProtoOnServer1(RemoteQueryDescriptorIT.java:83)
      	at org.infinispan.server.test.query.RemoteQueryDescriptorIT.testDescriptorPropagation(RemoteQueryDescriptorIT.java:59)
      

      Interesting enough, it never fails when running all the tests in server/integration/testsuite on master, but it fails when running only RemoteQueryDescriptorIT with mvn verify -pl server/integration/testsuite -Dit.test=RemoteQueryDescriptorIT -Psuite.queries, and it also fails when running all the tests after I removed StringBasedStoreMultinodeIT.testSingleton. Maybe it's because multiple ITs share the same global state persistence location?

      Attachments

        Issue Links

          Activity

            People

              dberinde@redhat.com Dan Berindei (Inactive)
              dberinde@redhat.com Dan Berindei (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: