Infinispan / ISPN-16699

RESP blocking during get with expiration in simple cache
      The problem is that the get operation eventually invokes the `ExpirationManagerImpl`, which handles in-memory expiration and returns a `CompletableFuture`. The `AbstractInternalDataContainer` then calls `join()` on that future, blocking until it completes.

      Basically, the get operation runs on the Netty (non-blocking) thread up to the expiration method. The expiration method delegates to the blocking manager, which runs the check on a blocking thread and then resumes execution back on the non-blocking thread. This causes a deadlock: the non-blocking thread is parked in `join()` waiting for the CF to complete, but it is the same thread that must run the resume step that completes the CF.
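The deadlock pattern can be reproduced in isolation. The sketch below (hypothetical names, not Infinispan code) uses a single-threaded executor standing in for the non-blocking event loop: a continuation is scheduled back onto that executor while its only thread is parked waiting on the future, so the future can never complete. A timeout is used here instead of a bare `join()` so the demo terminates.

```java
import java.util.concurrent.*;

public class JoinDeadlockSketch {
    public static void main(String[] args) throws Exception {
        // Single-threaded executor standing in for the non-blocking (Netty) event loop.
        ExecutorService nonBlocking = Executors.newSingleThreadExecutor();
        ExecutorService blocking = Executors.newSingleThreadExecutor();

        Future<String> outcome = nonBlocking.submit(() -> {
            // Stand-in for the expiration check: runs on the blocking pool,
            // then resumes its continuation on the non-blocking executor.
            CompletableFuture<String> cf = CompletableFuture
                  .supplyAsync(() -> "not-expired", blocking)
                  .thenApplyAsync(r -> r, nonBlocking); // resume step queued behind us

            try {
                // join()-style wait: parks the only non-blocking thread, so the
                // queued resume step can never run -> the CF never completes.
                return cf.get(2, TimeUnit.SECONDS);
            } catch (TimeoutException e) {
                return "deadlock (timed out)";
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });

        System.out.println(outcome.get()); // prints "deadlock (timed out)"
        nonBlocking.shutdownNow();
        blocking.shutdownNow();
    }
}
```

With a real `join()` instead of the timed `get`, the non-blocking thread would park forever, which matches the thread dump below.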

      From the log:

      12:48:15,440 TRACE (non-blocking-thread-RespSingleNodeTest-NodeA-p7-t1:[]) [o.i.s.r.RespHandler] Received command: GET with arguments [[B0x6B312D7465737452656E616D65576974..[19]] for [id: 0x7bde59ac, L:/127.0.0.1:2518 - R:/127.0.0.1:42954]
      12:48:15,440 TRACE (non-blocking-thread-RespSingleNodeTest-NodeA-p7-t1:[]) [o.i.u.c.BlockingManagerImpl] Submitting blocking supply operation local-expiration to blocking thread
      12:48:15,440 TRACE (blocking-thread-RespSingleNodeTest-NodeA-p8-t2:[]) [o.i.u.c.BlockingManagerImpl] Running blocking supply operation local-expiration
      12:48:30,447 ERROR (testng-RespSingleNodeTest:[]) [o.i.c.t.TestSuiteProgress] Test failed: org.infinispan.server.resp.RespSingleNodeTest.testRenameWithEx[simpleCache=true, cacheMode=LOCAL]
      io.lettuce.core.RedisCommandTimeoutException: Command timed out after 15 second(s)
      	at io.lettuce.core.internal.ExceptionFactory.createTimeoutException(ExceptionFactory.java:59) ~[lettuce-core-6.3.2.RELEASE.jar:6.3.2.RELEASE/8941aea]
      	at io.lettuce.core.internal.Futures.awaitOrCancel(Futures.java:246) ~[lettuce-core-6.3.2.RELEASE.jar:6.3.2.RELEASE/8941aea]
      	at io.lettuce.core.FutureSyncInvocationHandler.handleInvocation(FutureSyncInvocationHandler.java:75) ~[lettuce-core-6.3.2.RELEASE.jar:6.3.2.RELEASE/8941aea]
      	at io.lettuce.core.internal.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:80) ~[lettuce-core-6.3.2.RELEASE.jar:6.3.2.RELEASE/8941aea]
      	at jdk.proxy2/jdk.proxy2.$Proxy42.get(Unknown Source) ~[?:?]
      	at org.infinispan.server.resp.RespSingleNodeTest.testRenameWithEx(RespSingleNodeTest.java:1115) ~[test-classes/:?]
      

      Notice that the resume messages are missing from the log. In the thread dump, the non-blocking thread (non-blocking-thread-RespSingleNodeTest-NodeA-p7-t1) is waiting on the expiration operation, while the blocking thread (blocking-thread-RespSingleNodeTest-NodeA-p8-t2) is idle, not doing any work.

      "non-blocking-thread-RespSingleNodeTest-NodeA-p7-t1" #48 [80728] daemon prio=5 os_prio=0 cpu=543.06ms elapsed=557.75s tid=0x00007f59b000c930 nid=80728 waiting on condition  [0x00007f5ab81fd000]
         java.lang.Thread.State: WAITING (parking)
      	at jdk.internal.misc.Unsafe.park(java.base@21.0.1/Native Method)
      	- parking to wait for  <0x00000000d08c8360> (a java.util.concurrent.CompletableFuture$Signaller)
      	at java.util.concurrent.locks.LockSupport.park(java.base@21.0.1/LockSupport.java:221)
      	at java.util.concurrent.CompletableFuture$Signaller.block(java.base@21.0.1/CompletableFuture.java:1864)
      	at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base@21.0.1/ForkJoinPool.java:3780)
      	at java.util.concurrent.ForkJoinPool.managedBlock(java.base@21.0.1/ForkJoinPool.java:3725)
      	at java.util.concurrent.CompletableFuture.waitingGet(java.base@21.0.1/CompletableFuture.java:1898)
      	at java.util.concurrent.CompletableFuture.join(java.base@21.0.1/CompletableFuture.java:2117)
      	at org.infinispan.container.impl.AbstractInternalDataContainer.get(AbstractInternalDataContainer.java:94)
      	at org.infinispan.container.impl.AbstractInternalDataContainer.get(AbstractInternalDataContainer.java:105)
      	at org.infinispan.cache.impl.SimpleCacheImpl.get(SimpleCacheImpl.java:561)
      	at org.infinispan.cache.impl.StatsCollectingCache.get(StatsCollectingCache.java:49)
      	at org.infinispan.cache.impl.SimpleCacheImpl.getAsync(SimpleCacheImpl.java:1096)
      	at org.infinispan.cache.impl.EncoderCache.getAsync(EncoderCache.java:1025)
      	at org.infinispan.server.resp.commands.string.GET.perform(GET.java:35)
      	at org.infinispan.server.resp.Resp3Handler.actualHandleRequest(Resp3Handler.java:88)
      	at org.infinispan.server.resp.RespRequestHandler.handleRequest(RespRequestHandler.java:88)
      	at org.infinispan.server.resp.RespHandler.handleCommandAndArguments(RespHandler.java:149)
      	at org.infinispan.server.resp.RespHandler.channelRead(RespHandler.java:131)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
      	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
      	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
      	at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93)
      	at org.infinispan.server.core.transport.StatsChannelHandler.channelRead(StatsChannelHandler.java:28)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
      	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
      	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
      	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
      	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
      	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:801)
      	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:501)
      	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:399)
      	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
      	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
      	at java.lang.Thread.runWith(java.base@21.0.1/Thread.java:1596)
      	at java.lang.Thread.run(java.base@21.0.1/Thread.java:1583)
      
         Locked ownable synchronizers:
      	- None
      
      "blocking-thread-RespSingleNodeTest-NodeA-p8-t2" #50 [80729] daemon prio=5 os_prio=0 cpu=2.09ms elapsed=557.27s tid=0x00007f59cc01fa90 nid=80729 waiting on condition  [0x00007f5ab0cfe000]
         java.lang.Thread.State: WAITING (parking)
      	at jdk.internal.misc.Unsafe.park(java.base@21.0.1/Native Method)
      	- parking to wait for  <0x00000000c79b2a68> (a org.jboss.threads.EnhancedQueueExecutor)
      	at java.util.concurrent.locks.LockSupport.park(java.base@21.0.1/LockSupport.java:221)
      	at org.jboss.threads.EnhancedQueueExecutor$PoolThreadNode.park(EnhancedQueueExecutor.java:2259)
      	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1553)
      	at java.lang.Thread.runWith(java.base@21.0.1/Thread.java:1596)
      	at java.lang.Thread.run(java.base@21.0.1/Thread.java:1583)
      
         Locked ownable synchronizers:
      	- None
      

            rh-ee-jbolina Jose Bolina