-
Bug
-
Resolution: Done
-
Major
-
None
-
None
The problem is that the get operation will eventually invoke the `ExpirationManagerImpl` which handles expiration in memory and returns a completable future. The `AbstractInternalDataContainer` does a join until the CF completes.
Basically, the get operation starts from the Netty thread until the expiration method. The expiration method delegates to the blocking manager to run on a blocking thread, and resumes execution on the non-blocking thread. But this causes a deadlock. The non-blocking thread is blocked on the join waiting for the CF to complete, and it can resume the CF completion.
From the log:
12:48:15,440 TRACE (non-blocking-thread-RespSingleNodeTest-NodeA-p7-t1:[]) [o.i.s.r.RespHandler] Received command: GET with arguments [[B0x6B312D7465737452656E616D65576974..[19]] for [id: 0x7bde59ac, L:/127.0.0.1:2518 - R:/127.0.0.1:42954] 12:48:15,440 TRACE (non-blocking-thread-RespSingleNodeTest-NodeA-p7-t1:[]) [o.i.u.c.BlockingManagerImpl] Submitting blocking supply operation local-expiration to blocking thread 12:48:15,440 TRACE (blocking-thread-RespSingleNodeTest-NodeA-p8-t2:[]) [o.i.u.c.BlockingManagerImpl] Running blocking supply operation local-expiration 12:48:30,447 ERROR (testng-RespSingleNodeTest:[]) [o.i.c.t.TestSuiteProgress] Test failed: org.infinispan.server.resp.RespSingleNodeTest.testRenameWithEx\[simpleCache\=true, cacheMode\=LOCAL\] io.lettuce.core.RedisCommandTimeoutException: Command timed out after 15 second(s) at io.lettuce.core.internal.ExceptionFactory.createTimeoutException(ExceptionFactory.java:59) ~[lettuce-core-6.3.2.RELEASE.jar:6.3.2.RELEASE/8941aea] at io.lettuce.core.internal.Futures.awaitOrCancel(Futures.java:246) ~[lettuce-core-6.3.2.RELEASE.jar:6.3.2.RELEASE/8941aea] at io.lettuce.core.FutureSyncInvocationHandler.handleInvocation(FutureSyncInvocationHandler.java:75) ~[lettuce-core-6.3.2.RELEASE.jar:6.3.2.RELEASE/8941aea] at io.lettuce.core.internal.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:80) ~[lettuce-core-6.3.2.RELEASE.jar:6.3.2.RELEASE/8941aea] at jdk.proxy2/jdk.proxy2.$Proxy42.get(Unknown Source) ~[?:?] at org.infinispan.server.resp.RespSingleNodeTest.testRenameWithEx(RespSingleNodeTest.java:1115) ~[test-classes/:?]
Notice that the resume messages are missing. In the thread dump, we have the non-blocking thread (non-blocking-thread-RespSingleNodeTest-NodeA-p7-t1) waiting on the expiration operation and the thread (blocking-thread-RespSingleNodeTest-NodeA-p8-t2) is not doing any work.
"non-blocking-thread-RespSingleNodeTest-NodeA-p7-t1" #48 [80728] daemon prio=5 os_prio=0 cpu=543.06ms elapsed=557.75s tid=0x00007f59b000c930 nid=80728 waiting on condition [0x00007f5ab81fd000] java.lang.Thread.State: WAITING (parking) at jdk.internal.misc.Unsafe.park(java.base@21.0.1/Native Method) - parking to wait for <0x00000000d08c8360> (a java.util.concurrent.CompletableFuture$Signaller) at java.util.concurrent.locks.LockSupport.park(java.base@21.0.1/LockSupport.java:221) at java.util.concurrent.CompletableFuture$Signaller.block(java.base@21.0.1/CompletableFuture.java:1864) at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base@21.0.1/ForkJoinPool.java:3780) at java.util.concurrent.ForkJoinPool.managedBlock(java.base@21.0.1/ForkJoinPool.java:3725) at java.util.concurrent.CompletableFuture.waitingGet(java.base@21.0.1/CompletableFuture.java:1898) at java.util.concurrent.CompletableFuture.join(java.base@21.0.1/CompletableFuture.java:2117) at org.infinispan.container.impl.AbstractInternalDataContainer.get(AbstractInternalDataContainer.java:94) at org.infinispan.container.impl.AbstractInternalDataContainer.get(AbstractInternalDataContainer.java:105) at org.infinispan.cache.impl.SimpleCacheImpl.get(SimpleCacheImpl.java:561) at org.infinispan.cache.impl.StatsCollectingCache.get(StatsCollectingCache.java:49) at org.infinispan.cache.impl.SimpleCacheImpl.getAsync(SimpleCacheImpl.java:1096) at org.infinispan.cache.impl.EncoderCache.getAsync(EncoderCache.java:1025) at org.infinispan.server.resp.commands.string.GET.perform(GET.java:35) at org.infinispan.server.resp.Resp3Handler.actualHandleRequest(Resp3Handler.java:88) at org.infinispan.server.resp.RespRequestHandler.handleRequest(RespRequestHandler.java:88) at org.infinispan.server.resp.RespHandler.handleCommandAndArguments(RespHandler.java:149) at org.infinispan.server.resp.RespHandler.channelRead(RespHandler.java:131) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93) at org.infinispan.server.core.transport.StatsChannelHandler.channelRead(StatsChannelHandler.java:28) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:801) at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:501) at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:399) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at java.lang.Thread.runWith(java.base@21.0.1/Thread.java:1596) at java.lang.Thread.run(java.base@21.0.1/Thread.java:1583) Locked ownable synchronizers: - None "blocking-thread-RespSingleNodeTest-NodeA-p8-t2" #50 [80729] daemon prio=5 os_prio=0 cpu=2.09ms elapsed=557.27s tid=0x00007f59cc01fa90 nid=80729 waiting on condition [0x00007f5ab0cfe000] java.lang.Thread.State: WAITING (parking) at jdk.internal.misc.Unsafe.park(java.base@21.0.1/Native Method) - parking to wait for <0x00000000c79b2a68> (a org.jboss.threads.EnhancedQueueExecutor) at java.util.concurrent.locks.LockSupport.park(java.base@21.0.1/LockSupport.java:221) at org.jboss.threads.EnhancedQueueExecutor$PoolThreadNode.park(EnhancedQueueExecutor.java:2259) at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1553) at java.lang.Thread.runWith(java.base@21.0.1/Thread.java:1596) at java.lang.Thread.run(java.base@21.0.1/Thread.java:1583) Locked ownable synchronizers: - None