-
Bug
-
Resolution: Done
-
Critical
-
8.2.4.Final, 9.0.0.Final
This causes random failures in ClusterTopologyManagerTest.testAbruptLeaveAfterGetStatus2. Node D leaves and nodes E and F start a rebalance. Then node F also leaves, but DISCARD is enabled first, so the CacheTopologyControlCommand(LEAVE) synchronous RPC blocks until node F installs a view by itself.
14:05:17,187 INFO (Incoming-1,Test-NodeF-12732:[]) [JGroupsTransport] ISPN000094: Received new cluster view for channel ISPN: [Test-NodeE-18953|3] (2) [Test-NodeE-18953, Test-NodeF-12732] 14:05:17,261 TRACE (testng-Test:[]) [JGroupsTransport] dests=[Test-NodeE-18953], command=CacheTopologyControlCommand{cache=testCache, type=LEAVE, sender=Test-NodeF-12732, joinInfo=null, topologyId=0, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, actualMembers=null, throwable=null, viewId=3}, mode=SYNCHRONOUS, timeout=240000 14:05:22,546 DEBUG (Thread-88606:[]) [GMS] Test-NodeF-12732: installing view [Test-NodeF-12732|4] (1) [Test-NodeF-12732] 14:05:22,589 DEBUG (testng-Test:[]) [LocalTopologyManagerImpl] Error sending the leave request for cache testCache to coordinator
Node F then cancels its inbound state transfers by sending a synchronous StateRequestCommand(CANCEL_STATE_TRANSFER) RPC to node F. This RPC sometimes hits JGRP-2103 and it only times out after 5 minutes, causing the test to fail.
14:05:22,608 TRACE (testng-Test:[]) [JGroupsTransport] dests=[Test-NodeE-18953], command=StateRequestCommand{cache=testCache, origin=Test-NodeF-12732, type=CANCEL_STATE_TRANSFER, topologyId=7, segments=[130, 3, ...]}, mode=SYNCHRONOUS_IGNORE_LEAVERS, timeout=240000 14:05:22,608 TRACE (testng-Test:[]) [RequestCorrelator] Test-NodeF-12732: invoking multicast RPC [req-id=134161] 14:05:22,609 DEBUG (Thread-88606:[]) [JGroupsTransport] New view accepted: [Test-NodeF-12732|4] (1) [Test-NodeF-12732] 14:09:22,608 TRACE (timeout-thread-Test-NodeF-p48767-t1:[]) [JGroupsTransport] Responses: Responses{ Test-NodeE-18953: sender=Test-NodeE-18953, received=false, suspected=true} 14:09:22,608 TRACE (timeout-thread-Test-NodeF-p48767-t1:[]) [RpcManagerImpl] Replication exception org.infinispan.util.concurrent.TimeoutException: Replication timeout at org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$2(JGroupsTransport.java:659) ~[classes/:?] at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) ~[?:1.8.0_101] at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) ~[?:1.8.0_101] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) [?:1.8.0_101] at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) [?:1.8.0_101] at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:47) [classes/:?] at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:15) [classes/:?] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_101] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_101] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_101] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101] Suppressed: org.infinispan.remoting.RpcException: Test-NodeE-18953 was suspected at org.infinispan.remoting.transport.jgroups.JGroupsTransport.addSuppressedExceptions(JGroupsTransport.java:707) ~[classes/:?] at org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$2(JGroupsTransport.java:659) ~[classes/:?] at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) ~[?:1.8.0_101] at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) ~[?:1.8.0_101] at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) [?:1.8.0_101] at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) [?:1.8.0_101] at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:47) [classes/:?] at org.infinispan.remoting.transport.jgroups.RspListFuture.call(RspListFuture.java:15) [classes/:?] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_101] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_101] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_101] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101] 14:09:22,609 DEBUG (testng-Test:[]) [InboundTransferTask] Caught an exception while cancelling state transfer for segments [130, 3, ...] from Test-NodeE-18953
- causes
-
JBEAP-3781 WARN ISPN000112: exception while committing: javax.transaction.xa.XAException due to replication timeout
- Verified
-
JBEAP-3946 TimeoutException: Unable to acquire lock after 15 seconds in repl-async stress tests
- Verified
-
JBEAP-3779 Error executing command PrepareCommand/GetKeyValueCommand due to replication timeout
- Verified
-
JBEAP-3780 Error while processing a prepare in a single-phase transaction due to replication timeout
- Verified
-
JBEAP-3782 WARN HeuristicMixedException due to replication timeout
- Verified
-
JBEAP-6202 Undertow request failed HttpServerExchange due to replication timeout
- Closed
- is incorporated by
-
JBEAP-11886 [GSS](7.0.z) TimeoutException: Unable to acquire lock after 15 seconds in repl-async stress tests
- Closed
-
ISPN-7101 Backports for 8.2.5.Final
- Closed
- is related to
-
JGRP-2103 GroupRequest times out when the only recipient leaves too soon
- Resolved
-
ISPN-6970 Upgrade to JGroups 4.0.0.Beta1
- Closed