- Bug
- Resolution: Done
- Major
- 7.3.0.GA.CR3
We observed these errors in a clustering fail-over test in which nodes are failed via JVM kill:
2020-01-30 08:55:58,452 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (async-thread--p26-t1) ISPN000136: Error executing command RemoveCommand on Cache 'clusterbench-ee8.ear.a.war', writing keys [SessionCreationMetaDataKey(7DW-VFPUGOVPBF1VymSNzkpZurUyiXUF4CXKk-L5)]: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds for key SessionCreationMetaDataKey(7DW-VFPUGOVPBF1VymSNzkpZurUyiXUF4CXKk-L5) and requestor GlobalTx:wildfly1:117. Lock is held by GlobalTx:wildfly2:70
    at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.get(DefaultLockManager.java:288)
    at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.get(DefaultLockManager.java:218)
    at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.InfinispanLock$LockPlaceHolder.checkState(InfinispanLock.java:436)
    at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.InfinispanLock$LockPlaceHolder.lambda$toInvocationStage$3(InfinispanLock.java:412)
    at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642)
    at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at org.wildfly.clustering.service@7.3.0.GA-redhat-00003//org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
    at java.base/java.lang.Thread.run(Thread.java:834)
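When going through the attached logs, it can help to pull the key, the waiting requestor and the lock holder out of each ISPN000299 message. The following is only a minimal sketch: the regular expression is derived solely from the message quoted above, and the names `LOCK_TIMEOUT`, `LockTimeout` and `parse_lock_timeout` are hypothetical helpers, not part of Infinispan or WildFly.

```python
import re
from typing import NamedTuple, Optional

# Assumption: the pattern mirrors only the ISPN000299 wording seen in this
# report (Infinispan 9.4.16); other versions may phrase the message differently.
LOCK_TIMEOUT = re.compile(
    r"ISPN000299: Unable to acquire lock after (?P<timeout>.+?) for key "
    r"(?P<key>.+?) and requestor (?P<requestor>\S+?)\. "
    r"Lock is held by (?P<holder>\S+)"
)


class LockTimeout(NamedTuple):
    timeout: str    # e.g. "15 seconds"
    key: str        # the cache key that could not be locked
    requestor: str  # transaction that waited for the lock
    holder: str     # transaction that held the lock


def parse_lock_timeout(line: str) -> Optional[LockTimeout]:
    """Return the lock-timeout details from a log line, or None if absent."""
    match = LOCK_TIMEOUT.search(line)
    if match is None:
        return None
    return LockTimeout(
        match["timeout"], match["key"], match["requestor"], match["holder"]
    )
```

Applied to the error above, this would report requestor `GlobalTx:wildfly1:117` and holder `GlobalTx:wildfly2:70`, i.e. a transaction on one EAP node waiting on a lock held by a transaction on the other.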
In these tests, a 2-node EAP cluster offloads session data to a 2-node JDG cluster. Here is the CLI script used to configure the EAP nodes:
embed-server --server-config=standalone-ha.xml
/subsystem=jgroups/channel=ee:write-attribute(name=stack,value=tcp)
/subsystem=transactions:write-attribute(name=node-identifier,value=wildfly2)
/socket-binding-group=standard-sockets/remote-destination-outbound-socket-binding=remote-jdg-server1:add(host=10.0.147.197, port=11222)
/socket-binding-group=standard-sockets/remote-destination-outbound-socket-binding=remote-jdg-server2:add(host=10.0.147.209, port=11222)
batch
/subsystem=infinispan/remote-cache-container=web-sessions:add(default-remote-cluster=jdg-server-cluster)
/subsystem=infinispan/remote-cache-container=web-sessions/remote-cluster=jdg-server-cluster:add(socket-bindings=[remote-jdg-server1,remote-jdg-server2])
run-batch
/subsystem=infinispan/cache-container=web/invalidation-cache=offload_ic:add()
/subsystem=infinispan/cache-container=web/invalidation-cache=offload_ic/store=hotrod:add(remote-cache-container=web-sessions, fetch-state=false, preload=false, passivation=false, purge=false, shared=false)
/subsystem=infinispan/cache-container=web/invalidation-cache=offload_ic/component=locking:add(isolation=REPEATABLE_READ)
/subsystem=infinispan/cache-container=web/invalidation-cache=offload_ic/component=transaction:add(mode=BATCH)
/subsystem=infinispan/cache-container=web:write-attribute(name=default-cache, value=offload_ic)
The overall failure rate increases from 0.6% to 0.7%, which is still under the 2% threshold at which the test fails.
What makes these errors worth noting is that they happen BEFORE the EAP nodes are failed. This did not happen in 7.2.
Complete logs are attached.
Test phases overview here.
- is blocked by: JBEAP-18841 Upgrade Infinispan to 9.4.18.Final (Closed)
- is caused by: WFLY-13168 Invalidation caches use wrong key affinity (Closed)
- is incorporated by: JBEAP-18409 [GSS](7.3.z) Upgrade Infinispan from 9.4.16.Final-redhat-00002 to 9.4.18.Final-redhat-00001 (Closed)