JBoss Enterprise Application Platform / JBEAP-18594

Clustering - org.infinispan.util.concurrent.TimeoutException in JDG Stress tests


      We observed the following errors in the Clustering stress tests, where a 3-node EAP cluster offloads session data to a 2-node JDG cluster:

      2020-01-30 11:55:03,427 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (async-thread--p26-t1) ISPN000136: Error executing command RemoveCommand on Cache 'clusterbench-ee8.ear.a.war', writing keys [SessionCreationMetaDataKey(rWHFsds9Wvx7HGu_gmYd3oiyjukE0YM8HZZaAd0T)]: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds for key SessionCreationMetaDataKey(rWHFsds9Wvx7HGu_gmYd3oiyjukE0YM8HZZaAd0T) and requestor GlobalTx:wildfly1:112. Lock is held by GlobalTx:wildfly3:63
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.get(DefaultLockManager.java:288)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.get(DefaultLockManager.java:218)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.InfinispanLock$LockPlaceHolder.checkState(InfinispanLock.java:436)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.InfinispanLock$LockPlaceHolder.lambda$toInvocationStage$3(InfinispanLock.java:412)
      	at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642)
      	at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at org.wildfly.clustering.service@7.3.0.GA-redhat-00003//org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
      	at java.base/java.lang.Thread.run(Thread.java:834)
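
      To see which nodes contend for the same session keys, the ISPN000299 messages in the attached server logs can be tallied by requestor and lock holder. The following is only a minimal sketch, assuming every occurrence follows the message format shown above; the helper names (`parse_lock_timeouts`, `node_of`, `contention_by_node_pair`) are hypothetical and not part of EAP, Infinispan, or the test suite.

```python
import re
import sys
from collections import Counter

# Assumption: every ISPN000299 message has the same shape as the log line above.
LOCK_TIMEOUT = re.compile(
    r"ISPN000299: Unable to acquire lock after (?P<timeout>.+?) "
    r"for key (?P<key>.+?) and requestor (?P<requestor>\S+)\. "
    r"Lock is held by (?P<holder>\S+)"
)


def parse_lock_timeouts(lines):
    """Yield one dict (timeout, key, requestor, holder) per ISPN000299 line."""
    for line in lines:
        match = LOCK_TIMEOUT.search(line)
        if match:
            yield match.groupdict()


def node_of(tx):
    """Extract the node name from an id such as 'GlobalTx:wildfly1:112'."""
    parts = tx.split(":")
    return parts[1] if len(parts) >= 3 else tx


def contention_by_node_pair(lines):
    """Count lock timeouts per (requestor node, holder node) pair."""
    return Counter(
        (node_of(event["requestor"]), node_of(event["holder"]))
        for event in parse_lock_timeouts(lines)
    )


if __name__ == "__main__":
    # Usage: pass one or more server.log paths on the command line.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as log:
            for pair, count in contention_by_node_pair(log).most_common():
                print(path, pair, count)
```

      Run against the three wildfly-service server logs, this would show whether the timeouts concentrate on particular node pairs or are spread evenly across the cluster.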
      

      The EAP nodes are configured as follows:

      embed-server --server-config=standalone-ha.xml
      /subsystem=jgroups/channel=ee:write-attribute(name=stack,value=udp)
      /subsystem=transactions:write-attribute(name=node-identifier,value=wildfly2)
      /socket-binding-group=standard-sockets/remote-destination-outbound-socket-binding=remote-jdg-server1:add(host=10.16.176.58, port=11222)
      /socket-binding-group=standard-sockets/remote-destination-outbound-socket-binding=remote-jdg-server2:add(host=10.16.176.56, port=11222)
      batch
      /subsystem=infinispan/remote-cache-container=web-sessions:add(default-remote-cluster=jdg-server-cluster)
      /subsystem=infinispan/remote-cache-container=web-sessions/remote-cluster=jdg-server-cluster:add(socket-bindings=[remote-jdg-server1,remote-jdg-server2])
      run-batch
      /subsystem=infinispan/cache-container=web/invalidation-cache=offload_ic:add()
      /subsystem=infinispan/cache-container=web/invalidation-cache=offload_ic/store=hotrod:add(remote-cache-container=web-sessions, fetch-state=false, preload=false, passivation=false, purge=false, shared=false)
      /subsystem=infinispan/cache-container=web/invalidation-cache=offload_ic/component=locking:add(isolation=REPEATABLE_READ)
      /subsystem=infinispan/cache-container=web/invalidation-cache=offload_ic/component=transaction:add(mode=BATCH)
      /subsystem=infinispan/cache-container=web:write-attribute(name=default-cache, value=offload_ic)
      

      The problem is that these exceptions occur even though the stress tests do not fail any nodes.
      They make the performance results useless, since they could mask an actual performance issue.

      EAP and JDG log files are attached.
      Complete test run here.

        1. clusterbench-ee8.ear (65 kB, Tommaso Borgato)
        2. wlf_20204230-114229-jdg-service-1-clustered.xml (26 kB, Tommaso Borgato)
        3. wlf_20204230-114229-jdg-service-1-server.log (578 kB, Tommaso Borgato)
        4. wlf_20204230-114229-jdg-service-2-server.log (582 kB, Tommaso Borgato)
        5. wlf_20204230-114229-wildfly-service-1-server.log (29.76 MB, Tommaso Borgato)
        6. wlf_20204230-114229-wildfly-service-1-standalone-ha.xml (34 kB, Tommaso Borgato)
        7. wlf_20204230-114229-wildfly-service-2-server.log (13.50 MB, Tommaso Borgato)
        8. wlf_20204230-114229-wildfly-service-3-server.log (14.30 MB, Tommaso Borgato)

              pferraro@redhat.com Paul Ferraro
              tborgato@redhat.com Tommaso Borgato