Uploaded image for project: 'JBoss Enterprise Application Platform'
  1. JBoss Enterprise Application Platform
  2. JBEAP-19007

Clustering; Occasional org.infinispan.util.concurrent.TimeoutException during various failover scenarios

    XMLWordPrintable

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • None
    • 7.3.0.GA.CR4
    • Clustering
    • None
    • Hide
      After re-run:

      In [eap-7.x-clustering-http-session-encrypt-asym/68|https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/eap-7.x-clustering-http-session-encrypt-asym/68] all org.infinispan.util.concurrent.TimeoutException follow the failure of some node, then it's explained:

      * 2020-04-06 07:56:52,325 node1 start
      * 2020-04-06 07:58:04,440 node2 start
      * 2020-04-06 07:58:48,213 node3 start
      * 2020-04-06 07:59:33,289 node4 start
      * 2020-04-06 08:02:34,035 node1 kill
      * 2020-04-06 08:02:45,487 node3 org.infinispan.util.concurrent.TimeoutException
      * 2020-04-06 08:02:45,919 node4 org.infinispan.util.concurrent.TimeoutException
      * 2020-04-06 08:02:46,501 node2 org.infinispan.util.concurrent.TimeoutException
      * 2020-04-06 08:04:09,641 node1 start
      * 2020-04-06 08:06:01,005 node2 kill
      * 2020-04-06 08:06:11,879 node3 org.infinispan.util.concurrent.TimeoutException
      * 2020-04-06 08:06:12,142 node4 org.infinispan.util.concurrent.TimeoutException
      * 2020-04-06 08:06:12,850 node1 org.infinispan.util.concurrent.TimeoutException
      * 2020-04-06 08:07:35,907 node2 start
      * 2020-04-06 08:09:26,107 node3 kill
      * 2020-04-06 08:11:00,938 node3 start
      * 2020-04-06 08:11:12,288 node4 org.infinispan.util.concurrent.TimeoutException
      * 2020-04-06 08:12:52,102 node4 kill
      * 2020-04-06 08:12:54,554 node3 org.infinispan.util.concurrent.TimeoutException
      * 2020-04-06 08:14:27,013 node4 start
      * 2020-04-06 08:17:20,230 node1 stop
      * 2020-04-06 08:17:21,202 node2 org.infinispan.util.concurrent.TimeoutException
      * 2020-04-06 08:17:37,590 node2 stop
      * 2020-04-06 08:17:53,703 node3 stop
      * 2020-04-06 08:18:10,847 node4 stop


      No org.infinispan.util.concurrent.TimeoutException in [eap-7.x-clustering-ejb-ejbservlet-shutdown-dist#65|https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/eap-7.x-clustering-ejb-ejbservlet-shutdown-dist/67]

      JBEAP-18343 open for [eap-7.x-clustering-db-session-shutdown-repl-db2-11.1|https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/eap-7.x-clustering-db-session-shutdown-repl-db2-11.1]

      hence, the current issue is resolved

      Show
      After re-run: In [eap-7.x-clustering-http-session-encrypt-asym/68| https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/eap-7.x-clustering-http-session-encrypt-asym/68 ] all org.infinispan.util.concurrent.TimeoutException follow the failure of some node, then it's explained: * 2020-04-06 07:56:52,325 node1 start * 2020-04-06 07:58:04,440 node2 start * 2020-04-06 07:58:48,213 node3 start * 2020-04-06 07:59:33,289 node4 start * 2020-04-06 08:02:34,035 node1 kill * 2020-04-06 08:02:45,487 node3 org.infinispan.util.concurrent.TimeoutException * 2020-04-06 08:02:45,919 node4 org.infinispan.util.concurrent.TimeoutException * 2020-04-06 08:02:46,501 node2 org.infinispan.util.concurrent.TimeoutException * 2020-04-06 08:04:09,641 node1 start * 2020-04-06 08:06:01,005 node2 kill * 2020-04-06 08:06:11,879 node3 org.infinispan.util.concurrent.TimeoutException * 2020-04-06 08:06:12,142 node4 org.infinispan.util.concurrent.TimeoutException * 2020-04-06 08:06:12,850 node1 org.infinispan.util.concurrent.TimeoutException * 2020-04-06 08:07:35,907 node2 start * 2020-04-06 08:09:26,107 node3 kill * 2020-04-06 08:11:00,938 node3 start * 2020-04-06 08:11:12,288 node4 org.infinispan.util.concurrent.TimeoutException * 2020-04-06 08:12:52,102 node4 kill * 2020-04-06 08:12:54,554 node3 org.infinispan.util.concurrent.TimeoutException * 2020-04-06 08:14:27,013 node4 start * 2020-04-06 08:17:20,230 node1 stop * 2020-04-06 08:17:21,202 node2 org.infinispan.util.concurrent.TimeoutException * 2020-04-06 08:17:37,590 node2 stop * 2020-04-06 08:17:53,703 node3 stop * 2020-04-06 08:18:10,847 node4 stop No org.infinispan.util.concurrent.TimeoutException in [eap-7.x-clustering-ejb-ejbservlet-shutdown-dist#65| https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/eap-7.x-clustering-ejb-ejbservlet-shutdown-dist/67 ] JBEAP-18343 open for [eap-7.x-clustering-db-session-shutdown-repl-db2-11.1| https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/eap-7.x-clustering-db-session-shutdown-repl-db2-11.1 ] hence, the current issue is resolved

    Description

      We observed occasional org.infinispan.util.concurrent.TimeoutException in various scenarios: we report here some relevant runs;

      The error is:

      2020-03-17 19:26:59,746 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (remote-thread--p7-t15) ISPN000136: Error executing command LockControlCommand on Cache 'clusterbench-ee8.ear.clusterbench-ee8-web.war', writing keys []: org.infinispan.util.concurrent.TimeoutException: Could not acquire lock on SessionCreationMetaDataKey(wVyuFm8u_IL8zHnKm4Ao2LPvtB2q4clJmzlo8saQ) in behalf of transaction GlobalTx:wildfly4:261970. Current owner GlobalTx:wildfly1:92301.
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.DefaultPendingLockManager.timeout(DefaultPendingLockManager.java:252)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.DefaultPendingLockManager.awaitOn(DefaultPendingLockManager.java:337)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.util.concurrent.locks.impl.DefaultPendingLockManager.awaitPendingTransactionsForAllKeys(DefaultPendingLockManager.java:137)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.checkPendingAndLockAllKeys(AbstractTxLockingInterceptor.java:153)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAllOrRegisterBackupLock(AbstractTxLockingInterceptor.java:122)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.locking.PessimisticLockingInterceptor.localLockCommandWork(PessimisticLockingInterceptor.java:208)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.locking.PessimisticLockingInterceptor.lambda$new$0(PessimisticLockingInterceptor.java:46)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.BaseAsyncInterceptor.invokeNextThenApply(BaseAsyncInterceptor.java:81)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.locking.PessimisticLockingInterceptor.visitLockControlCommand(PessimisticLockingInterceptor.java:191)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:117)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.BaseAsyncInterceptor.invokeNextAndHandle(BaseAsyncInterceptor.java:183)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.impl.TxInterceptor.visitLockControlCommand(TxInterceptor.java:223)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:117)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.BaseAsyncInterceptor.invokeNext(BaseAsyncInterceptor.java:54)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.BaseAsyncInterceptor.lambda$new$0(BaseAsyncInterceptor.java:22)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.InvocationSuccessFunction.apply(InvocationSuccessFunction.java:25)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.impl.SimpleAsyncInvocationStage.addCallback(SimpleAsyncInvocationStage.java:70)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.InvocationStage.thenApply(InvocationStage.java:45)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.BaseAsyncInterceptor.asyncInvokeNext(BaseAsyncInterceptor.java:224)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.statetransfer.TransactionSynchronizerInterceptor.visitCommand(TransactionSynchronizerInterceptor.java:46)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.BaseAsyncInterceptor.invokeNextAndHandle(BaseAsyncInterceptor.java:185)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.statetransfer.StateTransferInterceptor.visitLockControlCommand(StateTransferInterceptor.java:90)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:117)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.BaseAsyncInterceptor.invokeNextAndExceptionally(BaseAsyncInterceptor.java:123)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.impl.InvocationContextInterceptor.visitCommand(InvocationContextInterceptor.java:90)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.BaseAsyncInterceptor.invokeNext(BaseAsyncInterceptor.java:56)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.impl.BatchingInterceptor.handleDefault(BatchingInterceptor.java:61)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.DDAsyncInterceptor.visitLockControlCommand(DDAsyncInterceptor.java:160)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:117)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.BaseAsyncInterceptor.invokeNext(BaseAsyncInterceptor.java:54)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.DDAsyncInterceptor.handleDefault(DDAsyncInterceptor.java:54)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.DDAsyncInterceptor.visitLockControlCommand(DDAsyncInterceptor.java:160)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:117)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.DDAsyncInterceptor.visitCommand(DDAsyncInterceptor.java:50)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invokeAsync(AsyncInterceptorChainImpl.java:234)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.commands.control.LockControlCommand.invokeAsync(LockControlCommand.java:126)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.remoting.inboundhandler.BasePerCacheInboundInvocationHandler.invokeCommand(BasePerCacheInboundInvocationHandler.java:117)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.invoke(BaseBlockingRunnable.java:99)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.runAsync(BaseBlockingRunnable.java:71)
      	at org.infinispan@9.4.16.Final-redhat-00002//org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.run(BaseBlockingRunnable.java:40)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at org.wildfly.clustering.service@7.3.0.GA-redhat-00004//org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
      	at java.base/java.lang.Thread.run(Thread.java:834)
      

      1. Cluster communication secured with Asymmetric Encryption: e.g. in run eap-7.x-clustering-http-session-encrypt-asym#61

      EAP nodes configured as follows:

      	embed-server --server-config=standalone-ha.xml
      	/subsystem=infinispan/cache-container=web/distributed-cache=testDist:add()
      	/subsystem=infinispan/cache-container=web/distributed-cache=testDist/component=locking:write-attribute(name=isolation, value=REPEATABLE_READ)
      	/subsystem=infinispan/cache-container=web/distributed-cache=testDist/component=transaction:write-attribute(name=mode, value=BATCH)
      	/subsystem=infinispan/cache-container=web:write-attribute(name=default-cache, value=testDist)
      	# ASYM_ENCRYPT: Configured with algorithms and key sizes: secret key is generated and distributed by coordinator
      	batch
      	/subsystem=jgroups/stack=udp/protocol=AUTH:add(add-index=8)
      	/subsystem=jgroups/stack=udp/protocol=AUTH/token=digest:add(algorithm=SHA-512, shared-secret-reference={clear-text=123PIPPOBAUDO})
      	/subsystem=jgroups/stack=tcp/protocol=AUTH:add(add-index=8)
      	/subsystem=jgroups/stack=tcp/protocol=AUTH/token=digest:add(algorithm=SHA-512, shared-secret-reference={clear-text=123PIPPOBAUDO})
      	run-batch
      	/subsystem=jgroups/stack=udp/protocol=ASYM_ENCRYPT:add(add-index=5,properties={sym_keylength=128,sym_algorithm=AES/ECB/PKCS5Padding,asym_keylength=512,asym_algorithm=RSA})
      	# we had an error with UDP on PSI
      	/subsystem=jgroups/channel=ee:write-attribute(name=stack,value=tcp)
      	

      Events:

      • 2020-03-20 09:57:16,680: node1 EAP killed
      • 2020-03-20 09:58:51,506: node1 EAP re-started
      • 2020-03-20 10:00:44,089: node2 EAP killed
      • 2020-03-20 10:02:18,615: node2 EAP re-started
      • 2020-03-20 10:04:11,056: node3 EAP killed
      • 2020-03-20 10:05:45,574: node3 EAP re-started
      • 2020-03-20 10:07:37,661: node4 EAP killed
      • 2020-03-20 10:07:40,114: node3 org.infinispan.util.concurrent.TimeoutException: Could not acquire lock on SessionCreationMetaDataKey(vL7kRsSBows27G4BTdSh8FFByp6FhDEl7pbrxect) in behalf of transaction GlobalTx:wildfly2:20334. Current owner GlobalTx:wildfly1:43920
      • 2020-03-20 10:09:12,170: node4 EAP re-started

      Complete logs:

      2. HTTP Session data cache backed by a DB2 Relational Database: e.g. in run eap-7.x-clustering-db-session-shutdown-repl-db2-11.1#54

      EAP nodes configured as follows:

      	embed-server --server-config=standalone-ha.xml
      	/subsystem=jgroups/channel=ee:write-attribute(name=stack,value=tcp)
      	/subsystem=infinispan/cache-container=web/invalidation-cache=offload:add()
      	/subsystem=infinispan/cache-container=web/invalidation-cache==offload/component=transaction:write-attribute(name=mode, value=BATCH)
      	/subsystem=infinispan/cache-container=web/invalidation-cache=offload/component=locking:write-attribute(name=isolation, value=REPEATABLE_READ)
      	data-source add --name=testDS --jndi-name=java:jboss/datasources/testDS --driver-name=db2-connector.jar --connection-url=jdbc:db2://db2-111.hosts.mwqe.eng.bos.redhat.com:50000/dballo --enabled=true --jta=true --use-java-context=true --transaction-isolation=TRANSACTION_READ_COMMITTED --min-pool-size=1 --max-pool-size=5 --pool-prefill=true --user-name=dballo00 --password=dballo00 --prepared-statements-cache-size=32 --share-prepared-statements=true
      	/subsystem=infinispan/cache-container=web/invalidation-cache=offload/store=jdbc:add(data-source=testDS,fetch-state=false,passivation=false,purge=false,shared=true,dialect=DB2){allow-resource-service-restart=true}
      	/subsystem=infinispan/cache-container=web/invalidation-cache=offload/store=jdbc/table=string:write-attribute(name=prefix,value=s)
      	/subsystem=infinispan/cache-container=web/invalidation-cache=offload/store=jdbc/table=string:write-attribute(name=id-column.name,value=id)
      	/subsystem=infinispan/cache-container=web/invalidation-cache=offload/store=jdbc/table=string:write-attribute(name=id-column.type,value=VARCHAR(255))
      	/subsystem=infinispan/cache-container=web/invalidation-cache=offload/store=jdbc/table=string:write-attribute(name=data-column.name,value=datum)
      	/subsystem=infinispan/cache-container=web/invalidation-cache=offload/store=jdbc/table=string:write-attribute(name=data-column.type,value=BLOB)
      	/subsystem=infinispan/cache-container=web/invalidation-cache=offload/store=jdbc/table=string:write-attribute(name=timestamp-column.name,value=version)
      	/subsystem=infinispan/cache-container=web/invalidation-cache=offload/store=jdbc/table=string:write-attribute(name=timestamp-column.type,value=BIGINT)
      	/subsystem=infinispan/cache-container=web:write-attribute(name=default-cache, value=offload)
      	/subsystem=transactions:write-attribute(name=node-identifier,value=wildfly1)
      	

      Events:

      • 2020-03-18 01:20:56,682: node1 EAP shut-down
      • 2020-03-18 01:22:45,533: node1 EAP re-started
      • 2020-03-18 01:24:37,725: node2 EAP shut-down
      • 2020-03-18 01:26:26,344: node2 EAP re-started
      • 2020-03-18 01:27:46,235: node1 org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 0 milliseconds for key SessionCreationMetaDataKey(hY6oPytNd1WOD5gL_CGAD4o5dvUrgqYVvv0usGSV) and requestor GlobalTx:wildfly3:3878. Lock is held by GlobalTx:wildfly4:3680
      • 2020-03-18 01:28:20,306: node3 EAP shut-down
      • 2020-03-18 01:30:09,061: node3 EAP re-started
      • 2020-03-18 01:31:58,511: node4 EAP shut-down
      • 2020-03-18 01:33:47,094: node4 EAP re-started
      • 2020-03-18 01:35:11,735: node4 org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 0 milliseconds for key SessionCreationMetaDataKey(16KjFub5uG5W8_mEgH9bfTYnfSd3Kp5Zi3FBKqpX) and requestor GlobalTx:wildfly1:5242. Lock is held by GlobalTx:wildfly2:3495

      Complete logs:

      3. Both EJB and Servlet session data stored in distributed caches: e.g. eap-7.x-clustering-ejb-ejbservlet-shutdown-dist#59

      EAP nodes configured as follows:

      	embed-server --server-config=standalone-ha.xml
      	/subsystem=jgroups/stack=udp/transport=UDP/property=ip_ttl:add(value=32)
      	/subsystem=jgroups:write-attribute(name=default-stack,value=tcp)
      	/subsystem=jgroups/channel=ee:write-attribute(name=stack,value=tcp)
      	/subsystem=infinispan/cache-container=web/distributed-cache=testDistWeb:add()
      	/subsystem=infinispan/cache-container=web/distributed-cache=testDistWeb/component=locking:write-attribute(name=isolation, value=REPEATABLE_READ)
      	/subsystem=infinispan/cache-container=web/distributed-cache=testDistWeb/component=transaction:write-attribute(name=mode, value=BATCH)
      	/subsystem=infinispan/cache-container=web:write-attribute(name=default-cache, value=testDistWeb)
      	/subsystem=infinispan/cache-container=ejb/distributed-cache=testDistEjb:add()
      	/subsystem=infinispan/cache-container=ejb/distributed-cache=testDistEjb/component=locking:write-attribute(name=isolation, value=REPEATABLE_READ)
      	/subsystem=infinispan/cache-container=ejb/distributed-cache=testDistEjb/component=transaction:write-attribute(name=mode, value=BATCH)
      	/subsystem=infinispan/cache-container=ejb:write-attribute(name=default-cache, value=testDistEjb)
      	

      Events:

      • 2020-03-17 19:21:20,538: node1 EAP shut-down
      • 2020-03-17 19:23:09,309: node1 EAP re-started
      • 2020-03-17 19:25:02,173: node2 EAP shut-down
      • 2020-03-17 19:26:50,843: node2 EAP re-started
      • 2020-03-17 19:26:59,746: node2 org.infinispan.util.concurrent.TimeoutException: Could not acquire lock on SessionCreationMetaDataKey(wVyuFm8u_IL8zHnKm4Ao2LPvtB2q4clJmzlo8saQ) in behalf of transaction GlobalTx:wildfly4:261970. Current owner GlobalTx:wildfly1:92301.
      • 2020-03-17 19:28:44,506: node3 EAP shut-down
      • 2020-03-17 19:30:33,284: node3 EAP re-started
      • 2020-03-17 19:32:24,706: node4 EAP shut-down
      • 2020-03-17 19:34:13,251: node4 EAP re-started
      • 2020-03-18 01:35:11,735: node3 org.infinispan.util.concurrent.TimeoutException: Could not acquire lock on SessionCreationMetaDataKey(wVyuFm8u_IL8zHnKm4Ao2LPvtB2q4clJmzlo8saQ) in behalf of transaction GlobalTx:wildfly4:261970. Current owner GlobalTx:wildfly1:92301.

      Complete logs:

      Attachments

        Activity

          People

            pferraro@redhat.com Paul Ferraro
            tborgato@redhat.com Tommaso Borgato
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: