Timer thread potentially misses to unlock engine locks of PerProcessInstanceRuntimeManager.
https://github.com/droolsjbpm/jbpm/blob/6.5.0.Final/jbpm-flow/src/main/java/org/jbpm/process/core/timer/impl/GlobalTimerService.java#L227-L244
https://github.com/droolsjbpm/jbpm/blob/6.5.0.Final/jbpm-persistence-jpa/src/main/java/org/jbpm/persistence/timer/GlobalJpaTimerJobInstance.java#L65-L94
If an Exception is thrown from RuntimeEngineImpl.getKieSession() in a timer thread, it will not be handled by the "((DisposableCommandService) commandService).dispose()" in the finally block because commandService == null. Hence disposeRuntimeEngine() is not called so the engine lock of PerProcessInstanceRuntimeManager is not unlocked. It will block a future access to the same process instance from a different thread.
An example of "Exception is thrown from RuntimeEngineImpl.getKieSession()":
2017-02-16T16:17:50.358Z WARN 0 [pool-42-thread-1] com.arjuna.ats.arjuna: ARJUNA012125: TwoPhaseCoordinator.beforeCompletion - failed for SynchronizationImple< 0:ffffc0a8c805:1c43ef73:58a530e5:13b42d53, org.drools.persistence.jta.JtaTransactionSynchronizationAdapter@1a2e9b52 >: javax.persistence.OptimisticLockException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect): [org.drools.persistence.info.SessionInfo#633004] at org.hibernate.ejb.AbstractEntityManagerImpl.wrapStaleStateException(AbstractEntityManagerImpl.java:1416) at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1329) at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1310) at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1316) at org.hibernate.ejb.AbstractEntityManagerImpl.flush(AbstractEntityManagerImpl.java:999) at org.drools.persistence.TriggerUpdateTransactionSynchronization.beforeCompletion(TriggerUpdateTransactionSynchronization.java:72) at org.drools.persistence.jta.JtaTransactionSynchronizationAdapter.beforeCompletion(JtaTransactionSynchronizationAdapter.java:54) at com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.beforeCompletion(SynchronizationImple.java:76) at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.beforeCompletion(TwoPhaseCoordinator.java:273) at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:93) at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:162) at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1211) at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:126) at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75) at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:173) at org.drools.persistence.jta.JtaTransactionManager.commit(JtaTransactionManager.java:236) at org.drools.persistence.SingleSessionCommandService.<init>(SingleSessionCommandService.java:196) at sun.reflect.GeneratedConstructorAccessor230.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.drools.persistence.jpa.KnowledgeStoreServiceImpl.buildCommandService(KnowledgeStoreServiceImpl.java:143) at org.drools.persistence.jpa.KnowledgeStoreServiceImpl.loadKieSession(KnowledgeStoreServiceImpl.java:111) at org.drools.persistence.jpa.KnowledgeStoreServiceImpl.loadKieSession(KnowledgeStoreServiceImpl.java:39) at org.kie.internal.persistence.jpa.JPAKnowledgeService.loadStatefulKnowledgeSession(JPAKnowledgeService.java:144) at org.jbpm.runtime.manager.impl.factory.JPASessionFactory.findKieSessionById(JPASessionFactory.java:53) at org.jbpm.runtime.manager.impl.PerProcessInstanceRuntimeManager$PerProcessInstanceInitializer.initKieSession(PerProcessInstanceRuntimeManager.java:553) at org.jbpm.runtime.manager.impl.RuntimeEngineImpl.getKieSession(RuntimeEngineImpl.java:70) at org.jbpm.process.core.timer.impl.GlobalTimerService.getCommandService(GlobalTimerService.java:233) at org.jbpm.process.core.timer.impl.GlobalTimerService.getCommandService(GlobalTimerService.java:208) at org.jbpm.persistence.timer.GlobalJpaTimerJobInstance.call(GlobalJpaTimerJobInstance.java:75) at org.jbpm.persistence.timer.GlobalJpaTimerJobInstance.call(GlobalJpaTimerJobInstance.java:47) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect): [org.drools.persistence.info.SessionInfo#633004] at org.hibernate.persister.entity.AbstractEntityPersister.check(AbstractEntityPersister.java:2521) at org.hibernate.persister.entity.AbstractEntityPersister.update(AbstractEntityPersister.java:3240) at org.hibernate.persister.entity.AbstractEntityPersister.updateOrInsert(AbstractEntityPersister.java:3138) at org.hibernate.persister.entity.AbstractEntityPersister.update(AbstractEntityPersister.java:3468) at org.hibernate.action.internal.EntityUpdateAction.execute(EntityUpdateAction.java:140) at org.hibernate.engine.spi.ActionQueue.execute(ActionQueue.java:395) at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:387) at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:304) at org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:349) at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:56) at org.hibernate.internal.SessionImpl.flush(SessionImpl.java:1195) at org.hibernate.ejb.AbstractEntityManagerImpl.flush(AbstractEntityManagerImpl.java:996) ... 32 more
============
Attached a reproducer (01791456_lock_leak_in_timer.zip). Reproduced with 6.5.0.Final-redhat-2 and 7.0.0-SNAPSHOT.
It intentionally throws a RuntimeException in RuntimeEngineImpl.getKieSession() in a timer thread. If you run "mvn test", you will see output like this:
2017-02-20 12:59:29,148 INFO [main] [org.drools.persistence.TransactionManagerFactory] Using org.drools.persistence.jta.JtaTransactionManagerFactory@2274160 Installed rule using default helper : throwExOnInitializer Enter User Task A process instance started : pid = 1 ### Throwing Exception on : pool-2-thread-1 java.lang.RuntimeException: ERROR! at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:422) at org.jboss.byteman.rule.expression.ThrowExpression.interpret(ThrowExpression.java:230) at org.jboss.byteman.rule.Action.interpret(Action.java:144) at org.jboss.byteman.rule.helper.InterpretedHelper.fire(InterpretedHelper.java:171) at org.jboss.byteman.rule.helper.InterpretedHelper.execute0(InterpretedHelper.java:139) at org.jboss.byteman.rule.helper.InterpretedHelper.execute(InterpretedHelper.java:101) at org.jboss.byteman.rule.Rule.execute(Rule.java:684) at org.jboss.byteman.rule.Rule.execute(Rule.java:653) at org.jbpm.runtime.manager.impl.PerProcessInstanceRuntimeManager$PerProcessInstanceInitializer.initKieSession(PerProcessInstanceRuntimeManager.java) at org.jbpm.runtime.manager.impl.RuntimeEngineImpl.getKieSession(RuntimeEngineImpl.java:70) at org.jbpm.process.core.timer.impl.GlobalTimerService.getCommandService(GlobalTimerService.java:233) at org.jbpm.process.core.timer.impl.GlobalTimerService.getCommandService(GlobalTimerService.java:208) at org.jbpm.persistence.timer.GlobalJpaTimerJobInstance.call(GlobalJpaTimerJobInstance.java:75) at org.jbpm.persistence.timer.GlobalJpaTimerJobInstance.call(GlobalJpaTimerJobInstance.java:47) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) sleep done
Then, it gets hung. If you capture a thread dump by jstack, you will see that main thread is blocked at PerProcessInstanceRuntimeManager.createLockOnGetEngine().
"main" #1 prio=5 os_prio=0 tid=0x00007fa328009000 nid=0x76b3 waiting on condition [0x00007fa32e3d7000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000000075bd62fb0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at org.jbpm.runtime.manager.impl.PerProcessInstanceRuntimeManager.createLockOnGetEngine(PerProcessInstanceRuntimeManager.java:593) at org.jbpm.runtime.manager.impl.PerProcessInstanceRuntimeManager.getRuntimeEngine(PerProcessInstanceRuntimeManager.java:155) at com.sample.ProcessJPATest.testProcess(ProcessJPATest.java:106)
NOTE: This reproducer will throw the RuntimeException even if the lock issue is fixed. I just wrote this to confirm the "block" at createLockOnGetEngine(). There may be room to improve in the byteman rule...
- is cloned by
-
RHBPMS-4632 [GSS] (6.4.z) Engine lock leak of PerProcessInstanceRuntimeManager in Timer thread
- Verified