If 2 Triggers for the same Job (e.g. StartTimerEvent with timeCycle) are executed with short interval, the Job/Trigger could be deleted in Quartz tables so StartTimerEvent is no longer triggered.
Try attached reproducer 01793010_quartz_concurrent_mysql_BPMS640.zip
- Edit StartTimerTest.java and quartz-definition.properties for your database settings
- Create Quartz tables. This test makes sure to truncate the tables on test.
- mvn clean test
- This test uses Byteman to simulate the race condition that Worker-1 is executing JobStoreSupport.triggeredJobComplete() and Worker-2 is executing jBPM process.
2017-04-26 17:45:43,293[jBPMClusteredScheduler_Worker-1] --- Process started : piid = 1 2017-04-26 17:45:43,330 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is desired by: jBPMClusteredScheduler_Worker-1 2017-04-26 17:45:43,331 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is being obtained: jBPMClusteredScheduler_Worker-1 2017-04-26 17:45:43,332 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' given to: jBPMClusteredScheduler_Worker-1 2017-04-26 17:45:43,335 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' returned by: jBPMClusteredScheduler_Worker-1 2017-04-26 17:45:43,336 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is desired by: jBPMClusteredScheduler_Worker-1 2017-04-26 17:45:43,337 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is being obtained: jBPMClusteredScheduler_Worker-1 2017-04-26 17:45:43,340 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' given to: jBPMClusteredScheduler_Worker-1 2017-04-26 17:45:43,345 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' returned by: jBPMClusteredScheduler_Worker-1 2017-04-26 17:45:43,382 DEBUG [jBPMClusteredScheduler_Worker-1] [org.jbpm.persistence.timer.GlobalJpaTimerJobInstance] Allowed to dispose command service from global timer job instance 2017-04-26 17:45:43,383 DEBUG [jBPMClusteredScheduler_Worker-1] [org.jbpm.runtime.manager.impl.PerProcessInstanceRuntimeManager] Removing lock java.util.concurrent.locks.ReentrantLock@6acc79da[Locked by thread jBPMClusteredScheduler_Worker-1] from list as non is waiting for it by org.jbpm.runtime.manager.impl.RuntimeEngineImpl@22026f95 2017-04-26 17:45:43,383 DEBUG [jBPMClusteredScheduler_Worker-1] [org.jbpm.runtime.manager.impl.PerProcessInstanceRuntimeManager] java.util.concurrent.locks.ReentrantLock@6acc79da[Unlocked] unlocked by org.jbpm.runtime.manager.impl.RuntimeEngineImpl@22026f95 [jBPMClusteredScheduler_Worker-1] triggeredJobComplete() entry. Wait for next job 2017-04-26 17:45:54,853 DEBUG [QuartzScheduler_jBPMClusteredScheduler-tkobayas.nrt.csb1493196332781_ClusterManager] [org.quartz.impl.jdbcjobstore.JobStoreCMT] ClusterManager: Check-in complete. 2017-04-26 17:46:07,096 DEBUG [jBPMClusteredScheduler_QuartzSchedulerThread] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is desired by: jBPMClusteredScheduler_QuartzSchedulerThread 2017-04-26 17:46:07,097 DEBUG [jBPMClusteredScheduler_QuartzSchedulerThread] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is being obtained: jBPMClusteredScheduler_QuartzSchedulerThread 2017-04-26 17:46:07,098 DEBUG [jBPMClusteredScheduler_QuartzSchedulerThread] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' given to: jBPMClusteredScheduler_QuartzSchedulerThread 2017-04-26 17:46:07,108 DEBUG [jBPMClusteredScheduler_QuartzSchedulerThread] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' returned by: jBPMClusteredScheduler_QuartzSchedulerThread 2017-04-26 17:46:07,108 DEBUG [jBPMClusteredScheduler_QuartzSchedulerThread] [org.quartz.simpl.SimpleJobFactory] Producing instance of Job 'jbpm.StartProcess-com.sample.bpmn.hello-1', class=org.jbpm.process.core.timer.impl.QuartzSchedulerService$QuartzJob 2017-04-26 17:46:07,109 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.core.JobRunShell] Calling execute on job jbpm.StartProcess-com.sample.bpmn.hello-1 2017-04-26 17:46:07,118 DEBUG [jBPMClusteredScheduler_Worker-2] [org.jbpm.process.core.timer.impl.QuartzSchedulerService] Scheduling timer with name StartProcess-com.sample.bpmn.hello-1 2017-04-26 17:46:07,123 DEBUG [jBPMClusteredScheduler_Worker-2] [org.jbpm.process.core.timer.impl.GlobalTimerService] Returning timers [GlobalJpaTimerJobInstance [timerServiceId=default-per-pinstance-timerServiceId, getJobHandle()=GlobalQuartzJobHandle [jobName=StartProcess-com.sample.bpmn.hello-1, jobGroup=jbpm]]] for session 3 2017-04-26 17:46:07,132 DEBUG [jBPMClusteredScheduler_Worker-2] [org.jbpm.services.task.persistence.TaskTransactionInterceptor] Instantiating JtaTransactionManager [jBPMClusteredScheduler_Worker-2] StartProcessJob.execute() entry. Signaling... [jBPMClusteredScheduler_Worker-1] wake up! 2017-04-26 17:46:07,150 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is desired by: jBPMClusteredScheduler_Worker-1 2017-04-26 17:46:07,150 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is being obtained: jBPMClusteredScheduler_Worker-1 2017-04-26 17:46:07,151 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' given to: jBPMClusteredScheduler_Worker-1 2017-04-26 17:46:07,156 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.JobStoreCMT] Deleting job: jbpm.StartProcess-com.sample.bpmn.hello-1 2017-04-26 17:46:07,157 DEBUG [jBPMClusteredScheduler_Worker-2] [org.jbpm.runtime.manager.impl.PerProcessInstanceRuntimeManager] [MaintainMappingListener] Lock java.util.concurrent.locks.ReentrantLock@7d807043[Locked by thread jBPMClusteredScheduler_Worker-2] created and stored in list by org.jbpm.runtime.manager.impl.RuntimeEngineImpl@68205b49 2017-04-26 17:46:07,160[jBPMClusteredScheduler_Worker-2] --- Process started : piid = 2 2017-04-26 17:46:07,161 DEBUG [jBPMClusteredScheduler_Worker-1] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' returned by: jBPMClusteredScheduler_Worker-1 2017-04-26 17:46:07,180 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is desired by: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,180 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is being obtained: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,180 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' given to: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,185 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' returned by: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,185 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is desired by: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,186 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is being obtained: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,186 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' given to: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,189 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' returned by: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,189 WARN [jBPMClusteredScheduler_Worker-2] [org.jbpm.process.core.timer.impl.QuartzSchedulerService] Job has already been scheduled, most likely running in cluster: Unable to store Trigger with name: 'StartProcess-com.sample.bpmn.hello-1_trigger' and group: 'jbpm', because one already exists with this identification. 2017-04-26 17:46:07,197 DEBUG [jBPMClusteredScheduler_Worker-2] [org.jbpm.persistence.timer.GlobalJpaTimerJobInstance] Allowed to dispose command service from global timer job instance 2017-04-26 17:46:07,199 DEBUG [jBPMClusteredScheduler_Worker-2] [org.jbpm.runtime.manager.impl.PerProcessInstanceRuntimeManager] Removing lock java.util.concurrent.locks.ReentrantLock@7d807043[Locked by thread jBPMClusteredScheduler_Worker-2] from list as non is waiting for it by org.jbpm.runtime.manager.impl.RuntimeEngineImpl@68205b49 2017-04-26 17:46:07,199 DEBUG [jBPMClusteredScheduler_Worker-2] [org.jbpm.runtime.manager.impl.PerProcessInstanceRuntimeManager] java.util.concurrent.locks.ReentrantLock@7d807043[Unlocked] unlocked by org.jbpm.runtime.manager.impl.RuntimeEngineImpl@68205b49 2017-04-26 17:46:07,204 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is desired by: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,204 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' is being obtained: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,204 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' given to: jBPMClusteredScheduler_Worker-2 2017-04-26 17:46:07,209 DEBUG [jBPMClusteredScheduler_Worker-2] [org.quartz.impl.jdbcjobstore.StdRowLockSemaphore] Lock 'TRIGGER_ACCESS' returned by: jBPMClusteredScheduler_Worker-2
I think the problem is here:
In a normal case, "trigger.getNextFireTime() == null" and "stat != null" and "stat.getNextFireTime() != null" so it doesn't fall into removeTrigger().
In this problem case, the second execution by Worker-2 has updated the stat (= quartz_triggers table) so Worker-1 falls into removeTrigger() (See "Deleting job:" log). So removed the Job/Trigger.
I guess it's a problem in Quartz side which allows to trigger a Job which is still in progress. But may be a problem of usage in jBPM side?
Please note that "short interval" is achieved by Byteman in this reproducer. In usual case, it could happen even with 1 minutes timeCycle StartTimerEvent when MisfireHandler is involved. Say, MisfireHandler amends a trigger time and the trigger is fired. Then a newly created Trigger may have very close NextFireTime.
With H2, I could reproduce "Deleting job:" log but somehow StartTimerEvent keeps working.
- clones
-
RHPAM-170 Job/Trigger for repeat Timer are deleted on short interval executions with Quartz
- Closed