Bug
Resolution: Done
Major
jBPM 4.4
DispatcherThread waits before retrying to execute a Job when there is an error. If there are multiple failures in a row, the wait period is increased in the following manner:
    int currentIdleInterval = jobExecutor.getIdleMillis(); // typically a few seconds
    while (isActive) {
      ....
      catch (Exception e)
    }
In other words, currentIdleInterval grows without bound. Eventually the int currentIdleInterval overflows to a negative value, and semaphore.wait() throws java.lang.IllegalArgumentException: timeout value is negative, resulting in a crash.
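The overflow can be reproduced in isolation. The sketch below is not the jBPM code: it assumes the interval is doubled after each failure (the growth step is elided in the snippet above), and the class name, method names, and the 5000 ms starting value are hypothetical.

```java
// Hypothetical stand-alone demo; not part of jBPM.
final class BackoffOverflowDemo {

    /** Counts how many doublings it takes for an int interval to wrap to a non-positive value. */
    public static int doublingsUntilNegative(int idleMillis) {
        int interval = idleMillis;
        int doublings = 0;
        while (interval > 0) {
            interval = interval * 2; // int multiplication wraps silently on overflow
            doublings++;
        }
        return doublings;
    }

    /** Returns true if Object.wait(timeout) rejects the timeout with IllegalArgumentException. */
    public static boolean waitRejects(long timeout) {
        Object semaphore = new Object();
        synchronized (semaphore) {
            try {
                semaphore.wait(timeout);
                return false;
            } catch (IllegalArgumentException e) {
                return true; // negative timeouts end up here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Assumed starting interval of 5000 ms.
        System.out.println("doublings until negative: " + doublingsUntilNegative(5000));
        System.out.println("wait(-1) rejected: " + waitRejects(-1L));
    }
}
```

Under the doubling assumption, the number of consecutive failures needed to reach the crash depends only on the configured idle interval.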
In my case, the error occurs because there are timers that refer to inactive executions (timers that have not fired are not deleted when a subprocess ends in jBPM 4.3). This has left me with some 340 rows in JBPM4_JOB that refer to non-existing executions.
Another aspect of this problem is that the retry wait can grow to unreasonably large values before it overflows, for example multiple days (an int number of milliseconds can reach roughly 24.8 days).
A simple fix is to introduce an upper bound, for example 256 * jobExecutor.getIdleMillis(), and never increase currentIdleInterval above it.
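A minimal sketch of such a bound, again assuming the interval is doubled after each failure. The class CappedBackoff, the method nextInterval, and the constant MAX_MULTIPLIER are hypothetical names for illustration, not jBPM API.

```java
// Hypothetical helper; not part of jBPM.
final class CappedBackoff {

    /** Upper bound as a multiple of the idle interval (the value suggested in this report). */
    public static final int MAX_MULTIPLIER = 256;

    /**
     * Returns the next retry interval: double the current one, but never more
     * than MAX_MULTIPLIER * idleMillis and never more than Integer.MAX_VALUE.
     */
    public static int nextInterval(int currentIdleInterval, int idleMillis) {
        // Compute in long so neither product can overflow.
        long cap = Math.min((long) idleMillis * MAX_MULTIPLIER, Integer.MAX_VALUE);
        long doubled = (long) currentIdleInterval * 2;
        return (int) Math.min(doubled, cap);
    }

    public static void main(String[] args) {
        int idleMillis = 5000; // assumed idle interval
        int interval = idleMillis;
        for (int failure = 1; failure <= 12; failure++) {
            interval = nextInterval(interval, idleMillis);
            System.out.println("after failure " + failure + ": " + interval + " ms");
        }
    }
}
```

In the dispatcher loop, the catch block would call something like nextInterval instead of growing the value unconditionally, and the interval would presumably be reset to jobExecutor.getIdleMillis() after a successful iteration.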