- Bug
- Resolution: Done
- Major
Split may cause a deadlock when many jobs that contain split elements are waiting to run. If these jobs reach split execution at the same time, they can use up all the threads available to the batch runtime. Each parent thread then blocks at latch.await() in SplitExecutionRunner#run(), and the child threads submitted in CompositeExecutionRunner#runFlow() never start because the thread pool has no free thread.
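The starvation described above can be reproduced outside JBeret with a plain fixed-size thread pool. The following is a minimal sketch, not JBeret code: the class `SplitStarvationDemo` and the method `allParentsFinish` are hypothetical names, and the one-second wait exists only so that the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it only mimics the parent/child pattern of a split.
final class SplitStarvationDemo {

    /**
     * Runs `parents` split-like tasks on a fixed pool of `poolSize` threads.
     * Returns true if every parent finished within one second.
     */
    static boolean allParentsFinish(int poolSize, int parents) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch parentsDone = new CountDownLatch(parents);
        try {
            for (int i = 0; i < parents; i++) {
                pool.execute(() -> {
                    CountDownLatch latch = new CountDownLatch(1);
                    // Child flow: it needs a free pool thread before it can run.
                    pool.execute(latch::countDown);
                    try {
                        // The parent keeps its own pool thread while waiting.
                        latch.await();
                        parentsDone.countDown();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Demo-only timeout so the caller is not blocked forever.
            return parentsDone.await(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Interrupts any parents that are still stuck on their latch.
            pool.shutdownNow();
        }
    }
}
```

With as many parents as pool threads, for example `allParentsFinish(2, 2)`, every thread is held by a waiting parent and no child can start, so the call returns false. With spare threads, for example `allParentsFinish(4, 2)`, the children run and the parents should finish.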
Another approach that avoids CountDownLatch seems to be needed, because the latch consumes a thread while it waits for the child flows to complete. One possibility is to leave the child threads responsible for checking whether all child steps have finished.
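One way to read this proposal is that each child flow decrements a shared counter when it ends, and the child that brings the counter to zero signals completion, so no thread sits in a wait. The sketch below illustrates that idea only. It is not the change made in JBeret, and `NonBlockingSplit`, `runSplit`, and `runOnSingleThread` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the last child flow to finish completes the split.
final class NonBlockingSplit {

    /** Submits all flows and returns at once; the future completes after the last flow. */
    static CompletableFuture<Void> runSplit(Executor pool, List<Runnable> flows) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        if (flows.isEmpty()) {
            done.complete(null);
            return done;
        }
        AtomicInteger remaining = new AtomicInteger(flows.size());
        for (Runnable flow : flows) {
            pool.execute(() -> {
                try {
                    flow.run();
                } finally {
                    // Each child checks whether it was the last one to finish.
                    if (remaining.decrementAndGet() == 0) {
                        done.complete(null);
                    }
                }
            });
        }
        return done;
    }

    /**
     * Starts a split from inside a one-thread pool, the case that starves a
     * latch-based parent. Returns the number of flows that ran, or -1 on failure.
     */
    static int runOnSingleThread(int flowCount) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        AtomicInteger ran = new AtomicInteger();
        CompletableFuture<Void> splitDone = new CompletableFuture<>();
        List<Runnable> flows = new ArrayList<>();
        for (int i = 0; i < flowCount; i++) {
            flows.add(ran::incrementAndGet);
        }
        try {
            // The parent task only schedules the flows and then frees its thread.
            pool.execute(() -> runSplit(pool, flows).thenRun(() -> splitDone.complete(null)));
            splitDone.get(2, TimeUnit.SECONDS);
            return ran.get();
        } catch (Exception e) {
            return -1;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Because the parent task returns right after scheduling, even a pool with a single thread can run all child flows. The trade-off is that whatever follows the split has to be triggered from the completion callback instead of from the parent's own call stack.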
Another possible option is to restore the timeout on latch.await(). It should be enabled through an extra property, similar to jberet.local-tx, because the split timeout was JBeret-specific.
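A property-controlled timeout could look roughly like the sketch below. The property name `example.split.timeout.seconds` is made up for illustration and is not an actual JBeret property. The class `SplitTimeout` and its methods are likewise hypothetical. When the property is absent, the wait is unbounded, which matches the behavior described in this issue.

```java
import java.util.Properties;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of an opt-in timeout for the split latch.
final class SplitTimeout {

    // Assumed name for illustration only; not a real JBeret property.
    static final String TIMEOUT_PROPERTY = "example.split.timeout.seconds";

    /** Returns true if all child flows finished, false if the configured timeout elapsed. */
    static boolean awaitSplit(CountDownLatch latch, Properties props) throws InterruptedException {
        String value = props.getProperty(TIMEOUT_PROPERTY);
        if (value == null) {
            // No property set: wait without a time limit.
            latch.await();
            return true;
        }
        return latch.await(Long.parseLong(value.trim()), TimeUnit.SECONDS);
    }

    /** Small driver: waits on `pendingFlows` unfinished flows with an optional timeout. */
    static boolean demo(String timeoutSeconds, int pendingFlows) {
        Properties props = new Properties();
        if (timeoutSeconds != null) {
            props.setProperty(TIMEOUT_PROPERTY, timeoutSeconds);
        }
        try {
            return awaitSplit(new CountDownLatch(pendingFlows), props);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A timeout lets a stuck parent give up and release its thread, but it does not remove the starvation itself: the parent still occupies a thread until the timeout expires, and the split is then reported as unfinished.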
Original discussion in related issue: https://issues.jboss.org/browse/JBERET-54
- is related to
  - WFLY-13357 (Regression) Execution of concurrent batch jobs containg partitioned steps causes deadlock (Closed)
- relates to
  - JBERET-180 Execution of concurrent batch jobs containg partitioned steps causes deadlock (Resolved)
  - JBERET-54 Split doesn't wait for child steps done over 300 seconds (Resolved)