-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
7.72.0.Final, 7.73.0.Final, 7.74.0.Final, 7.74.1.Final
Important
Hello,
I have identified a couple of cases where operations that touch timers, such as completing a task, can be delayed by several seconds depending on the number of timers in the database.
Although the underlying issue seems to be the synchronized code blocks within WildFly/JBoss, the practical effect is that each time WildFly/JBoss refreshes the timers from the database, every operation that needs access to timers is severely impacted. Delays range from a few seconds to more than 20 seconds, depending on the number of timers in the database and how many of them are involved in the operation. For example, with more than 200K timers in the database, completing a task that has an SLA and one associated timer can take over 10 seconds. If more timers are associated, it can take over 20 seconds under certain concurrency. The long delays coincide with the interval configured by the refresh-interval parameter (default is 5 minutes). Setting refresh-interval to 0 is not a viable solution, because this configuration runs in a cluster.
However, I am opening a case here because, at least for the complete task operation, the delay looks avoidable:
The problem is that the complete task call waits, in the same thread, for the timer operation to finish. This is not necessary: in the implementation, this step is only executed after the transaction has committed, so the caller does not need to wait for it before the complete response is returned.
In fact, it is perfectly fine to mark the timer for removal only after the rest of the complete operation has committed successfully. The problem is doing it in the same thread. Even if the timer fires in the window between completion and removal, it would simply raise an error, and I understand that this is already handled.
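To make the suggestion concrete, here is a minimal sketch of handing the post-commit timer removal to a separate thread so the caller is not blocked. It is not taken from the jBPM or WildFly code: `TimerRemover` and `DeferredTimerCleanup` are hypothetical names, and the sketch assumes that `afterCommit` would be invoked from whatever after-commit hook the engine already uses (for example a `javax.transaction.Synchronization.afterCompletion` callback). In a container, a managed executor would presumably replace the plain `ExecutorService`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

/** Hypothetical stand-in for the component that cancels a persisted timer. */
interface TimerRemover {
    void removeTimer(long timerId);
}

/** Sketch: run the (possibly slow) timer removal off the caller's thread. */
final class DeferredTimerCleanup {
    private final ExecutorService cleanupPool;
    private final TimerRemover remover;

    DeferredTimerCleanup(ExecutorService cleanupPool, TimerRemover remover) {
        this.cleanupPool = cleanupPool;
        this.remover = remover;
    }

    /**
     * Intended to be called once the transaction has committed.
     * Returns immediately; the removal itself runs on the cleanup pool.
     */
    CompletableFuture<Void> afterCommit(long timerId) {
        return CompletableFuture
                .runAsync(() -> remover.removeTimer(timerId), cleanupPool)
                .exceptionally(ex -> {
                    // A failed removal is only logged here; the assumption is that
                    // a timer firing for an already completed task is handled
                    // elsewhere, as described above.
                    System.err.println("Timer " + timerId + " removal failed: " + ex);
                    return null;
                });
    }
}
```

With this shape, the thread completing the task would return as soon as the removal is queued, and any wait on the timer service's synchronized blocks would be absorbed by the cleanup pool instead.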
Thank you for your attention. I look forward to your recommendations.
- is caused by
-
JBEAP-27772 [GSS](7.4.z) WFCORE-6963 - AbstractModelResource$DefaultResourceProvider.hasChildren inefficiency degrades with child count
- Verified
-
JBEAP-27774 [GSS](7.4.z) WFLY-19681 - DatabaseTimerPersistence$RefreshTask can delay other threads' timer additions or removals when detecting many Timer removals from the database
- Verified