When a message is received by UNICAST3 or NAKACK2 and passed up to the application, and the application in turn sends large amounts of data down, the sending thread may block in the flow control protocol (UFC or MFC).
We removed ignore_sync_response (in https://issues.jboss.org/browse/JGRP-1655), which let sync responses pass through flow control without blocking, on the assumption that credit responses would always be received as OOB messages.
However, the issue with credits being received as part of message batches is that they're received as internal batches, not OOB batches.
This means that they are delivered sequentially (since they're from the same sender) and are thus not applied until after the blocking request returns, which never happens because the request is itself waiting for those credits: deadlock!
The reason is that credits are marked as INTERNAL (and also OOB), but the code which reads batches adds all INTERNAL|OOB messages to the internal batch, which is processed by the internal pool. Since the batch itself is not tagged as OOB, its messages do not get processed immediately.
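The ordering problem described above can be illustrated with a small stand-alone sketch. It does not use JGroups at all: the class name CreditDeadlockSketch, the method deliver() and the Semaphore standing in for the flow control credits are assumptions made for illustration only. A timeout replaces the real (unbounded) blocking so that the example terminates.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not JGroups code: a request and the credit it
// needs arrive in the same batch from the same sender.
public class CreditDeadlockSketch {

    /** Returns true if the blocked request obtained its credit within timeoutMs. */
    static boolean deliver(boolean creditOnSeparateThread, long timeoutMs) {
        Semaphore credits = new Semaphore(0);      // the sender has run out of credits
        Runnable creditMsg = credits::release;     // second message in the batch: a credit
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            if (creditOnSeparateThread)
                pool.execute(creditMsg);           // credit handled right away, like an OOB message
            // First message in the batch: the application sends data down and
            // blocks in flow control until credits arrive (bounded here by a timeout).
            boolean gotCredit = credits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
            if (!creditOnSeparateThread)
                creditMsg.run();                   // sequential delivery: credit applied too late
            return gotCredit;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("sequential delivery got credit: " + deliver(false, 200));
        System.out.println("separate-thread delivery got credit: " + deliver(true, 2000));
    }
}
```

With sequential delivery the request times out without a credit; in the real protocol there is no timeout, so the thread would block forever.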
SOLUTION: treat INTERNAL the same as OOB, i.e. check for both OOB and INTERNAL:
if(batch.mode() == INTERNAL || batch.mode() == OOB)
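As a minimal sketch of the check above in isolation: the Mode enum below is a local stand-in (assumed here to have REG, OOB, INTERNAL and MIXED values), and processImmediately() is a hypothetical helper name, not a method of the actual batch-handling code.

```java
// Hypothetical stand-alone sketch of the proposed check; not the JGroups source.
public class BatchModeCheck {

    // Local stand-in for the batch mode; the real type lives in JGroups.
    enum Mode { REG, OOB, INTERNAL, MIXED }

    /** A batch is handled without ordering constraints if it is OOB or INTERNAL. */
    static boolean processImmediately(Mode mode) {
        return mode == Mode.INTERNAL || mode == Mode.OOB;
    }

    public static void main(String[] args) {
        for (Mode m : Mode.values())
            System.out.println(m + " -> processImmediately=" + processImmediately(m));
    }
}
```

The point of the change is that an INTERNAL batch carrying credits no longer queues behind a blocked regular message from the same sender.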
An alternative would be to revisit the idea of the internal thread pool (is it really needed?) and possibly get rid of it.
Git bisect shows the faulty commit on July 16:
[linux]/home/bela/JGroups$ git bisect good
87cf70d936bb3b15860bf4cd89fe2fff49e85d1e is the first bad commit
commit 87cf70d936bb3b15860bf4cd89fe2fff49e85d1e
Author: Bela Ban <belaban@yahoo.com>
Date:   Tue Jul 16 14:51:10 2013 +0200

    Removed ignore_synchronous_thread and ignore_thread from FlowControl; not needed anymore as we cannot block anymore on credit responses (https://issues.jboss.org/browse/JGRP-1655)

:040000 040000 d0e4647ab4b89d320e45a460f0385311e2e4eba0 fefac0eb21ab3609ccf440eec9ec8f659a819f6b M src
Duplicates: JGRP-1655 ThreadLocal leaks discovered after enabling Tomcat 7 ThreadLocal detection listener (Resolved)