Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: 3.5
Affects Version/s: 3.4
Labels:
None

Bugzilla References:
https://bugzilla.redhat.com/show_bug.cgi?id=1017376
Workaround:

Workaround Exists
Workaround Description:

It appears that setting MFC and UFC max_credits to 10M, or removing these protocols altogether, works around this issue.



I have recently observed a recurring situation in which many (or all) threads are stuck waiting for credits in the FlowControl protocol.

The credit request was not handled on the other node because it is a non-OOB message, and some messages sent before it (actually many of them, cause unknown) had been lost. The request was therefore waiting for those messages to be retransmitted.

However, these messages were not retransmitted because the retransmission request was never received: all OOB threads were stuck in the FlowControl protocol. They had handled other requests and were trying to send responses, but a response cannot be sent until FlowControl gets credits.

The probability of this situation could be lowered by tagging the credit request as OOB, so that it would be handled immediately. If the credit replenishment message were then processed in the regular OOB pool, that pool could already be depleted by many requests, but setting up the internal thread pool would solve the problem.
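
To illustrate why an OOB credit request would not get stuck behind lost messages, here is a minimal sketch of a receiver that delivers regular messages strictly in sequence order but delivers OOB messages immediately. The class OrderedReceiver and its method names are hypothetical and are not JGroups code; the model is deliberately simplified (OOB messages are treated as living outside the sequence, while real JGroups bookkeeping for OOB messages is more involved).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only: not JGroups code.
final class OrderedReceiver {
    // Regular messages that arrived ahead of a gap, keyed by sequence number.
    private final TreeMap<Long, String> pending = new TreeMap<>();
    private long nextSeqno = 1;

    /** Returns the payloads that become deliverable as a result of this arrival. */
    List<String> receive(long seqno, boolean oob, String payload) {
        List<String> delivered = new ArrayList<>();
        if (oob) {
            // OOB: delivered right away, regardless of gaps (seqno ignored in this model).
            delivered.add(payload);
            return delivered;
        }
        if (seqno < nextSeqno) {
            return delivered; // already delivered, drop the duplicate
        }
        pending.put(seqno, payload);
        // Deliver the contiguous run starting at nextSeqno; stop at the first gap.
        while (!pending.isEmpty() && pending.firstKey() == nextSeqno) {
            delivered.add(pending.pollFirstEntry().getValue());
            nextSeqno++;
        }
        return delivered;
    }

    public static void main(String[] args) {
        OrderedReceiver r = new OrderedReceiver();
        // Message 1 was lost, so a regular credit request with seqno 2 is held back.
        System.out.println(r.receive(2, false, "CREDIT_REQUEST"));
        // The same request tagged OOB is delivered immediately.
        System.out.println(r.receive(0, true, "CREDIT_REQUEST (OOB)"));
        // Only once message 1 is retransmitted does the regular request get through.
        System.out.println(r.receive(1, false, "data"));
    }
}
```

In this model the regular credit request stays buffered until the missing message is retransmitted, which is the stall described above; the OOB variant does not depend on retransmission at all.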

Another option would be to release a thread from FlowControl (let it send the message even without credits) if it waits there for too long.
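
The idea of releasing a sender after a bounded wait can be sketched as follows. This is a minimal illustration, assuming a single credit counter per destination; the class CreditGate and its methods are hypothetical names, not the JGroups flow control implementation and not the fix that was eventually applied.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only: not JGroups code.
final class CreditGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition replenished = lock.newCondition();
    private long credits;

    CreditGate(long initialCredits) {
        this.credits = initialCredits;
    }

    /**
     * Waits until 'bytes' credits are available or 'maxWaitMillis' has elapsed.
     * Returns true if credits were deducted, false if the caller is released
     * without credits (timeout or interruption) and may send anyway.
     */
    boolean acquire(long bytes, long maxWaitMillis) {
        long remainingNanos = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        lock.lock();
        try {
            while (credits < bytes) {
                if (remainingNanos <= 0) {
                    return false; // waited too long: do not block the thread forever
                }
                try {
                    remainingNanos = replenished.awaitNanos(remainingNanos);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            credits -= bytes;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Called when a credit replenishment arrives from the receiver. */
    void replenish(long bytes) {
        lock.lock();
        try {
            credits += bytes;
            replenished.signalAll();
        } finally {
            lock.unlock();
        }
    }

    long available() {
        lock.lock();
        try {
            return credits;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        CreditGate gate = new CreditGate(100);
        System.out.println(gate.acquire(60, 10)); // enough credits: deducted at once
        System.out.println(gate.acquire(60, 20)); // only 40 left: released after ~20 ms
        gate.replenish(60);
        System.out.println(gate.acquire(60, 10)); // credits are available again
    }
}
```

The trade-off is that a sender released without credits can overrun a slow receiver, so the wait limit would need to be long enough that it only triggers when replenishment is genuinely stuck.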



jgroups-udp-radim.xml (4 kB, 2013/11/05 5:09 AM)
RemoteGetStressTest.java (7 kB, 2013/10/30 10:42 AM)
UPerf2.java (31 kB, 2013/11/05 5:11 AM)

relates to: ISPN-3645 StateTransferLargeObjectTest hangs randomly (Closed)

Assignee: Bela Ban

Reporter: Radim Vansa (Inactive)

Votes: 0

Watchers: 8

Created: 2013/08/12 4:15 AM

Updated: 2021/10/24 6:14 AM

Resolved: 2013/12/10 8:49 AM
