Type: Bug
Resolution: Obsolete
Priority: Major
Fix Version/s: No Release
Affects Version/s: JBossAS-5.1.0.GA
Component/s: Clustering
Labels: None

JBC 3 differs from JBC 1.4 in how it handles suspected nodes during data gravitation. JBC 1.4 would ignore them; with JBC 3 the resulting SuspectExceptions propagate.

These can happen easily with a cluster under load and a node failing: the load balancer detects the failure before the view changes, the node that has the failing node as its backup starts gravitating, and replication of the gravitated data to the (failed) backup throws a SuspectException.

The clustering integration needs to handle this better. Right now gravitation attempts are wrapped in transactions, so the SuspectException fails the tx commit. That's pretty much non-recoverable unless we catch the commit failure and retry. One possibility is to not wrap the gravitation in a tx (not really needed except for FIELD) and use JBossCacheWrapper's get() retry logic to redo the gravitation.
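A rough sketch of the "retry the gravitation" idea, for discussion only. It does not reproduce JBossCacheWrapper's actual get() logic; `SuspectedNodeException`, `GravitationRetry`, and `getWithRetry` are hypothetical names made up for this illustration, and the real exception type and cache read call would have to be substituted.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the suspect exception thrown during replication;
// NOT the real JBoss Cache / JGroups class.
class SuspectedNodeException extends RuntimeException {
    SuspectedNodeException(String message) {
        super(message);
    }
}

// Hypothetical helper: retries a gravitating read that is not wrapped in a tx.
final class GravitationRetry {
    private GravitationRetry() {
    }

    /**
     * Runs the read up to maxAttempts times. Only SuspectedNodeException
     * triggers a retry; any other exception propagates immediately. If every
     * attempt fails, the last SuspectedNodeException is rethrown.
     */
    static <T> T getWithRetry(Supplier<T> gravitatingRead, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SuspectedNodeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return gravitatingRead.get();
            } catch (SuspectedNodeException e) {
                // The failed backup is still in the view; try again.
                last = e;
            }
        }
        throw last;
    }
}
```

A caller would pass the cache read as the supplier, e.g. `GravitationRetry.getWithRetry(() -> readSession(id), 3)`, where `readSession` is again a placeholder. The sketch retries immediately; whether a short pause between attempts is needed to let the view change arrive is an open question.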

We already catch the exception and allow the request to continue if the data was actually retrieved. Actually, that only works because we wrap with the tx: the tx makes the replication write wait for tx commit, and without that JBC wouldn't return from the gravitation read. Hmm...

The problems this causes now are: 1) gravitated data doesn't replicate to the buddy until a request changes it and causes a normal write; 2) DataGravitationCleanupCommand is not issued, so stale data is left in the cache. Some of the changes made for JBCACHE-1530 reduce the likelihood of that stale data being used; it's only used if a request fails over to the node where it's stored, leading to gravitation from the (stale) local backup tree.

Assignee: Brian Stansberry

Reporter: Brian Stansberry

Created: 2009/08/18 1:00 PM

Updated: 2011/03/22 1:43 PM

Resolved: 2011/03/22 1:43 PM
