Type: Bug
Resolution: Done
Priority: Major
Fix Version: 5.0.0.FINAL
Component: None
The coordinator of a cluster (which is the first node in the cluster) can end up needlessly trying to fetch state from other nodes. For example:
1. A node starts up:
15:39:20,443 DEBUG [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) New view accepted: [michal-linhard-12702|0] [michal-linhard-12702]
2. Before the state transfer check happens, new nodes join:
15:39:20,735 DEBUG [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (pool-5-thread-29) New view accepted: [michal-linhard-12702|1] [michal-linhard-12702, michal-linhard-37465, michal-linhard-61619]
3. The coordinator then skips itself and sends a state transfer request to michal-linhard-37465:
15:39:20,902 INFO [org.infinispan.remoting.rpc.RpcManagerImpl] (MSC service thread 1-4) ISPN000074: Trying to fetch state from michal-linhard-37465
4. That's not right, because 37465 is unlikely to have anything in memory yet, and this could potentially lead to deadlocks where 37465 starts and requests state from 12702. In fact, that's exactly what happens:
15:39:22,611 INFO [org.infinispan.remoting.rpc.RpcManagerImpl] (MSC service thread 1-4) ISPN000074: Trying to fetch state from michal-linhard-12702
5. In the meantime, as expected, 37465 writes nothing:
15:39:22,710 DEBUG [org.infinispan.statetransfer.StateTransferManagerImpl] (STREAMING_STATE_TRANSFER-sender-1,default,michal-linhard-37465) Writing 0 StoredEntries to stream
...
15:39:22,806 TRACE [org.infinispan.transaction.TransactionLog] (STREAMING_STATE_TRANSFER-sender-1,default,michal-linhard-37465) Writing 0 pending prepares to the stream
In other words, under the current design the coordinator should never go around asking other nodes for state; see the sketch of the implied guard below.
Relates to: ISPN-1317 Concurrent state transfer requests can lead to premature flush wait closures (Resolved)