Type: Bug
Resolution: Duplicate
Priority: Major
Fix Version/s: None
Affects Version/s: 8.2.6.Final, 9.0.0.Final
Component/s: None
Labels: None

Forum Reference:
https://developer.jboss.org/message/971397#971397

Scenario:

3 nodes in server mode with partition handling enabled.
2 nodes are killed and brought back online.
The nodes are unable to merge and the cluster remains in degraded mode.

I suspect that the FORK channel/protocol is the culprit: the heartbeat command is never handled on the joiner node, yet the coordinator receives a CacheNotFoundResponse quickly (i.e. without a timeout). The request is received and "delivered" but never reaches Infinispan.

When starting node 1 (logs from coordinator):

Received new cluster view: 5, isCoordinator = true, old status = COORDINATOR
Received new cluster view: 5, isCoordinator = true, old status = COORDINATOR
//heartbeat sent, ClusterTopologyManagerImpl.confirmMembersAvailable();
Responses: value=CacheNotFoundResponse, received=true, suspected=false
Node node01-47572 left while updating cache members
//the view is not handled

When starting node 2 (logs from coordinator):

Received new cluster view: 6, isCoordinator = true, old status = COORDINATOR
Updating cluster members for all the caches. New list is [node03-48579, node01-47572, node02-32959]
//heartbeat sent, ClusterTopologyManagerImpl.confirmMembersAvailable();
Responses: Responses{
  node01-47572: value=SuccessfulResponse{responseValue=true} , received=true, suspected=false
  node02-32959: value=CacheNotFoundResponse, received=true, suspected=false}
Node node02-32959 left while updating cache members
//the view is not handled
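
The coordinator-side behaviour visible in the two log excerpts can be modelled with a small, self-contained sketch. This is not Infinispan code: the class `HeartbeatSketch`, the enum `Reply`, and the method `confirmedMembers` are hypothetical names introduced only for illustration. It assumes that a member replying with anything other than a successful response is treated as having left, which is what the "left while updating cache members" messages suggest.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the heartbeat check; not the real ClusterTopologyManagerImpl.
public class HeartbeatSketch {

    // Reply kinds named after the values that appear in the log excerpts.
    public enum Reply { SUCCESSFUL, CACHE_NOT_FOUND }

    // Returns the members that acknowledged the heartbeat successfully.
    // Assumption: any other reply makes the coordinator treat the member as gone.
    public static List<String> confirmedMembers(Map<String, Reply> responses) {
        List<String> confirmed = new ArrayList<>();
        for (Map.Entry<String, Reply> entry : responses.entrySet()) {
            if (entry.getValue() == Reply.SUCCESSFUL) {
                confirmed.add(entry.getKey());
            }
        }
        return confirmed;
    }

    public static void main(String[] args) {
        // Responses as reported by the coordinator after view 6.
        Map<String, Reply> responses = new LinkedHashMap<>();
        responses.put("node01-47572", Reply.SUCCESSFUL);
        responses.put("node02-32959", Reply.CACHE_NOT_FOUND);

        List<String> confirmed = confirmedMembers(responses);
        for (String node : responses.keySet()) {
            if (!confirmed.contains(node)) {
                System.out.println("Node " + node + " left while updating cache members");
            }
        }
    }
}
```

Under this model, a restarted node that answers with CacheNotFoundResponse immediately, instead of timing out or handling the command, is dropped from the member list, so the view is never handled and the merge does not proceed.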

The issue is always reproducible. The configuration is:

<replicated-cache name="default" mode="SYNC" batching="true">
  <partition-handling enabled="true"/>
  <locking isolation="REPEATABLE_READ"/>
  <state-transfer enabled="false"/>
</replicated-cache>

Is related to: ISPN-5290 Better automatic merge for caches with enabled partition handling (Closed)

Assignee: Unassigned
Reporter: Pedro Ruivo
Archiver: Amol Dongare
Created: 2017/05/04 10:10 AM
Updated: 2018/04/30 8:48 AM
Resolved: 2018/04/30 8:48 AM
Archived: 2024/11/28 6:21 AM
