Loading...

XML

Word

Printable

Type: Enhancement
Resolution: Won't Do
Priority: Major
Fix Version/s: None
Affects Version/s: RHDG 8.2.1 GA
Component/s: API and Configuration
Labels:
None

Blocked:
False
Ready:
False
CDW devel_ack:
CDW docs_ack:
CDW pm_ack:
CDW qa_ack:
CDW release:
Release Note Text:
Undefined
Target Release:

RHDG BACKLOG GA
Git Pull Request:
https://github.com/infinispan/infinispan/pull/10167
Market:

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

After a graceful restart, the persisted UUIDs are used to re-create the consistent hash of the cache before shutdown. This initial CH will not be rebalanced, so there is no state transfer immediately after cluster restart.

However, if something then triggers a rebalance (e.g. a node join/leave), the persistent UUIDs are ignored, and SyncConsistentHashFactory allocates segments based on the new JGroups addresses instead of the persistent UUIDs.

I modified ThreeNodeDistGlobalStateRestartTest to force a rebalance after restart, and I got

11:24:07,424 TRACE (jgroups-7,Test-NodeD:[]) [ClusterCacheStatus] Cache testCache topology updated: CacheTopology{id=1, phase=NO_REBALANCE, rebalanceId=1, currentCH=DefaultConsistentHash{ns=256, owners = (3)[Test-NodeD: 83+0, Test-NodeE: 87+0, Test-NodeF: 86+0]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeD, Test-NodeE, Test-NodeF], persistentUUIDs=[1ba71c04-a6b9-4a5c-9f51-e5e358081dc6, 6d3ff549-aafa-4d8a-8617-84ac6f119549, f37f6a8c-32a4-4dda-b1b0-876c24f42c6a]}, members = [Test-NodeD, Test-NodeE, Test-NodeF], joiners = []
11:24:07,889 TRACE (testng-Test:[]) [ClusterCacheStatus] Rebalancing consistent hash for cache testCache, members are [Test-NodeD, Test-NodeE, Test-NodeF]
11:24:07,909 TRACE (testng-Test:[]) [ClusterCacheStatus] Updating cache testCache topology for rebalance: CacheTopology{id=2, phase=READ_OLD_WRITE_ALL, rebalanceId=2, currentCH=DefaultConsistentHash{ns=256, owners = (3)[Test-NodeD: 83+0, Test-NodeE: 87+0, Test-NodeF: 86+0]}, pendingCH=DefaultConsistentHash{ns=256, owners = (3)[Test-NodeD: 87+0, Test-NodeE: 83+0, Test-NodeF: 86+0]}, unionCH=null, actualMembers=[Test-NodeD, Test-NodeE, Test-NodeF], persistentUUIDs=[1ba71c04-a6b9-4a5c-9f51-e5e358081dc6, 6d3ff549-aafa-4d8a-8617-84ac6f119549, f37f6a8c-32a4-4dda-b1b0-876c24f42c6a]}
11:24:07,910 TRACE (testng-Test:[]) [ClusterCacheStatus] Moved segments: [Test-NodeD added 72 removed 68, Test-NodeE added 49 removed 53, Test-NodeF added 59 removed 59]

This issue does not affect caches using DefaultConsistentHashFactory, because it doesn't care about member UUIDs. Since there is no SyncScatteredConsistentHashFactory, scattered cache are not affected at all. Replicated caches with the default SyncReplicateedConsistentHashFactory will change primary owners, but they won't need any state transfer.

TestingUtil.waitForNoRebalance() works around the issue by not checking whether the initial consistent hash (with topologyId==1) is balanced.

As well a node which is restarted should be remembered as the original node to let the cluster go back to available if exactly this node is missing

clones

ISPN-12350 Persistent UUIDs are only used for initial consistent hash

Closed

Assignee:: Jose Bolina

Reporter:: Wolf Fink

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Created:: 2021/11/22 5:34 PM

Updated:: 2023/06/29 7:59 PM

Resolved:: 2023/06/29 1:35 PM

Details

Description

Attachments

Issue Links

Easy Agile Planning Poker

Activity

People

Dates