
Type: Bug
Resolution: Obsolete
Priority: Major
Fix Version/s: None
Affects Version/s: 12.0.0.Final
Component/s: Clustered Locks
Labels: None
Release Note Text: Undefined

When the node that owns a clustered lock leaves the cluster, ClusteredLockImpl.ClusterChangeListener is supposed to release the lock. However, if the org.infinispan.LOCKS cache is in DEGRADED mode at that point, the lock release fails and an error is logged:

22:01:29,500 ERROR (jgroups-9,Test-NodeD:[]) [CacheManagerNotifierImpl] ISPN000405: Caught exception while invoking a cache manager listener!
org.infinispan.commons.CacheListenerException: ISPN000280: Caught exception [org.infinispan.partitionhandling.AvailabilityException] while invoking method [public void org.infinispan.lock.impl.lock.ClusteredLockImpl$ClusterChangeListener.viewChange(org.infinispan.notifications.cachemanagerlistener.event.ViewChangedEvent)] on listener instance: org.infinispan.lock.impl.lock.ClusteredLockImpl$ClusterChangeListener@3c91530d
	at org.infinispan.notifications.impl.AbstractListenerImpl$ListenerInvocationImpl.lambda$invoke$1(AbstractListenerImpl.java:430)
	at org.infinispan.notifications.impl.AbstractListenerImpl$ListenerInvocationImpl.invoke(AbstractListenerImpl.java:450)
	at org.infinispan.notifications.cachemanagerlistener.CacheManagerNotifierImpl.invokeListener(CacheManagerNotifierImpl.java:157)
	at org.infinispan.notifications.cachemanagerlistener.CacheManagerNotifierImpl.invokeListeners(CacheManagerNotifierImpl.java:84)
	at org.infinispan.notifications.cachemanagerlistener.CacheManagerNotifierImpl.notifyViewChange(CacheManagerNotifierImpl.java:103)
	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.receiveClusterView(JGroupsTransport.java:737)
        ...
Caused by: org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'ClusteredLockKey{name=ConsistentReliabilitySplitBrainTest}' is not available. Not all owners are in this partition
	at org.infinispan.partitionhandling.impl.PartitionHandlingManagerImpl.doCheck(PartitionHandlingManagerImpl.java:272)
	at org.infinispan.partitionhandling.impl.PartitionHandlingManagerImpl.checkRead(PartitionHandlingManagerImpl.java:114)
	at org.infinispan.factories.InternalCacheFactory$PartitionHandlingCache.get(InternalCacheFactory.java:308)
	at org.infinispan.factories.InternalCacheFactory$PartitionHandlingCache.get(InternalCacheFactory.java:306)
	at org.infinispan.factories.InternalCacheFactory$AbstractGetAdvancedCache.containsKey(InternalCacheFactory.java:257)
	at org.infinispan.cache.impl.AbstractDelegatingCache.containsKey(AbstractDelegatingCache.java:384)
	at org.infinispan.cache.impl.EncoderCache.containsKey(EncoderCache.java:618)
	at org.infinispan.lock.impl.manager.EmbeddedClusteredLockManager.isDefined(EmbeddedClusteredLockManager.java:157)
	at org.infinispan.lock.impl.lock.ClusteredLockImpl$ClusterChangeListener.viewChange(ClusteredLockImpl.java:335)
	at jdk.internal.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at org.infinispan.notifications.impl.AbstractListenerImpl$ListenerInvocationImpl.lambda$invoke$1(AbstractListenerImpl.java:424)

When the cache goes back to AVAILABLE mode, nothing re-checks whether the lock owner has rejoined the cluster, so the lock may stay owned forever by a crashed node.

E.g. the initial cluster is ABCD, and D owns clustered lock L:

1. The cluster splits into 3 partitions: AB, C, D
2. The LOCKS cache enters DEGRADED mode
3. A and B try to unlock L, but fail
4. D crashes
5. C merges back with AB
6. The LOCKS cache becomes AVAILABLE
7. L remains owned by D
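
The sequence above can be replayed with a small stand-alone model. This is only a sketch: `LockCleanupSketch` and all of its methods are hypothetical names invented for illustration, none of them are Infinispan APIs, and the `recheckOnAvailable` switch is an assumed mitigation rather than anything the issue proposes. A failed release in DEGRADED mode is modelled as a `false` return value instead of an AvailabilityException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types or methods are Infinispan APIs.
class LockCleanupSketch {
    enum Mode { AVAILABLE, DEGRADED }

    private final boolean recheckOnAvailable; // hypothetical mitigation switch
    private final Map<String, String> lockOwners = new HashMap<>(); // lock name -> owner node
    private Set<String> members;
    private Mode mode = Mode.AVAILABLE;

    LockCleanupSketch(Set<String> members, boolean recheckOnAvailable) {
        this.members = members;
        this.recheckOnAvailable = recheckOnAvailable;
    }

    void lock(String lock, String owner) { lockOwners.put(lock, owner); }

    String ownerOf(String lock) { return lockOwners.get(lock); }

    void setMembers(Set<String> newMembers) { members = newMembers; }

    void setMode(Mode newMode) {
        mode = newMode;
        // Assumed mitigation: retry the cleanup once the cache is usable again.
        if (newMode == Mode.AVAILABLE && recheckOnAvailable) {
            releaseOrphanedLocks();
        }
    }

    // Releases locks whose owner is not in the current view.
    // Returns false in DEGRADED mode, standing in for the AvailabilityException.
    boolean releaseOrphanedLocks() {
        if (mode == Mode.DEGRADED) {
            return false;
        }
        lockOwners.values().removeIf(owner -> !members.contains(owner));
        return true;
    }

    // Replays the listed steps as seen from partition AB; returns the final owner of L.
    static String runScenario(boolean recheckOnAvailable) {
        LockCleanupSketch ab =
                new LockCleanupSketch(Set.of("A", "B", "C", "D"), recheckOnAvailable);
        ab.lock("L", "D");
        ab.setMembers(Set.of("A", "B"));      // split into AB, C, D
        ab.setMode(Mode.DEGRADED);            // LOCKS cache enters DEGRADED mode
        ab.releaseOrphanedLocks();            // A and B try to unlock L, but fail
        // D crashes here; AB cannot observe it.
        ab.setMembers(Set.of("A", "B", "C")); // C merges back with AB
        ab.releaseOrphanedLocks();            // still DEGRADED, so this fails too
        ab.setMode(Mode.AVAILABLE);           // LOCKS cache becomes AVAILABLE
        return ab.ownerOf("L");
    }

    public static void main(String[] args) {
        System.out.println("without re-check: " + runScenario(false));
        System.out.println("with re-check:    " + runScenario(true));
    }
}
```

In this model, `runScenario(false)` leaves L owned by D, matching the report, while `runScenario(true)` releases it when the mode flips back to AVAILABLE. Whether such a re-check is safe in a real cluster is a separate question; the second scenario in this issue shows why releasing locks from inside a partition can go wrong.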

Unlocking locks on cluster view changes is also problematic. The LOCKS cache enters DEGRADED mode after the cluster view change, so if the LOCKS cache is distributed, it is theoretically possible for a lock to be unlocked and for its owner to then merge back:

E.g. the initial cluster is ABCD, and D owns clustered lock L:

1. The cluster splits into 2 partitions: AB and CD
2. A and B are the 2 owners of L, and A unlocks L
3. The LOCKS cache enters DEGRADED mode
4. The partitions merge back
5. The LOCKS cache becomes AVAILABLE again
6. D thinks it still owns L, but other nodes are able to acquire it
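
The divergence in this second sequence can also be shown with a stand-alone toy model. Again this is a sketch under stated assumptions: `SplitUnlockSketch`, `tryLock`, `forceUnlock` and `believesItHolds` are hypothetical names, not Infinispan APIs, and the model tracks at most one lock per node. It keeps the cache's view of ownership separate from what each node believes locally.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these types or methods are Infinispan APIs.
class SplitUnlockSketch {
    // State stored in the (toy) LOCKS cache: lock name -> owner node.
    private final Map<String, String> cacheOwner = new HashMap<>();
    // What each node believes it holds locally: node -> lock name.
    private final Map<String, String> localBelief = new HashMap<>();

    boolean tryLock(String node, String lock) {
        if (cacheOwner.putIfAbsent(lock, node) != null) {
            return false; // already owned according to the cache
        }
        localBelief.put(node, lock);
        return true;
    }

    // Cleanup performed by a surviving node on a view change. The previous
    // owner is unreachable, so its local belief is not updated.
    void forceUnlock(String lock) {
        cacheOwner.remove(lock);
    }

    boolean believesItHolds(String node, String lock) {
        return lock.equals(localBelief.get(node));
    }

    // Replays the listed steps; returns true if both D and B end up believing they hold L.
    static boolean runScenario() {
        SplitUnlockSketch s = new SplitUnlockSketch();
        s.tryLock("D", "L");   // D owns L in the initial cluster ABCD
        // Split into AB and CD: A sees the view change before the cache is DEGRADED.
        s.forceUnlock("L");    // A unlocks L on D's behalf
        // DEGRADED, merge, AVAILABLE again: no state change in this model.
        boolean acquiredByB = s.tryLock("B", "L");
        return acquiredByB && s.believesItHolds("D", "L") && s.believesItHolds("B", "L");
    }

    public static void main(String[] args) {
        System.out.println("two nodes believe they hold L: " + runScenario());
    }
}
```

The model separates the two maps because the problem described above is exactly that they disagree after the merge: the cache says L is free (or owned by someone else), while D's local state was never told about the unlock.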

Is related to: ISPN-13352 locks are not cleanedup after node leaves (Closed)

Assignee: Unassigned
Reporter: Dan Berindei (Inactive)
Archiver: Amol Dongare
Created: 2021/02/08 3:45 PM
Updated: 2024/11/27 2:46 PM
Resolved: 2024/11/27 12:39 PM
Archived: 2024/11/28 6:21 AM
