-
Bug
-
Resolution: Done
-
Major
-
12.1.4.Final
-
None
When a cluster is shut down gracefully, every node saves the consistent hash (CH) of its caches in its persistent state. When the cluster is restarted, the coordinator uses this CH to block cache startup until all the nodes have joined.
The persisted CH is ignored once all the pre-shutdown nodes have started, and new nodes without a persisted CH can then join. Because the persisted CH serves no purpose while the cache has at least one member, a new node without one can also become coordinator.
However, the persisted CH remains on the disk of the initial nodes. If such a node is restarted, it sends the same persisted CH to the coordinator, and if the current coordinator is a new node that doesn't have a persisted CH, the restart fails:
org.infinispan.commons.CacheConfigurationException: Error starting component org.infinispan.statetransfer.StateTransferManager
    at org.infinispan.factories.impl.BasicComponentRegistryImpl.startWrapper(BasicComponentRegistryImpl.java:572)
    at org.infinispan.factories.impl.BasicComponentRegistryImpl.access$700(BasicComponentRegistryImpl.java:30)
    at org.infinispan.factories.impl.BasicComponentRegistryImpl$ComponentWrapper.running(BasicComponentRegistryImpl.java:787)
    at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:354)
    at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:250)
    at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:213)
    at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:1015)
    at org.infinispan.cache.impl.AbstractDelegatingCache.start(AbstractDelegatingCache.java:512)
    at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:698)
    at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:644)
    at org.infinispan.manager.DefaultCacheManager.internalGetCache(DefaultCacheManager.java:533)
    at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:511)
    at org.infinispan.test.MultipleCacheManagersTest.cache(MultipleCacheManagersTest.java:530)
    at org.infinispan.globalstate.ThreeNodeDistGlobalStateRestartTest.testGracefulShutdownAndRestart(ThreeNodeDistGlobalStateRestartTest.java:32)
Caused by: java.util.concurrent.CompletionException: org.infinispan.topology.CacheJoinException: ISPN000408: Node Test-NodeH with persistent state attempting to join cache testCache on cluster without state
    at org.infinispan.util.concurrent.CompletionStages.join(CompletionStages.java:81)
    at org.infinispan.statetransfer.StateTransferManagerImpl.start(StateTransferManagerImpl.java:134)
    at org.infinispan.statetransfer.CorePackageImpl$2.start(CorePackageImpl.java:104)
    at org.infinispan.statetransfer.CorePackageImpl$2.start(CorePackageImpl.java:83)
    at org.infinispan.factories.impl.BasicComponentRegistryImpl.invokeStart(BasicComponentRegistryImpl.java:604)
    at org.infinispan.factories.impl.BasicComponentRegistryImpl.doStartWrapper(BasicComponentRegistryImpl.java:595)
    at org.infinispan.factories.impl.BasicComponentRegistryImpl.startWrapper(BasicComponentRegistryImpl.java:564)
    ... 40 more
Caused by: org.infinispan.topology.CacheJoinException: ISPN000408: Node Test-NodeH with persistent state attempting to join cache testCache on cluster without state
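To make the failing condition concrete, here is a minimal, self-contained sketch of the join check as the description implies it. It is a simplified model, not Infinispan's actual code: the class `JoinCheckSketch`, the method `checkJoin`, the exception `JoinRejectedException`, and the use of `Optional<String>` to stand in for a persisted CH are all hypothetical names chosen for illustration. Only the message text is taken from the log above.

```java
import java.util.Optional;

// Simplified model of the join check described in this issue.
// All names here are hypothetical; this is NOT Infinispan's implementation.
class JoinCheckSketch {

    // Hypothetical stand-in for org.infinispan.topology.CacheJoinException.
    static class JoinRejectedException extends RuntimeException {
        JoinRejectedException(String message) {
            super(message);
        }
    }

    /**
     * Returns true if the joiner is accepted.
     *
     * @param joinerPersistedCh      CH the joiner saved at graceful shutdown, if any
     * @param coordinatorPersistedCh CH the coordinator saved at graceful shutdown, if any
     */
    static boolean checkJoin(String node, String cache,
                             Optional<String> joinerPersistedCh,
                             Optional<String> coordinatorPersistedCh) {
        // The problematic case: a restarted pre-shutdown node still sends its
        // persisted CH, but the current coordinator is a newer node without one.
        if (joinerPersistedCh.isPresent() && !coordinatorPersistedCh.isPresent()) {
            throw new JoinRejectedException(String.format(
                    "ISPN000408: Node %s with persistent state attempting to join "
                            + "cache %s on cluster without state", node, cache));
        }
        return true;
    }

    public static void main(String[] args) {
        // A new node without persisted state joins without trouble.
        System.out.println(checkJoin("Test-NodeX", "testCache",
                Optional.empty(), Optional.empty()));

        // A restarted initial node with a stale persisted CH is rejected.
        try {
            checkJoin("Test-NodeH", "testCache",
                    Optional.of("persisted-ch"), Optional.empty());
        } catch (JoinRejectedException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the rejection depends only on which nodes happen to hold a persisted CH, not on whether the cache already has running members, which is why a stale CH left on disk is enough to block the restart.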
- causes
-
JDG-4092 After a "cluster shutdown" there is no way to bring nodes simple down
- Closed
-
JDG-5029 [Operator] Data loss upon node-0 deletion after upgrade to 8.3.0
- Closed
- is duplicated by
-
ISPN-12262 After a "cluster shutdown" there is no way to bring nodes simple down
- Closed