Details
Bug
Resolution: Done
Major
9.1.4.Final
None
Sprint 9.3.0.Final
Description
Initially I had a main cluster consisting of 8 nodes.
Then I disconnected the main switch to which these nodes were connected.
This split the main cluster into 2 subclusters: the first with 2 nodes and the second with 6 nodes. This was expected.
After that I rebooted the nodes. After the reboot, the nodes again correctly formed 2 subclusters with 2 and 6 members.
After a long time, when all nodes were stable with low CPU load, I reconnected the main switch, which should have led to the main cluster re-forming with 8 controllers.
However, the main cluster did not recover:
subcluster2 did not change: it still had 6 nodes connected and no new members
subcluster1: its nodes did not connect to subcluster2, and after about 30 minutes they left the cluster.
When I checked the Infinispan logs of node1 from the 1st subcluster, I found an IllegalLifecycleStateException for every created cache (see the attached logs.zip):
[transport-thread-744a974a-2811-4f79-ac63-f32daf005d7f-p4-t6] (ClusterCacheStatus.java:599) - ISPN000228: Failed to recover cache XXX state after the current node became the coordinator
org.infinispan.IllegalLifecycleStateException: Cache container has been stopped and cannot be reused. Recreate the cache container.