-
Enhancement
-
Resolution: Done
-
Major
-
None
-
None
When sub-clusters merge, the current DefaultMembershipPolicy collects the coordinators of all sub-clusters, sorts them, and chooses the first one as the next coordinator (JGRP-1002). The sort is by UUID, which is random, so the choice of coordinator is effectively arbitrary. Please consider changing the algorithm slightly so that it chooses the coordinator of the largest sub-cluster (in terms of the number of members), and making that the default.
Choosing the largest sub-cluster is reasonable from many points of view. For example, a WildFly cluster hosting a singleton service becomes more robust. Another scenario involves a large Infinispan cluster. When you add an extra node that resides in another network segment, the new node tends to become an isolated coordinator initially (because a switch needs some time to set up a new multicast route between the segments) and then merges later. With the current implementation, this new node frequently becomes the coordinator and hurts the stability of the cluster by triggering a cluster-wide rebalance. This is bad because the intention of adding a node is to relieve the cluster, not to destabilize it.
The implementation can be customized through the membership_change_policy property of pbcast.GMS. I have attached my implementation, LargestWinningPolicy.java, as a reference.
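
To illustrate the idea, here is a minimal sketch of the selection rule in plain Java. It is not the attached LargestWinningPolicy.java and does not use any JGroups types: the class name LargestSubclusterFirst, the method mergeMembership, and the use of plain lists for sub-cluster views are all assumptions made for illustration. It also assumes each sub-cluster view lists its own coordinator first, and that ties are broken by the order in which the views are passed in; a real policy would need a deterministic tie-break and would have to implement whatever interface pbcast.GMS expects for membership_change_policy.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LargestSubclusterFirst {

    /**
     * Merges sub-cluster views into one membership list whose first element
     * is the new coordinator. Assumes each view is ordered with its own
     * coordinator first (an assumption of this sketch).
     */
    public static <A> List<A> mergeMembership(Collection<? extends List<A>> subviews) {
        // Find the sub-cluster with the most members.
        List<A> largest = null;
        for (List<A> view : subviews) {
            // Strictly greater: on a tie, the view seen first wins.
            if (largest == null || view.size() > largest.size()) {
                largest = view;
            }
        }

        // The largest view goes first, so its coordinator stays coordinator.
        // LinkedHashSet keeps insertion order and drops duplicate members.
        Set<A> merged = new LinkedHashSet<>();
        if (largest != null) {
            merged.addAll(largest);
        }
        for (List<A> view : subviews) {
            merged.addAll(view);
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // A freshly added node "N" that formed its own one-member cluster,
        // and an established three-member cluster coordinated by "A".
        List<List<String>> subviews = List.of(
                List.of("N"),
                List.of("A", "B", "C"));
        System.out.println(mergeMembership(subviews)); // prints [A, B, C, N]
    }
}
```

With this rule the isolated new node joins at the end of the membership list, and the established coordinator keeps its role after the merge.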