There exists a race condition between concurrent elections triggered by different nodes. In general, only one node actually runs the election for a given set of singleton candidates. During a deployment replace, there is a rapid series of changes to the candidates as the deployment is stopped and restarted. While each node processes these changes one at a time, that processing isn't synchronized across members. This is the root of the problem: a new election can be triggered on one node while another node is still completing its own election. Here's the scenario in which we observed the race condition:
Before the deployment replace, server-one is the primary provider of the singleton service.
Each node undeploys its application and restarts. As each node redeploys and the singleton service is reinstalled, each node registers itself as a provider of the singleton service. The redeploys happen concurrently, but the registration order appears the same on all nodes.
In this case, the registration order was server-three, server-two, server-one.
- server-three registers first; it elects itself and starts its service
- server-two registers next
  - server-three defers election to server-two
  - server-two runs the election:
    - Elects itself
    - Sends a synchronous service-stop message to server-three
    - Starts its service
- server-one registers next, while server-two is still in the process of stopping the service on server-three
  - server-three defers election to server-one
  - server-two is still mid-election, but will defer election to server-one once it completes
  - server-one runs the election:
    - Elects itself
    - Sends service-stop messages to server-two and server-three
      - server-three is no longer running its service
      - server-two hasn't yet started its service, but it will soon (this is the problem)
    - server-one starts its service
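The interleaving above can be replayed deterministically with a small sketch. All names and functions here (`Node`, `run_election`) are hypothetical stand-ins, not the actual clustering code:

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.running = False

def run_election(winner, others):
    """The elected node synchronously stops the others, then starts itself."""
    for node in others:
        node.running = False          # synchronous service-stop message
    winner.running = True             # start own service

s1, s2, s3 = Node("server-one"), Node("server-two"), Node("server-three")

# server-three registers first and elects itself
run_election(s3, [])

# server-two registers and its election begins: it elects itself and
# sends the synchronous stop to server-three...
s3.running = False                    # stop message delivered to server-three

# ...but before server-two starts its service, server-one registers and
# runs a *new* election. server-two isn't running yet, so server-one's
# stop messages are no-ops for it.
run_election(s1, [s2, s3])

# server-two's in-flight election now completes: the stop response has
# arrived, so it commences its own (now stale) service start.
s2.running = True

running = [n.name for n in (s1, s2, s3) if n.running]
print(running)  # -> ['server-one', 'server-two']
```

The final print shows two nodes running the singleton simultaneously, matching the observed outcome.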
- Meanwhile, server-two receives the response confirming that the service stop on server-three completed, and commences its own service start
Now server-one and server-two are both running the service.
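Viewed narrowly, the last step is a check-then-act problem: server-two's decision to start was made against a candidate view that went stale while it blocked on the synchronous stop. A minimal sketch of that window, assuming a last-registrant-wins rule inferred from the deferral order above (the names and the re-check are purely illustrative, not the project's actual code or fix):

```python
# View of the candidate list at the moment server-two won its election.
candidates = ["server-three", "server-two"]

def elected_primary():
    return candidates[-1]             # assumed rule: last registrant wins

def complete_election(me):
    # The synchronous stop of the previous primary happens here; while
    # this node blocks on the response, the candidate list can change.
    candidates.append("server-one")   # server-one registers mid-election
    # Stale behavior: start unconditionally (what the scenario shows).
    # Re-checking the winner after the blocking call would instead defer:
    return "started" if elected_primary() == me else "deferred"

result = complete_election("server-two")
print(result)  # -> deferred
```

With an unconditional start, `complete_election` would return "started" here even though server-one has already won, which is exactly the double-primary outcome above.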