-
Sub-task
-
Resolution: Done
-
Major
The startService() implementation that HASingletonSupport inherits from HAServiceMBeanSupport has a slight potential for deadlock if a cluster topology change occurs while the singleton service itself is being deployed. The only known use case where this would occur is with the HASingletonDeployer service.
Details:
In Thread A
1) The HASingletonDeployer service is being deployed, so the thread has synchronized on org.jboss.system.ServiceController.
2) Calls DRM.registerListener()
3) Calls DRM.add() (this is the next line of code)
4) As part of add processing, DRM calls back to the HASingleton.
5) Inside a synchronized block in the callback method, the singleton determines whether it is the master node and, if so, goes on to do its work.
The problem arises if a cluster topology change occurs between steps 2 and 3. In that case, the following would happen in another thread, Thread B.
1) Topology changes, so DRM notifies listeners.
2) Our HASingleton is registered as a listener, so step 5 above occurs.
3) Since it's the master, it tries to deploy the contents of deploy-hasingleton.
4) Deployment can't proceed because Thread A has synchronized on org.jboss.system.ServiceController.
5) Thread A can't proceed because Thread B is stuck inside the synchronized block in the callback method. Deadlock.
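The lock-order inversion described in the two step lists can be illustrated with a small standalone sketch. This is not JBoss code: the two ReentrantLock fields are hypothetical stand-ins for the monitor on org.jboss.system.ServiceController and for the synchronized block in the callback method, and tryLock() with a timeout is used only so the demo reports the stall instead of hanging the way the real synchronized blocks would.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderInversionDemo {
    // Hypothetical stand-ins for the two monitors named in the report.
    static final ReentrantLock serviceControllerLock = new ReentrantLock();
    static final ReentrantLock callbackLock = new ReentrantLock();

    /** Returns {threadAGotSecondLock, threadBGotSecondLock}. */
    static boolean[] run() {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] acquired = new boolean[2];
        // Thread A: holds the ServiceController lock, then needs the callback lock.
        Thread a = new Thread(() -> acquired[0] =
                attempt(serviceControllerLock, callbackLock, bothHoldFirst, bothTried));
        // Thread B: holds the callback lock, then needs the ServiceController lock.
        Thread b = new Thread(() -> acquired[1] =
                attempt(callbackLock, serviceControllerLock, bothHoldFirst, bothTried));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired;
    }

    static boolean attempt(ReentrantLock first, ReentrantLock second,
                           CountDownLatch held, CountDownLatch tried) {
        first.lock();
        try {
            held.countDown();
            held.await();            // wait until both threads hold their first lock
            boolean got = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            tried.countDown();
            tried.await();           // keep the first lock until both attempts are over
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("A acquired callback lock: " + r[0]);
        System.out.println("B acquired ServiceController lock: " + r[1]);
    }
}
```

Neither thread obtains its second lock while the other still holds it. With plain synchronized blocks there is no timeout, so both threads would wait forever.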
This is an unlikely scenario, but I'm marking this issue as major since if it does occur it deadlocks the node.
A likely fix will involve overriding the startService() implementation so it doesn't rely on the callback to determine whether or not it is the master node. Instead, it would directly do what the callback code does and then register as a listener. Care is needed not to drop any topology changes that occur in between.
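A minimal sketch of that ordering, under stated assumptions, might look like the following. FakeRegistry and SingletonSketch are hypothetical classes invented for illustration, not the DRM or HASingletonSupport APIs, and the election rule (the first member of the view is the master) is assumed. The sketch only shows the "elect directly, then register, then re-read the view" sequence; it does not model the ServiceController lock, and a real fix would still have to make sure the master's work is not started while holding a lock that the deploying thread needs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical in-memory stand-in for the DRM; not the JBoss API. */
class FakeRegistry {
    interface Listener {
        void viewChanged(List<String> view);
    }

    private final List<Listener> listeners = new CopyOnWriteArrayList<>();
    private volatile List<String> view = Collections.emptyList();

    List<String> currentView() {
        return view;
    }

    void addListener(Listener l) {
        listeners.add(l);
    }

    void setView(List<String> newView) {
        view = Collections.unmodifiableList(new ArrayList<>(newView));
        for (Listener l : listeners) {
            l.viewChanged(view);
        }
    }
}

/** Hypothetical singleton showing the start-up ordering only. */
class SingletonSketch implements FakeRegistry.Listener {
    private final String self;
    private final FakeRegistry registry;
    private boolean master;
    final List<String> events = new ArrayList<>();

    SingletonSketch(String self, FakeRegistry registry) {
        this.self = self;
        this.registry = registry;
    }

    void start() {
        // 1) Decide mastership directly instead of waiting for a callback.
        applyView(registry.currentView());
        // 2) Only now register for later topology changes.
        registry.addListener(this);
        // 3) Re-read the view to catch a change that slipped in between 1 and 2.
        applyView(registry.currentView());
    }

    @Override
    public void viewChanged(List<String> view) {
        applyView(view);
    }

    // Assumed election rule: the first member of the view is the master.
    private synchronized void applyView(List<String> view) {
        boolean shouldBeMaster = !view.isEmpty() && view.get(0).equals(self);
        if (shouldBeMaster && !master) {
            events.add("becomeMaster");
        } else if (!shouldBeMaster && master) {
            events.add("stopMaster");
        }
        master = shouldBeMaster;
    }

    synchronized boolean isMaster() {
        return master;
    }
}
```

Because applyView() is idempotent, the second read in start() is harmless when nothing changed, and it corrects the state when a view change arrived before the listener was registered.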