Infinispan / ISPN-5106

Deadlock on GlobalComponentRegistry when starting a cluster

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 7.2.0.Final
    • Component/s: Server
    • Labels: None

      Description

      We have a test that starts 4 server nodes, and sometimes the nodes fail to complete startup. This happens with the current snapshot.
      There appears to be a deadlock on the intrinsic locks of GlobalComponentRegistry: CacheTopologyControlCommand.POLICY_GET_STATUS is sent while the lock is held, but the same lock is also needed to inject dependencies when the command is processed on the remote node.
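
      The suspected pattern can be reproduced in isolation. The following is a minimal sketch, not Infinispan code: the Node class, its registryLock field, and the method names are hypothetical stand-ins, and the two "nodes" live in one JVM with an executor playing the remote thread pool. Each node holds its own monitor while waiting for the peer's reply, and the peer's handler needs the peer's monitor before it can reply. A barrier keeps the demo deterministic, and a timeout is used so the sketch ends instead of hanging.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RegistryDeadlockSketch {

    /** Hypothetical stand-in for one server node; not Infinispan code. */
    static final class Node {
        private final Object registryLock = new Object(); // plays the registry monitor
        private final ExecutorService remoteThreads = Executors.newSingleThreadExecutor();

        /** Inbound request: the handler must enter the registry monitor before replying. */
        Future<String> receiveStatusRequest() {
            Callable<String> handler = () -> {
                synchronized (registryLock) {
                    return "status";
                }
            };
            return remoteThreads.submit(handler);
        }

        /** Problematic startup: waits for the peer's reply while holding the monitor. */
        boolean startHoldingLock(Node peer, CyclicBarrier barrier, long timeoutMillis) throws Exception {
            synchronized (registryLock) {
                barrier.await(); // both nodes now hold their own monitor
                boolean replied = awaitReply(peer.receiveStatusRequest(), timeoutMillis);
                barrier.await(); // demo only: keep the monitor until both waits have ended
                return replied;
            }
        }

        /** Alternative startup: local work under the monitor, remote wait outside it. */
        boolean startAfterReleasingLock(Node peer, long timeoutMillis) throws Exception {
            synchronized (registryLock) {
                // local-only startup work would go here
            }
            return awaitReply(peer.receiveStatusRequest(), timeoutMillis);
        }

        private static boolean awaitReply(Future<String> reply, long timeoutMillis) throws Exception {
            try {
                reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // the peer's handler never got into its monitor in time
            }
        }

        void stop() {
            remoteThreads.shutdownNow();
        }
    }

    /** Starts two nodes concurrently and reports whether each one received its reply. */
    static boolean[] startBoth(boolean holdLock, long timeoutMillis) throws Exception {
        Node a = new Node();
        Node b = new Node();
        CyclicBarrier barrier = new CyclicBarrier(2);
        ExecutorService starters = Executors.newFixedThreadPool(2);
        try {
            Callable<Boolean> startA = () -> holdLock
                    ? a.startHoldingLock(b, barrier, timeoutMillis)
                    : a.startAfterReleasingLock(b, timeoutMillis);
            Callable<Boolean> startB = () -> holdLock
                    ? b.startHoldingLock(a, barrier, timeoutMillis)
                    : b.startAfterReleasingLock(a, timeoutMillis);
            Future<Boolean> resultA = starters.submit(startA);
            Future<Boolean> resultB = starters.submit(startB);
            return new boolean[] {resultA.get(), resultB.get()};
        } finally {
            starters.shutdownNow();
            a.stop();
            b.stop();
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] held = startBoth(true, 300);
        boolean[] released = startBoth(false, 300);
        System.out.println("replies while holding the monitor: " + held[0] + ", " + held[1]);
        System.out.println("replies after releasing the monitor: " + released[0] + ", " + released[1]);
    }
}
```

      In this sketch neither node gets a reply while both hold their monitors, and both get one when the remote wait happens outside the synchronized block. Whether moving the wait out of the lock is a suitable change for the real code is not something the sketch can show.
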

      Here are the relevant parts of the thread dumps. node02:

      "remote-thread--p3-t1" daemon prio=10 tid=0x00007f7a00002800 nid=0x487f waiting for monitor entry [0x00007f796bbfa000]
         java.lang.Thread.State: BLOCKED (on object monitor)
      	at org.infinispan.factories.AbstractComponentRegistry.getOrCreateComponent(AbstractComponentRegistry.java:262)
      	- waiting to lock <0x000000060365b6b8> (a org.infinispan.factories.GlobalComponentRegistry)
      	at org.infinispan.factories.AbstractComponentRegistry.invokeInjectionMethod(AbstractComponentRegistry.java:227)
      	at org.infinispan.factories.AbstractComponentRegistry.wireDependencies(AbstractComponentRegistry.java:132)
      	at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler$2.run(GlobalInboundInvocationHandler.java:156)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:744)
      
         Locked ownable synchronizers:
      	- <0x0000000615af46d0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
      
      "MSC service thread 1-16" prio=10 tid=0x00007f79ec071800 nid=0x4839 waiting on condition [0x00007f7a40239000]
         java.lang.Thread.State: TIMED_WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x0000000614d47e60> (a java.util.concurrent.FutureTask)
      	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
      	at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:422)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:199)
      	at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:432)
      	at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:385)
      	at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:368)
      	at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:359)
      	at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:281)
      	- locked <0x000000060420d4a8> (a java.lang.Object)
      	at org.infinispan.topology.ClusterTopologyManagerImpl.start(ClusterTopologyManagerImpl.java:103)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
      	at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
      	at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)
      	at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627)
      	at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)
      	- locked <0x000000060365b6b8> (a org.infinispan.factories.GlobalComponentRegistry)
      	at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:221)
      	- locked <0x000000060365b6b8> (a org.infinispan.factories.GlobalComponentRegistry)
      	at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:580)
      	at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:546)
      	at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:423)
      	at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:437)
      	at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:89)
      	at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:80)
      	at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:116)
      	at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:113)
      	at org.infinispan.security.Security.doPrivileged(Security.java:76)
      	at org.infinispan.server.infinispan.SecurityActions.doPrivileged(SecurityActions.java:60)
      	at org.infinispan.server.infinispan.SecurityActions.startCache(SecurityActions.java:121)
      	at org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:79)
      	at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1948)
      	at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1881)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:744)
      
         Locked ownable synchronizers:
      	- <0x0000000653444750> (a java.util.concurrent.ThreadPoolExecutor$Worker)
      

      and node03:

      "remote-thread--p3-t1" daemon prio=10 tid=0x00007f016c079000 nid=0x1a43 waiting for monitor entry [0x00007f0114396000]
         java.lang.Thread.State: BLOCKED (on object monitor)
      	at org.infinispan.factories.AbstractComponentRegistry.getOrCreateComponent(AbstractComponentRegistry.java:262)
      	- waiting to lock <0x0000000609c2bf50> (a org.infinispan.factories.GlobalComponentRegistry)
      	at org.infinispan.factories.AbstractComponentRegistry.invokeInjectionMethod(AbstractComponentRegistry.java:227)
      	at org.infinispan.factories.AbstractComponentRegistry.wireDependencies(AbstractComponentRegistry.java:132)
      	at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler$2.run(GlobalInboundInvocationHandler.java:156)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:744)
      
         Locked ownable synchronizers:
      	- <0x0000000615a05750> (a java.util.concurrent.ThreadPoolExecutor$Worker)
      
      "MSC service thread 1-16" prio=10 tid=0x00007f015c071800 nid=0x19ff waiting on condition [0x00007f01b0558000]
         java.lang.Thread.State: TIMED_WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x0000000615025bb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
      	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
      	at org.jgroups.util.CondVar.waitFor(CondVar.java:64)
      	at org.jgroups.blocks.Request.waitForResults(Request.java:195)
      	at org.jgroups.blocks.Request.responsesComplete(Request.java:181)
      	at org.jgroups.blocks.Request.execute(Request.java:89)
      	at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:409)
      	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:374)
      	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:188)
      	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:562)
      	at org.infinispan.topology.ClusterTopologyManagerImpl.start(ClusterTopologyManagerImpl.java:112)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
      	at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
      	at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)
      	at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627)
      	at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)
      	- locked <0x0000000609c2bf50> (a org.infinispan.factories.GlobalComponentRegistry)
      	at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:221)
      	- locked <0x0000000609c2bf50> (a org.infinispan.factories.GlobalComponentRegistry)
      	at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:580)
      	at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:546)
      	at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:423)
      	at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:437)
      	at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:89)
      	at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:80)
      	at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:116)
      	at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:113)
      	at org.infinispan.security.Security.doPrivileged(Security.java:76)
      	at org.infinispan.server.infinispan.SecurityActions.doPrivileged(SecurityActions.java:60)
      	at org.infinispan.server.infinispan.SecurityActions.startCache(SecurityActions.java:121)
      	at org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:79)
      	at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1948)
      	at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1881)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:744)
      
         Locked ownable synchronizers:
      	- <0x00000006534e9628> (a java.util.concurrent.ThreadPoolExecutor$Worker)
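
      In the dumps above, the blocked thread and the lock holder are matched by hand through the monitor address in the "waiting to lock" and "locked" lines. The same relation can be queried from inside a JVM with the standard ThreadMXBean API. The following is a minimal sketch under assumed names: the class, the demo threads "starter" and "remote-handler", and the registryLock object are all hypothetical and only imitate the shape of the dumps.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class BlockedThreadReport {

    /** Returns "blockedThread <- ownerThread" for a thread BLOCKED on a monitor, else null. */
    static String describeBlocked(Thread thread) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        ThreadInfo info = threads.getThreadInfo(thread.getId());
        if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
            return null;
        }
        return info.getThreadName() + " <- " + info.getLockOwnerName();
    }

    /** Builds one lock holder and one blocked thread, then reports who blocks whom. */
    static String demo() throws InterruptedException {
        Object registryLock = new Object(); // hypothetical stand-in for the registry monitor
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Holds the monitor while waiting on something else, like the startup thread in the dumps.
        Thread starter = new Thread(() -> {
            synchronized (registryLock) {
                lockHeld.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "starter");

        // Needs the same monitor, like the remote thread in the dumps.
        Thread handler = new Thread(() -> {
            synchronized (registryLock) {
                // nothing to do once the monitor is acquired
            }
        }, "remote-handler");

        starter.start();
        lockHeld.await();
        handler.start();
        while (handler.getState() != Thread.State.BLOCKED) {
            Thread.sleep(10); // wait until the handler is parked on the monitor
        }

        String report = describeBlocked(handler);

        release.countDown(); // let both threads finish
        starter.join();
        handler.join();
        return report;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

      Note that ThreadMXBean.findDeadlockedThreads() only looks for cycles within a single JVM, so a wait that spans several nodes, as in this report, would not be expected to show up there; comparing the dumps of all nodes is still needed.
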
      

                People

                • Assignee: Dan Berindei (dan.berindei)
                • Reporter: Jakub Markos (jmarkos)
                • Votes: 0
                • Watchers: 3
