Uploaded image for project: 'Red Hat Fuse'
  1. Red Hat Fuse
  2. ENTESB-12206

Intermittent deadlock between FelixStartLevel thread and SpringOsgiExtender thread involving BundleWiringImpl lock

    XMLWordPrintable

Details

    • Bug
    • Resolution: Done
    • Major
    • fuse-6.3-R15-GA
    • jboss-fuse-6.3
    • Camel
    • None
    • Medium
    • % %

    Description

      We occasionally see Fuse instances hanging after a restart or installing a new feature. Looking at a thread dump for a hung instance, it looks like there is a deadlock between the FelixStartLevel thread and a SpringOsgiExtender thread. This deadlock does not occur every single time, so we are able to work around it by restarting affected containers one or more times until they start clean. Here was the deadlock found:

      Found one Java-level deadlock:
      =============================
      "sshd-SshServer[6eb0fa7f]-nio2-thread-2":
        waiting to lock monitor 0x00007fc1c80067d8 (object 0x00000006c01645f0, a org.apache.felix.framework.BundleWiringImpl),
        which is held by "SpringOsgiExtenderThread-5"
      "SpringOsgiExtenderThread-5":
        waiting to lock monitor 0x00007fc224002168 (object 0x00000006c02b1548, a org.apache.felix.framework.BundleWiringImpl),
        which is held by "FelixStartLevel"
      "FelixStartLevel":
        waiting to lock monitor 0x00007fc1c80067d8 (object 0x00000006c01645f0, a org.apache.felix.framework.BundleWiringImpl),
        which is held by "SpringOsgiExtenderThread-5"
      
      Java stack information for the threads listed above:
      ===================================================
      "sshd-SshServer[6eb0fa7f]-nio2-thread-2":
      	at org.apache.felix.framework.BundleWiringImpl.getClassLoaderInternal(BundleWiringImpl.java:680)
      	- waiting to lock <0x00000006c01645f0> (a org.apache.felix.framework.BundleWiringImpl)
      	at org.apache.felix.framework.BundleWiringImpl.getClassByDelegation(BundleWiringImpl.java:1387)
      	at org.apache.felix.framework.BundleWiringImpl.searchImports(BundleWiringImpl.java:1577)
      	at org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1507)
      	at org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:77)
      	at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1993)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      	at org.apache.karaf.jaas.modules.audit.EventAdminAuditLoginModule$EventAdminAuditor.audit(EventAdminAuditLoginModule.java:66)
      	at org.apache.karaf.jaas.modules.audit.EventAdminAuditLoginModule.audit(EventAdminAuditLoginModule.java:55)
      	at org.apache.karaf.jaas.modules.audit.AbstractAuditLoginModule.login(AbstractAuditLoginModule.java:65)
      	at org.apache.karaf.jaas.boot.ProxyLoginModule.login(ProxyLoginModule.java:83)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
      	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
      	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
      	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
      	at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
      	at org.apache.karaf.shell.ssh.KarafJaasAuthenticator.authenticate(KarafJaasAuthenticator.java:99)
      	at org.apache.sshd.server.auth.UserAuthPublicKey.doAuth(UserAuthPublicKey.java:72)
      	at org.apache.sshd.server.auth.AbstractUserAuth.auth(AbstractUserAuth.java:49)
      	at org.apache.sshd.server.session.ServerUserAuthService.process(ServerUserAuthService.java:145)
      	at org.apache.sshd.common.session.AbstractSession.doHandleMessage(AbstractSession.java:431)
      	at org.apache.sshd.common.session.AbstractSession.handleMessage(AbstractSession.java:326)
      	- locked <0x00000007a2832ec0> (a java.lang.Object)
      	at org.apache.sshd.common.session.AbstractSession.decode(AbstractSession.java:780)
      	at org.apache.sshd.common.session.AbstractSession.messageReceived(AbstractSession.java:308)
      	- locked <0x00000007a2848438> (a java.lang.Object)
      	at org.apache.sshd.common.AbstractSessionIoHandler.messageReceived(AbstractSessionIoHandler.java:54)
      	at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:184)
      	at org.apache.sshd.common.io.nio2.Nio2Session$1.onCompleted(Nio2Session.java:170)
      	at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:32)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:30)
      	at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
      	at sun.nio.ch.Invoker$2.run(Invoker.java:218)
      	at sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      "SpringOsgiExtenderThread-5":
      	at org.apache.felix.framework.BundleWiringImpl.isInUse(BundleWiringImpl.java:522)
      	- waiting to lock <0x00000006c02b1548> (a org.apache.felix.framework.BundleWiringImpl)
      	at org.apache.felix.framework.BundleWiringImpl.listResourcesInternal(BundleWiringImpl.java:806)
      	at org.apache.felix.framework.BundleWiringImpl.calculateRemotePackageResources(BundleWiringImpl.java:975)
      	at org.apache.felix.framework.BundleWiringImpl.listResourcesInternal(BundleWiringImpl.java:841)
      	at org.apache.felix.framework.BundleWiringImpl.listResources(BundleWiringImpl.java:783)
      	- locked <0x00000006c01645f0> (a org.apache.felix.framework.BundleWiringImpl)
      	at org.ops4j.pax.web.utils.ClassPathUtil.findResources(ClassPathUtil.java:208)
      	at org.ops4j.pax.web.extender.war.internal.parser.WebAppParser.tldScan(WebAppParser.java:209)
      	at org.ops4j.pax.web.extender.war.internal.parser.WebAppParser.parse(WebAppParser.java:142)
      	at org.ops4j.pax.web.extender.war.internal.WebObserver.createExtension(WebObserver.java:145)
      	at org.ops4j.pax.web.extender.war.internal.Activator.doCreateExtension(Activator.java:82)
      	at org.ops4j.pax.web.extender.war.internal.extender.AbstractExtender.createExtension(AbstractExtender.java:261)
      	at org.ops4j.pax.web.extender.war.internal.extender.AbstractExtender.modifiedBundle(AbstractExtender.java:244)
      	at org.ops4j.pax.web.extender.war.internal.extender.AbstractExtender.modifiedBundle(AbstractExtender.java:55)
      	at org.osgi.util.tracker.BundleTracker$Tracked.customizerModified(BundleTracker.java:479)
      	at org.osgi.util.tracker.BundleTracker$Tracked.customizerModified(BundleTracker.java:1)
      	at org.osgi.util.tracker.AbstractTracked.track(AbstractTracked.java:232)
      	at org.osgi.util.tracker.BundleTracker$Tracked.bundleChanged(BundleTracker.java:443)
      	at org.apache.felix.framework.util.EventDispatcher.invokeBundleListenerCallback(EventDispatcher.java:869)
      	at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:790)
      	at org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:515)
      	at org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:4429)
      	at org.apache.felix.framework.StatefulResolver.fireResolvedEvents(StatefulResolver.java:1126)
      	at org.apache.felix.framework.StatefulResolver.resolve(StatefulResolver.java:439)
      	at org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3993)
      	at org.apache.felix.framework.Felix.findBundleEntries(Felix.java:1736)
      	at org.apache.felix.framework.BundleImpl.findEntries(BundleImpl.java:327)
      	at org.apache.camel.core.osgi.OsgiPackageScanClassResolver.getImplementationsInBundle(OsgiPackageScanClassResolver.java:98)
      	at org.apache.camel.core.osgi.OsgiPackageScanClassResolver.loadImplementationsInBundle(OsgiPackageScanClassResolver.java:78)
      	at org.apache.camel.core.osgi.OsgiPackageScanClassResolver.find(OsgiPackageScanClassResolver.java:50)
      	at org.apache.camel.impl.DefaultPackageScanClassResolver.findImplementations(DefaultPackageScanClassResolver.java:178)
      	at org.apache.camel.spring.PackageScanRouteBuilderFinder.appendBuilders(PackageScanRouteBuilderFinder.java:59)
      	at org.apache.camel.spring.CamelContextFactoryBean.findRouteBuildersByPackageScan(CamelContextFactoryBean.java:230)
      	at org.apache.camel.core.xml.AbstractCamelContextFactoryBean.findRouteBuilders(AbstractCamelContextFactoryBean.java:943)
      	at org.apache.camel.core.xml.AbstractCamelContextFactoryBean.setupRoutes(AbstractCamelContextFactoryBean.java:385)
      	at org.apache.camel.spring.CamelContextFactoryBean.onApplicationEvent(CamelContextFactoryBean.java:334)
      	at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:96)
      	at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:335)
      	at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:958)
      	at org.springframework.osgi.context.support.AbstractOsgiBundleApplicationContext.finishRefresh(AbstractOsgiBundleApplicationContext.java:235)
      	at org.springframework.osgi.context.support.AbstractDelegatedExecutionApplicationContext$4.run(AbstractDelegatedExecutionApplicationContext.java:358)
      	- locked <0x00000006c24de888> (a java.lang.Object)
      	at org.springframework.osgi.util.internal.PrivilegedUtils.executeWithCustomTCCL(PrivilegedUtils.java:85)
      	at org.springframework.osgi.context.support.AbstractDelegatedExecutionApplicationContext.completeRefresh(AbstractDelegatedExecutionApplicationContext.java:320)
      	at org.springframework.osgi.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor$CompleteRefreshTask.run(DependencyWaiterApplicationContextExecutor.java:132)
      	at java.lang.Thread.run(Thread.java:748)
      "FelixStartLevel":
      	at org.apache.felix.framework.BundleWiringImpl.isInUse(BundleWiringImpl.java:522)
      	- waiting to lock <0x00000006c01645f0> (a org.apache.felix.framework.BundleWiringImpl)
      	at org.apache.felix.framework.BundleWiringImpl.listResourcesInternal(BundleWiringImpl.java:806)
      	at org.apache.felix.framework.BundleWiringImpl.calculateRemotePackageResources(BundleWiringImpl.java:975)
      	at org.apache.felix.framework.BundleWiringImpl.listResourcesInternal(BundleWiringImpl.java:841)
      	at org.apache.felix.framework.BundleWiringImpl.listResources(BundleWiringImpl.java:783)
      	- locked <0x00000006c02b1548> (a org.apache.felix.framework.BundleWiringImpl)
      	at org.ops4j.pax.web.utils.ClassPathUtil.findResources(ClassPathUtil.java:208)
      	at org.ops4j.pax.web.extender.war.internal.parser.WebAppParser.tldScan(WebAppParser.java:209)
      	at org.ops4j.pax.web.extender.war.internal.parser.WebAppParser.parse(WebAppParser.java:142)
      	at org.ops4j.pax.web.extender.war.internal.WebObserver.createExtension(WebObserver.java:145)
      	at org.ops4j.pax.web.extender.war.internal.Activator.doCreateExtension(Activator.java:82)
      	at org.ops4j.pax.web.extender.war.internal.extender.AbstractExtender.createExtension(AbstractExtender.java:261)
      	at org.ops4j.pax.web.extender.war.internal.extender.AbstractExtender.modifiedBundle(AbstractExtender.java:244)
      	at org.ops4j.pax.web.extender.war.internal.extender.AbstractExtender.modifiedBundle(AbstractExtender.java:55)
      	at org.osgi.util.tracker.BundleTracker$Tracked.customizerModified(BundleTracker.java:479)
      	at org.osgi.util.tracker.BundleTracker$Tracked.customizerModified(BundleTracker.java:1)
      	at org.osgi.util.tracker.AbstractTracked.track(AbstractTracked.java:232)
      	at org.osgi.util.tracker.BundleTracker$Tracked.bundleChanged(BundleTracker.java:443)
      	at org.apache.felix.framework.util.EventDispatcher.invokeBundleListenerCallback(EventDispatcher.java:869)
      	at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:790)
      	at org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:515)
      	at org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:4429)
      	at org.apache.felix.framework.Felix.startBundle(Felix.java:2100)
      	at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1299)
      	at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
      	at java.lang.Thread.run(Thread.java:748)
      
      Found 1 deadlock.
      

      The FelixStartLevel thread was just starting a WAR bundle, because it leads to notification to pax-web-extender-war:

       org.ops4j.pax.web.extender.war.internal.extender.AbstractExtender.createExtension(AbstractExtender.java:261)
      

      but at the same time, SpringOsgiExtenderThread-5 thread was finishing processing of some bundle (probably previously started) containing spring.xml which, upon spring context refresh, passes control to camel which scans available (visible) bundles for some resources. And this (surprisingly) leads to another resolution of the bundle which notifies pax-web-extender-war.

      It looks like looks like org.apache.camel.core.osgi.OsgiPackageScanClassResolver should check bundle state and skip unresolved bundles instead of scanning all of available (visible) bundles.

      Attachments

        Activity

          People

            ggrzybek Grzegorz Grzybek
            rhn-support-qluo Joe Luo
            Vratislav Hais Vratislav Hais (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: