  1. WildFly
  2. WFLY-13686

Deadlock on WildFly startup

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: 20.0.1.Final
    • Component/s: EJB
    • Labels: None

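      SingletonA.java:

      import java.util.Arrays;
      import java.util.Collections;
      import java.util.List;
      import javax.annotation.PostConstruct;
      import javax.ejb.Singleton;
      import org.slf4j.Logger;        // logging API assumed to be SLF4J
      import org.slf4j.LoggerFactory;

      @Singleton
      public class SingletonA {
          private static final Logger log = LoggerFactory.getLogger(SingletonA.class);
          private static final List<Integer> DATA = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9);

          @PostConstruct
          public void postConstruct() {
              log.info("hello from SingletonA");
          }

          // Business method invoked by both SingletonB and ServiceC.
          public List<Integer> getCachedData() {
              return Collections.unmodifiableList(DATA);
          }
      }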
      

       

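      SingletonB.java:

      import java.util.List;
      import javax.annotation.PostConstruct;
      import javax.annotation.Resource;
      import javax.ejb.Singleton;
      import javax.ejb.Startup;
      import javax.enterprise.concurrent.ManagedExecutorService;
      import javax.inject.Inject;
      import org.slf4j.Logger;        // logging API assumed to be SLF4J
      import org.slf4j.LoggerFactory;

      @Singleton
      @Startup
      public class SingletonB {
          private static final Logger log = LoggerFactory.getLogger(SingletonB.class);

          @Inject
          private SingletonA singletonA;

          @Inject
          private ServiceC serviceC;

          @Resource
          private ManagedExecutorService managedExecutorService;

          @PostConstruct
          public void postConstruct() {
              log.info("hello from SingletonB");
              // Task X from the scenario: runs on a managed thread and calls SingletonA via ServiceC.
              managedExecutorService.execute(() -> serviceC.run());
              log.info("job scheduled; continuing initialization");
              sleep(500); // some other stuff
              // B's own call to SingletonA, made while the task above may already be inside A.
              List<Integer> data = singletonA.getCachedData();
              log.info("got {} elements in cache", data.size());
          }

          private static void sleep(long ms) {
              try {
                  Thread.sleep(ms);
              } catch (InterruptedException e) {
                  throw new IllegalStateException(e);
              }
          }
      }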
      

       

       

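      ServiceC.java:

      import java.util.List;
      import java.util.Random;
      import javax.annotation.PostConstruct;
      import javax.enterprise.context.ApplicationScoped;
      import javax.inject.Inject;
      import org.slf4j.Logger;        // logging API assumed to be SLF4J
      import org.slf4j.LoggerFactory;

      @ApplicationScoped
      public class ServiceC {
          private static final Logger log = LoggerFactory.getLogger(ServiceC.class);
          private static final Random RANDOM = new Random();

          @Inject
          private SingletonA singletonA;

          @PostConstruct
          public void postConstruct() {
              log.info("hello from ServiceC");
          }

          // Called from the managed executor thread started in SingletonB.
          public void run() {
              List<Integer> data = singletonA.getCachedData();
              int index = RANDOM.nextInt(data.size());
              log.info("next number: {}", data.get(index));
          }
      }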
       

       

    • Workaround Exists

      There is no workaround for a deadlock caused by a client request that arrives too early. For a deadlock caused by jobs submitted to the managed executor service, there is one: have some other bean D start those jobs, and invoke D itself from a managed executor thread. D then waits in StartupAwaitInterceptor without blocking any singleton. Once all singletons are ready, the startup await is released and D starts the jobs (in terms of my example, D executes managedExecutorService.execute(() -> serviceC.run());).
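
The mechanism behind this workaround can be sketched in plain Java. This is only a model under stated assumptions, not WildFly code: the class name DeferredJobModel, the event strings, and the 200 ms timeout are made up for illustration, a CountDownLatch stands in for the startup await, and a ReentrantLock stands in for the lock on SingletonA. D waits for startup on an executor thread without touching A's lock, so B's call to A goes through, and the real job calls A only after startup has finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class DeferredJobModel {

    /** Runs one modeled deployment and returns the events in the order they happened. */
    static List<String> deploy() {
        CountDownLatch startup = new CountDownLatch(1);  // stand-in for the startup await
        ReentrantLock lockOnA = new ReentrantLock();     // stand-in for SingletonA's lock
        CountDownLatch dWaiting = new CountDownLatch(1);
        CountDownLatch jobDone = new CountDownLatch(1);
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            // B's initialization asks D (on an executor thread) to start the job later.
            executor.execute(() -> {
                try {
                    dWaiting.countDown();
                    startup.await();                     // D waits here, holding no lock on A
                    events.add("D released");
                    executor.execute(() -> {             // only now does the real job call A
                        lockOnA.lock();
                        try {
                            events.add("job called A");
                        } finally {
                            lockOnA.unlock();
                        }
                        jobDone.countDown();
                    });
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            dWaiting.await();
            // B continues its initialization and calls A; nothing holds A's lock.
            if (lockOnA.tryLock(200, TimeUnit.MILLISECONDS)) {
                try {
                    events.add("B called A");
                } finally {
                    lockOnA.unlock();
                }
            } else {
                events.add("B timed out");
            }
            startup.countDown();                         // all singletons are ready
            jobDone.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            startup.countDown();                         // release D even on failure
            executor.shutdown();
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println(deploy()); // prints [B called A, D released, job called A]
    }
}
```

In a real deployment D would be a container-managed bean and the executor would be the injected ManagedExecutorService; the model only shows why moving the wait into D keeps SingletonA's lock free during startup.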


    Description

      On WildFly startup, a deadlock can occur in EJB singleton access, specifically between StartupAwaitInterceptor and ContainerManagedConcurrencyInterceptor. It can happen when a client request arrives too early (during application startup) or when a request comes from a thread running in a managed executor (that is what happened to me). A thread that is blocked by StartupAwaitInterceptor still holds a lock from ContainerManagedConcurrencyInterceptor and therefore blocks other threads. This is related to the following pull request; link to the comment: https://github.com/wildfly/wildfly/pull/9009#issuecomment-656147415 .

      I guess a possible solution is to change the ordering of the interceptors. Another possibility is to add the "privileged" flag (see the pull request for an explanation) to threads from the managed thread factory, but in that case a too-early client request could still cause a deadlock.

       

      Scenario of deadlock (description copied from pull request's comment):

      • startup singleton A's initialization starts and completes successfully
      • startup singleton B is initializing and during that it starts a task X via managedThreadExecutor
      • X wants to access A and is blocked by StartupCountdown.await
      • meanwhile B continues initializing and wants to access A, but X already holds a lock on A (I can see ContainerManagedConcurrencyInterceptor.processInvocation in the thread dump); hence after 5000 ms B's initialization fails, and with it the whole deployment
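
The scenario above, and the effect of the interceptor reordering suggested earlier, can be reproduced in a small plain-Java model. This is a sketch under stated assumptions, not WildFly's implementation: the class StartupDeadlockModel, the Order enum, and the 200 ms timeout (shortened from 5000 ms) are hypothetical, a CountDownLatch stands in for StartupCountdown, and a ReentrantLock stands in for the lock that ContainerManagedConcurrencyInterceptor takes on A. B's thread is modeled as not waiting on the startup countdown, as in the scenario.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class StartupDeadlockModel {

    /** Order in which the two modeled interceptors run for task X. */
    enum Order { LOCK_THEN_AWAIT, AWAIT_THEN_LOCK }

    /** Returns true if B obtained A's lock within the timeout during startup. */
    static boolean deploy(Order order, long lockTimeoutMs) {
        CountDownLatch startup = new CountDownLatch(1); // stand-in for StartupCountdown
        ReentrantLock lockOnA = new ReentrantLock();    // stand-in for A's concurrency lock
        CountDownLatch xReady = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Task X, started by B during its initialization, calls A.
            executor.execute(() -> {
                try {
                    if (order == Order.LOCK_THEN_AWAIT) {
                        lockOnA.lock();
                        try {
                            xReady.countDown();
                            startup.await();   // blocked while holding A's lock
                        } finally {
                            lockOnA.unlock();
                        }
                    } else {
                        xReady.countDown();
                        startup.await();       // blocked, but holding nothing
                        lockOnA.lock();
                        lockOnA.unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            xReady.await(); // let X reach its blocking point first
            // B continues initializing and calls A with a lock timeout.
            boolean acquired = lockOnA.tryLock(lockTimeoutMs, TimeUnit.MILLISECONDS);
            if (acquired) {
                lockOnA.unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            startup.countDown(); // deployment finished or failed: release X
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("lock then await: " + deploy(Order.LOCK_THEN_AWAIT, 200)); // false
        System.out.println("await then lock: " + deploy(Order.AWAIT_THEN_LOCK, 200)); // true
    }
}
```

With the lock taken before the await, B times out exactly as described; with the await first, X waits without holding anything and B proceeds. The model says nothing about other consequences of reordering the real interceptors.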

            People

              tadamski@redhat.com Tomasz Adamski
              kryszard K G (Inactive)
              Votes: 0
              Watchers: 2
