Uploaded image for project: 'Infinispan'
  1. Infinispan
  2. ISPN-12423

Infispan thread freezing (STUCK) - dead lock occured

This issue belongs to an archived project. You can view it, but you can't modify it. Learn more

    • Icon: Bug Bug
    • Resolution: Obsolete
    • Icon: Major Major
    • None
    • 11.0.4.Final
    • Core
    • None
    • Hide

      Just generate huge read/write load

      Show
      Just generate huge read/write load
    • Undefined

      During huge load some threads hangs on

      java.lang.Thread.State: WAITING (parking)             
       java.lang.Thread.State: WAITING (parking)                   
      at sun.misc.Unsafe.park(Native Method)                     

      • parking to wait for  <0x00000005d2e654a8> (a java.util.concurrent.locks.StampedLock)                     
        at java.util.concurrent.locks.StampedLock.acquireRead(StampedLock.java:1215)                     
        at java.util.concurrent.locks.StampedLock.readLock(StampedLock.java:428)             
        at org.infinispan.container.offheap.OffHeapConcurrentMap.peekOrGet(OffHeapConcurrentMap.java:615)                     
        at org.infinispan.container.offheap.OffHeapConcurrentMap.peek(OffHeapConcurrentMap.java:682) 

      I attached Infinispan configuration and three thread dumps:

      • off heap storage enabled (Thread dump.txt)
      • heap storage enabled (Thread dump - heap storage.txt)
      • off heap storage enabled and replicated cache mode="SYNC" (thread dump sync.txt)

      Under high load, Infinspan freezes 100% of the cases.

        1. cache-configuration-jdbc.xml
          4 kB
        2. cache-configuration-jdbc.xml
          4 kB
        3. Thread dump.txt
          2.96 MB
        4. Thread dump - heap storage.txt
          1.12 MB
        5. Thread dump sync.txt
          1.22 MB

            [ISPN-12423] Infispan thread freezing (STUCK) - dead lock occured

            Infinispan issue tracking has been migrated to GitHub issues: https://github.com/infinispan/infinispan/issues
            If you still want this issue to be worked on, create a new issue on GitHub and link this issue.

            Tristan Tarrant added a comment - Infinispan issue tracking has been migrated to GitHub issues: https://github.com/infinispan/infinispan/issues If you still want this issue to be worked on, create a new issue on GitHub and link this issue.

            c0ff75 if you are unable to reproduce the problem with trace logging enabled, or if the logs are 10s of GBs, it may be easier to find the root of the problem from a heap dump.

            We know something is starting a huge chain of CompletableFuture that's possibly leading to stack overflow (e.g. thread "non-blocking-thread--p2-t14" in the heap storage thread dump). The most interesting stack frames ones at the bottom are missing from the thread dumps, because the stack is so deep, but it should be possible to see the full stack in a heap dump.

            Dan Berindei (Inactive) added a comment - c0ff75 if you are unable to reproduce the problem with trace logging enabled, or if the logs are 10s of GBs, it may be easier to find the root of the problem from a heap dump. We know something is starting a huge chain of CompletableFuture that's possibly leading to stack overflow (e.g. thread "non-blocking-thread--p2-t14" in the heap storage thread dump). The most interesting stack frames ones at the bottom are missing from the thread dumps, because the stack is so deep, but it should be possible to see the full stack in a heap dump.

            Will Burns added a comment -

            A stack trace of it hung and the log together would show the best picture. Just make sure your log format includes the thread name

            Will Burns added a comment - A stack trace of it hung and the log together would show the best picture. Just make sure your log format includes the thread name

            Will Burns added a comment -

            Just add

            <Logger name="org.infinispan" level="TRACE"/>
            

            to your loggers.

            Will Burns added a comment - Just add <Logger name= "org.infinispan" level= "TRACE" /> to your loggers.

            Please write me how to correctly enable TRACE level for Infinispan. I use Log4J2 in my app without Sf4J

            Both caches are same, but in testing I use only first cache - GS46CACHE_GS46NEW

            Evgenii Balakhonov (Inactive) added a comment - Please write me how to correctly enable TRACE level for Infinispan. I use Log4J2 in my app without Sf4J Both caches are same, but in testing I use only first cache - GS46CACHE_GS46NEW

            Will Burns added a comment -

            Also just to confirm there are 2 caches in the xml file you posted. Which one is the one that was having the issue?

            Will Burns added a comment - Also just to confirm there are 2 caches in the xml file you posted. Which one is the one that was having the issue?

            Will Burns added a comment -

            Is it possible you could run with org.infinispan set to TRACE level? I know it will be a really big file, but maybe just keep the last bit of it around right before it started to hang?

            Will Burns added a comment - Is it possible you could run with org.infinispan set to TRACE level? I know it will be a really big file, but maybe just keep the last bit of it around right before it started to hang?

            Will Burns,

            Now I tries to switch on old binary memory storage mode:

            <memory max-size="10GB" when-full="REMOVE" storage="BINARY"/>

            And in this mode Infinispan works reliably :-\

            But if I return back storage="OFF_HEAP" then I have after 30-40 minutes high load:

            "[ACTIVE] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)'" #31 daemon prio=5 os_prio=0 tid=0x00007f5d8c9a4800 nid=0x2005 waiting on condition [0x00007f5d7e184000]

            java.lang.Thread.State: WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)

            • parking to wait for <0x00000005d72b1140> (a java.util.concurrent.locks.StampedLock)
              at java.util.concurrent.locks.StampedLock.acquireRead(StampedLock.java:1215)
              at java.util.concurrent.locks.StampedLock.readLock(StampedLock.java:428)
              at org.infinispan.container.offheap.OffHeapConcurrentMap.peekOrGet(OffHeapConcurrentMap.java:615)

            Evgenii Balakhonov (Inactive) added a comment - Will Burns, Now I tries to switch on old binary memory storage mode: <memory max-size="10GB" when-full="REMOVE" storage="BINARY"/> And in this mode Infinispan works reliably :-\ But if I return back storage="OFF_HEAP" then I have after 30-40 minutes high load: " [ACTIVE] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)'" #31 daemon prio=5 os_prio=0 tid=0x00007f5d8c9a4800 nid=0x2005 waiting on condition [0x00007f5d7e184000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) parking to wait for <0x00000005d72b1140> (a java.util.concurrent.locks.StampedLock) at java.util.concurrent.locks.StampedLock.acquireRead(StampedLock.java:1215) at java.util.concurrent.locks.StampedLock.readLock(StampedLock.java:428) at org.infinispan.container.offheap.OffHeapConcurrentMap.peekOrGet(OffHeapConcurrentMap.java:615)

            Hello, Will Burns!

            I have no any listeners in my code. I use Infinispan very simple: just put and get objects.
            My app and Infinispan are running on Oracle cloud and using Oracle JDK.
            I tried to find in the logs StackOverflowException, but it is not in them.

            Best regards,
            Eugeniy

            Evgenii Balakhonov (Inactive) added a comment - Hello, Will Burns! I have no any listeners in my code. I use Infinispan very simple: just put and get objects. My app and Infinispan are running on Oracle cloud and using Oracle JDK. I tried to find in the logs StackOverflowException, but it is not in them. Best regards, Eugeniy

            Will Burns added a comment -

            Also another question is do you use any listeners in your code? More specifically an eviction listener?

            Will Burns added a comment - Also another question is do you use any listeners in your code? More specifically an eviction listener?

              Unassigned Unassigned
              c0ff75 Evgenii Balakhonov (Inactive)
              Archiver:
              rhn-support-adongare Amol Dongare

                Created:
                Updated:
                Resolved:
                Archived: