  1. Infinispan
  2. ISPN-9817

OOM Error on ExposedByteArrayOutputStream


Details

    • Bug
    • Resolution: Obsolete
    • Critical
    • 7.2.4.Final
    • 7.2.4.Final
    • None
    • None

      We need to address two issues here.

      1. We need to control the amount of data to be replicated. In ReplicationQueueImpl.java, the flush method calls drainReplQueue(). This method drains the entire queue, which may result in creating a large byte array whose allocation fails with an OOM error.

      Even with the flush interval set as low as possible, a large volume of data can still arrive between flushes.

      Resolution - We need explicit control over how much of the queue is drained before replication begins.

      2. Async replication is becoming sync replication. This is because of the way the add method is implemented: when it finds that the maximum number of elements has been reached, it calls flush itself, which effectively converts async replication to sync.

      This also needs to be fixed.
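
      The "explicit control" asked for in point 1 could be sketched roughly as follows. This is only an illustration using the standard java.util.concurrent API, not the actual ReplicationQueueImpl code: the class name BoundedDrainSketch, the method drainInBatches and the MAX_BATCH value are all hypothetical. The idea is to drain at most a fixed number of elements per batch, so that no single batch is serialized into one huge byte array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BoundedDrainSketch {
    // Hypothetical cap on how many queued commands go into one replication batch.
    static final int MAX_BATCH = 100;

    // Drains the queue in batches of at most MAX_BATCH elements and returns the batches.
    static <T> List<List<T>> drainInBatches(BlockingQueue<T> queue) {
        List<List<T>> batches = new ArrayList<>();
        while (true) {
            List<T> batch = new ArrayList<>(MAX_BATCH);
            // drainTo(collection, maxElements) removes at most maxElements items.
            if (queue.drainTo(batch, MAX_BATCH) == 0) {
                break; // queue is empty
            }
            // A real implementation would serialize and replicate this batch here.
            batches.add(batch);
        }
        return batches;
    }

    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 250; i++) {
            queue.add(i);
        }
        List<List<Integer>> batches = drainInBatches(queue);
        int last = batches.get(batches.size() - 1).size();
        // prints "3 batches, last has 50"
        System.out.println(batches.size() + " batches, last has " + last);
    }
}
```

      A cap on element count bounds the batch only indirectly. If individual entries can be very large, a cap on accumulated serialized bytes would be the closer fit.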



    Description

      Title - OOM Error on ExposedByteArrayOutputStream

      Data -

      1. Replication Mode is Async
      2. queue-size="500"
      3. queue-flush-interval="10000"

      Details -

      1. Application threads that frequently call put on a replicated cache end up calling the flush method of ReplicationQueueImpl.java.
      2. As a result, every 500th put call makes the calling application thread wait until the queued entries have been replicated.
      3. This effectively becomes sync replication, which blocks application threads.
      4. To avoid this, we can make the queue size large enough that flush is rarely triggered from put. This appears to have no side effect on application threads, because the queue is a linked blocking queue and they are only blocked when the queue becomes full.
      5. However, this puts pressure on the async queue, which has to replicate the entire queue at once.

      replicationQueue-thread-p4-t1 tid=119 [RUNNABLE] [DAEMON] <-- OutOfMemoryError happened in this thread
      java.lang.OutOfMemoryError.<init>() OutOfMemoryError.java:48
      org.infinispan.commons.io.ExposedByteArrayOutputStream.write(byte[], int, int) ExposedByteArrayOutputStream.java:71

      6. This OutOfMemoryError happens when the JVM fails to allocate a contiguous chunk of memory for an array of 1 or 2 GB.

      Summary - If the queue size is set to a normal or low value, application threads end up calling flush, which amounts to sync replication and blocks other application threads. If I instead increase the queue size enough to avoid the sync flush, the replication queue throws an OOM error.
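
      One possible way out of this trade-off is sketched below, purely as an illustration and not as the Infinispan implementation. The add method only enqueues and, once a threshold is reached, hands the flush to a dedicated thread instead of running it on the application thread. The flush itself drains in bounded batches. AsyncFlushSketch, FLUSH_THRESHOLD, MAX_BATCH and the replicated counter (a stand-in for serializing and sending a batch) are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class AsyncFlushSketch {
    static final int FLUSH_THRESHOLD = 500; // mirrors queue-size="500" from the report
    static final int MAX_BATCH = 100;       // hypothetical per-batch cap

    final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    final ExecutorService flusher = Executors.newSingleThreadExecutor();
    final AtomicBoolean flushScheduled = new AtomicBoolean(false);
    final AtomicInteger replicated = new AtomicInteger(); // stand-in for "sent to peers"

    // Called by application threads: enqueue and return, never flush inline.
    void add(String command) {
        queue.add(command);
        // Schedule at most one pending flush at a time on the flusher thread.
        if (queue.size() >= FLUSH_THRESHOLD && flushScheduled.compareAndSet(false, true)) {
            flusher.execute(this::flush);
        }
    }

    // Runs only on the flusher thread; drains in batches of at most MAX_BATCH.
    void flush() {
        try {
            List<String> batch = new ArrayList<>(MAX_BATCH);
            while (queue.drainTo(batch, MAX_BATCH) > 0) {
                replicated.addAndGet(batch.size()); // real code would serialize and send
                batch.clear();
            }
        } finally {
            flushScheduled.set(false);
        }
    }

    // Flushes whatever is left and stops the flusher thread.
    void shutdown() {
        flusher.execute(this::flush);
        flusher.shutdown();
        try {
            flusher.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AsyncFlushSketch sketch = new AsyncFlushSketch();
        for (int i = 0; i < 1234; i++) {
            sketch.add("put-" + i);
        }
        sketch.shutdown();
        System.out.println("replicated=" + sketch.replicated.get());
    }
}
```

      In this sketch the queue is unbounded, so application threads never block. A real fix would still need a timed flush (the role of queue-flush-interval) for entries that stay below the threshold, and some form of backpressure if producers outpace replication.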

          People

            Assignee: Unassigned
            Reporter: Rakesh Vende (rvende, Inactive)