Infinispan / ISPN-4964

Map/Reduce intermittently hangs when the cache is updated during execution



    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 8.2.0.Final
    • Affects Version/s: 6.0.1.Final, 6.0.2.Final, 7.0.0.Final
    • Component/s: Core
    • Labels: None

      Map/Reduce tasks on a cache with intensive write operations intermittently hang, resulting in high CPU and heap memory usage. If left running in that state, the VM eventually crashes after multiple out-of-memory errors.

      Here is a stack dump of a hanging thread:
      Name: transport-thread--p3-t17
      State: RUNNABLE
      Total blocked: 709 Total waited: 15 204

      Stack trace:
      java.util.HashMap.hash(HashMap.java:366)
      java.util.HashMap.put(HashMap.java:496)
      java.util.HashSet.add(HashSet.java:217)
      org.infinispan.persistence.async.AdvancedAsyncCacheLoader.process(AdvancedAsyncCacheLoader.java:81)
      org.infinispan.persistence.manager.PersistenceManagerImpl.processOnAllStores(PersistenceManagerImpl.java:418)
      org.infinispan.persistence.manager.PersistenceManagerImpl.processOnAllStores(PersistenceManagerImpl.java:403)
      org.infinispan.persistence.manager.PersistenceManagerImpl.processOnAllStores(PersistenceManagerImpl.java:398)
      org.infinispan.distexec.mapreduce.MapReduceManagerImpl.map(MapReduceManagerImpl.java:213)
      org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapAndCombineForLocalReduction(MapReduceManagerImpl.java:94)
      org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart.invokeMapCombineLocallyForLocalReduction(MapReduceTask.java:1162)
      org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart.access$400(MapReduceTask.java:1101)
      org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart$2.call(MapReduceTask.java:1133)
      org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart$2.call(MapReduceTask.java:1129)
      java.util.concurrent.FutureTask.run(FutureTask.java:262)
      java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      java.util.concurrent.FutureTask.run(FutureTask.java:262)
      java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      java.lang.Thread.run(Thread.java:744)

      The hanging thread keeps cycling between these two stack frames:
      java.util.HashSet.add(HashSet.java:217)
      org.infinispan.persistence.async.AdvancedAsyncCacheLoader.process(AdvancedAsyncCacheLoader.java:81)
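
A minimal sketch of how the observation above could be checked without taking manual thread dumps: it samples matching threads through the standard `ThreadMXBean` API and counts, per thread, the innermost frame that belongs to a given class-name prefix. The class `StackSampler`, its method `sample`, and all parameter names are hypothetical and not part of Infinispan; the thread-name and frame prefixes are assumptions taken from the stack dump in this report.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

public class StackSampler {

    /**
     * Repeatedly samples all live threads whose name starts with namePrefix.
     * For every RUNNABLE sample, finds the innermost stack frame whose class
     * name starts with framePrefix and counts it under the key
     * "threadName @ class.method". Samples without a matching frame are skipped.
     */
    public static Map<String, Integer> sample(String namePrefix, String framePrefix,
                                              int samples, long intervalMillis)
            throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Integer> histogram = new LinkedHashMap<>();
        for (int i = 0; i < samples; i++) {
            // Integer.MAX_VALUE requests the full stack trace of each thread.
            ThreadInfo[] infos =
                    threads.getThreadInfo(threads.getAllThreadIds(), Integer.MAX_VALUE);
            for (ThreadInfo info : infos) {
                // Entries are null for threads that died since getAllThreadIds().
                if (info == null || !info.getThreadName().startsWith(namePrefix)) {
                    continue;
                }
                if (info.getThreadState() != Thread.State.RUNNABLE) {
                    continue;
                }
                // Index 0 is the top of the stack, so the first match is the innermost one.
                for (StackTraceElement frame : info.getStackTrace()) {
                    if (frame.getClassName().startsWith(framePrefix)) {
                        String key = info.getThreadName() + " @ "
                                + frame.getClassName() + "." + frame.getMethodName();
                        histogram.merge(key, 1, Integer::sum);
                        break;
                    }
                }
            }
            Thread.sleep(intervalMillis);
        }
        return histogram;
    }
}
```

Called from inside the affected VM as, for example, `StackSampler.sample("transport-thread", "org.infinispan", 20, 500L)`, a histogram in which nearly every sample of one thread lands on the same frame (such as `AdvancedAsyncCacheLoader.process`) would suggest that the thread is looping there rather than making progress. It does not by itself explain why the loop fails to terminate.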

      Here is the configuration for 7.x:
      <?xml version="1.0" encoding="UTF-8"?>
      <infinispan
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="urn:infinispan:config:7.0 http://www.infinispan.org/schemas/infinispan-config-7.0.xsd"
          xmlns="urn:infinispan:config:7.0">
        <cache-container default-cache="default">
          <jmx duplicate-domains="true"/>
          <local-cache name="default">
            <eviction max-entries="5000" strategy="LIRS"/>
            <persistence passivation="true">
              <!--soft-index-file-store xmlns="urn:infinispan:config:soft-index:7.0"
                  purge="false" preload="true">
                <index path="/cache/index" />
                <data path="/cache/data" />
                <write-behind/>
              </soft-index-file-store-->
              <file-store fetch-state="true" preload="true" path="/cache">
                <write-behind/>
              </file-store>
            </persistence>
            <locking isolation="REPEATABLE_READ"/>
            <transaction mode="BATCH" auto-commit="true"/>
          </local-cache>

          <local-cache name="intraday">
            <expiration lifespan="86400000"/>
          </local-cache>
        </cache-container>
      </infinispan>

      The issue also occurs on 6.x versions.
      It occurs with different combinations of locking and transaction settings, that is, with both transactional and non-transactional caches and with different locking configurations. I tried all possible combinations, and the issue occurs intermittently with every one of them.

              Assignee: Unassigned
              Reporter: Alexandre Nikolov (Inactive) (alexandre.nikolov)
              Archiver: Amol Dongare (rhn-support-adongare)
