  1. Infinispan
  2. ISPN-4440

Remove setMaxCollectorSize API from MapReduceTask


Details

    • Type: Task
    • Resolution: Done
    • Priority: Minor
    • Fix Version: 7.0.0.Alpha5
    • Affects Version: 7.0.0.Alpha4
    • Component: Clustered Executor
    • Labels: None

    Description

      While refining the parallel execution of the M/R algorithm, we introduced a maxCollectorSize abstraction at the level of MapReduceTask. The idea was that, during the map/combine phase, the number of intermediate keys/values held in a Collector could become very large. By limiting the size of the collector, intermediate keys/values are transferred to the intermediate cache in batches before the reduce phase is executed, which avoids OutOfMemoryError issues.
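      To illustrate the batching idea only, here is a minimal sketch of a size-bounded collector. It is not the actual Infinispan implementation: the class name BatchingCollector, the flush() method and the Consumer-based sink standing in for the intermediate cache are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of a size-bounded collector; names are illustrative
// and do not correspond to Infinispan's internal classes.
class BatchingCollector<K, V> {
    private final int maxCollectorSize;
    // Stand-in for "transfer this batch to the intermediate cache".
    private final Consumer<Map<K, List<V>>> intermediateSink;
    private Map<K, List<V>> buffer = new HashMap<>();
    private int buffered = 0; // number of values currently held in memory

    BatchingCollector(int maxCollectorSize, Consumer<Map<K, List<V>>> intermediateSink) {
        if (maxCollectorSize <= 0) {
            throw new IllegalArgumentException("maxCollectorSize must be positive");
        }
        this.maxCollectorSize = maxCollectorSize;
        this.intermediateSink = intermediateSink;
    }

    // Called by the map/combine phase for every intermediate key/value pair.
    void emit(K key, V value) {
        buffer.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        buffered++;
        if (buffered >= maxCollectorSize) {
            flush(); // hand off a full batch instead of growing without bound
        }
    }

    // Hands the buffered entries to the sink and starts a fresh buffer.
    // Must also be called once at the end of the phase for the last partial batch.
    void flush() {
        if (buffered == 0) {
            return;
        }
        Map<K, List<V>> batch = buffer;
        buffer = new HashMap<>();
        buffered = 0;
        intermediateSink.accept(batch);
    }
}
```

      With a limit of 3 and 7 emitted values, this sketch hands off two full batches during the phase and one remaining value on the final flush(), so at most 3 values are held in memory at any time.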

      However, after extensive performance testing, Alan Field, Dan Berindei and I concluded that a maxCollectorSize of 10000 entries gives the best trade-off between performance and memory use. There is therefore no need to expose this value to MapReduceTask users.

      That said, there might be some use cases where holding 10000 intermediate objects with a large memory footprint leads to OOM; in such cases users should allocate more heap for MapReduceTasks. We might consider reintroducing this API should such a need arise.

          People

            vblagoje Vladimir Blagojevic (Inactive)