Enhancement
Resolution: Obsolete
Minor
Currently, in the intermediate phase [1], as the values for every KOut are collected and grouped across the cluster, we simply add each VOut to the list of values keyed by KOut. In the reduce phase we then send a reducer function that is invoked on the List<VOut> and reduces those values into a single VOut. This could be suboptimal: we could invoke the reducer function as each VOut gets grouped, rather than waiting for all VOut values to be collected and only then invoking the reducers. On the other hand, a reduce function could potentially take a long time to execute even for a single value, causing prolonged locking and congestion. We should implement this enhancement and do thorough performance measurements. This feature should have an API setting in MapReduceTask.
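To make the difference concrete, here is a minimal standalone sketch of the two strategies. It does not use the Infinispan API: the class name, both method names, and the pairwise `BinaryOperator<V>` reducer are assumptions made for illustration only. Eager reduction as sketched here is only valid when the reduce function is associative and does not depend on arrival order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BinaryOperator;

// Hypothetical illustration; none of these names come from Infinispan.
class EagerReduceSketch {

    // Strategy described as the current behaviour: group every value
    // under its key first, then reduce each complete list once.
    static <K, V> Map<K, V> collectThenReduce(List<Map.Entry<K, V>> emitted,
                                              BinaryOperator<V> reducer) {
        Map<K, List<V>> grouped = new ConcurrentHashMap<>();
        for (Map.Entry<K, V> e : emitted) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        Map<K, V> result = new ConcurrentHashMap<>();
        // Each list holds at least one value, so the Optional is never empty.
        grouped.forEach((k, values) -> result.put(k, values.stream().reduce(reducer).get()));
        return result;
    }

    // Proposed strategy: fold each value into a running result as it arrives.
    // ConcurrentHashMap.merge applies the function atomically per key, so a slow
    // reducer can block other updates, which is the locking concern raised above.
    static <K, V> Map<K, V> reduceEagerly(List<Map.Entry<K, V>> emitted,
                                          BinaryOperator<V> reducer) {
        Map<K, V> running = new ConcurrentHashMap<>();
        for (Map.Entry<K, V> e : emitted) {
            running.merge(e.getKey(), e.getValue(), reducer);
        }
        return running;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> emitted = List.of(
                Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3), Map.entry("a", 4));
        System.out.println(collectThenReduce(emitted, Integer::sum));
        System.out.println(reduceEagerly(emitted, Integer::sum));
    }
}
```

The eager variant never holds more than one value per key, which is where the memory saving would come from. Whether that outweighs the time spent inside the per-key critical section is what the performance measurements would need to show.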
Sanne Grinovero gets the credit for this idea.
[1] http://blog.infinispan.org/2012/07/mapreduce-improvements-in-infinispan.html
relates to: ISPN-4022 M/R: Run the combiner concurrently with the mapper (Closed)