Commit e52b8719 authored by Sandy Ryza, committed by Matei Zaharia

SPARK-2553. CoGroupedRDD unnecessarily allocates a Tuple2 per dependency per key

My humble opinion is that avoiding allocations in this performance-critical section is worth the extra code.

Author: Sandy Ryza <sandy@cloudera.com>

Closes #1461 from sryza/sandy-spark-2553 and squashes the following commits:

7eaf7f2 [Sandy Ryza] SPARK-2553. CoGroupedRDD unnecessarily allocates a Tuple2 per dependency per key
parent 29809a6d
@@ -180,7 +180,11 @@ class CoGroupedRDD[K](@transient var rdds: Seq[RDD[_ <: Product2[K, _]]], part:
       }
     val mergeCombiners: (CoGroupCombiner, CoGroupCombiner) => CoGroupCombiner =
       (combiner1, combiner2) => {
-        combiner1.zip(combiner2).map { case (v1, v2) => v1 ++ v2 }
+        var depNum = 0
+        while (depNum < numRdds) {
+          combiner1(depNum) ++= combiner2(depNum)
+          depNum += 1
+        }
       }
     new ExternalAppendOnlyMap[K, CoGroupValue, CoGroupCombiner](
       createCombiner, mergeValue, mergeCombiners)
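To make the allocation difference concrete, here is a minimal standalone sketch of the two merge strategies in the hunk above. It is not Spark code: the `Combiner` alias, the object name, and both method names are hypothetical stand-ins, and it assumes `CoGroupCombiner` behaves like a sequence of growable buffers with one buffer per parent RDD. The zip-based version creates one `Tuple2` per dependency plus a new buffer and a new outer sequence on every merge, while the loop appends into the existing buffers. Also note that a `while` loop in Scala evaluates to `Unit`, so a function declared to return a combiner has to yield one explicitly; the sketch returns the first argument, which is an assumption about the intended result and not a line shown in this diff.

```scala
import scala.collection.mutable.ArrayBuffer

object MergeCombinersSketch {
  // Hypothetical stand-in for CoGroupCombiner: one growable buffer per parent RDD.
  type Combiner = Seq[ArrayBuffer[Any]]

  // Removed approach: zip allocates a Tuple2 per dependency, `++` allocates a
  // fresh buffer per dependency, and map builds a new outer sequence.
  def mergeWithZip(c1: Combiner, c2: Combiner): Combiner =
    c1.zip(c2).map { case (v1, v2) => v1 ++ v2 }

  // Added approach: append c2's buffers into c1's buffers in place.
  // No tuples or new buffers are created; c1 is mutated and returned.
  def mergeInPlace(c1: Combiner, c2: Combiner, numRdds: Int): Combiner = {
    var depNum = 0
    while (depNum < numRdds) {
      c1(depNum) ++= c2(depNum)
      depNum += 1
    }
    c1 // assumption: the while loop itself yields Unit, so return the combiner
  }

  def main(args: Array[String]): Unit = {
    val a: Combiner = Seq(ArrayBuffer[Any](1), ArrayBuffer[Any]("x"))
    val b: Combiner = Seq(ArrayBuffer[Any](2), ArrayBuffer[Any]("y"))

    val merged = mergeInPlace(a, b, numRdds = 2)
    assert(merged eq a)                             // same instance, mutated in place
    assert(merged(0) == ArrayBuffer[Any](1, 2))
    assert(merged(1) == ArrayBuffer[Any]("x", "y"))
  }
}
```

The in-place version trades immutability for fewer allocations: callers must not reuse the first combiner after the merge expecting its old contents.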