[SPARK-13926] Automatically use Kryo serializer when shuffling RDDs with simple types
Because ClassTags are available when constructing ShuffledRDD, we can use them to automatically select Kryo for shuffle serialization when the RDD's types are known to be compatible with Kryo.

This patch introduces `SerializerManager`, a component which picks the "best" serializer for a shuffle given the elements' ClassTags. It will automatically pick a Kryo serializer for ShuffledRDDs whose key, value, and/or combiner types are primitives, arrays of primitives, or strings. In the future we can use this class as a narrow extension point to integrate specialized serializers for other types, such as ByteBuffers.

In a planned follow-up patch, I will extend the BlockManager APIs so that we're able to use similar automatic serializer selection when caching RDDs (this is a little trickier because the ClassTags need to be threaded through many more places).

Author: Josh Rosen <joshrosen@databricks.com>

Closes #11755 from JoshRosen/automatically-pick-best-serializer.
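To make the selection rule in the description concrete, here is a minimal, self-contained sketch of how a serializer could be chosen from ClassTags. It is not the code from this patch: the object name `SerializerSelectionSketch`, the `SerializerChoice` type, and the `canUseKryo` and `choose` helpers are all hypothetical, and the sketch returns a marker value rather than a real Spark `Serializer` so that it runs without Spark on the classpath.

```scala
import scala.reflect.ClassTag

// Hypothetical illustration only; names and structure are assumptions,
// not the SerializerManager added by this patch.
object SerializerSelectionSketch {
  sealed trait SerializerChoice
  case object UseKryo extends SerializerChoice
  case object UseDefault extends SerializerChoice

  // JVM primitive classes (classOf[Int] is the primitive `int` class).
  private val primitiveClasses: Set[Class[_]] = Set[Class[_]](
    classOf[Boolean], classOf[Byte], classOf[Char], classOf[Short],
    classOf[Int], classOf[Long], classOf[Float], classOf[Double])

  // A type qualifies if it is a primitive, a one-dimensional array of
  // primitives, or String, mirroring the rule stated in the description.
  def canUseKryo(ct: ClassTag[_]): Boolean = {
    val cls = ct.runtimeClass
    primitiveClasses.contains(cls) ||
      (cls.isArray && primitiveClasses.contains(cls.getComponentType)) ||
      cls == classOf[String]
  }

  // Pick Kryo only when both the key and the value type qualify;
  // otherwise fall back to whatever default serializer is configured.
  def choose(keyTag: ClassTag[_], valueTag: ClassTag[_]): SerializerChoice =
    if (canUseKryo(keyTag) && canUseKryo(valueTag)) UseKryo else UseDefault
}
```

Under these assumptions, `choose(classTag[Int], classTag[String])` yields `UseKryo`, while a pair involving a type such as `List[Int]` yields `UseDefault`. Requiring every involved type to qualify keeps the fallback conservative: any type that is not on the known-safe list keeps the existing default serializer.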
Changed files:
- core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java (1 addition, 1 deletion)
- core/src/main/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriter.java (1 addition, 1 deletion)
- core/src/main/scala/org/apache/spark/Dependency.scala (4 additions, 4 deletions)
- core/src/main/scala/org/apache/spark/SparkEnv.scala (5 additions, 1 deletion)
- core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala (2 additions, 2 deletions)
- core/src/main/scala/org/apache/spark/rdd/ShuffledRDD.scala (10 additions, 2 deletions)
- core/src/main/scala/org/apache/spark/rdd/SubtractedRDD.scala (1 addition, 9 deletions)
- core/src/main/scala/org/apache/spark/serializer/Serializer.scala (0 additions, 12 deletions)
- core/src/main/scala/org/apache/spark/serializer/SerializerManager.scala (71 additions, 0 deletions)
- core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala (2 additions, 3 deletions)
- core/src/main/scala/org/apache/spark/shuffle/hash/HashShuffleWriter.scala (2 additions, 4 deletions)
- core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleManager.scala (2 additions, 4 deletions)
- core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala (2 additions, 3 deletions)
- core/src/test/java/org/apache/spark/shuffle/sort/UnsafeShuffleWriterSuite.java (1 addition, 1 deletion)
- core/src/test/scala/org/apache/spark/shuffle/BlockStoreShuffleReaderSuite.scala (1 addition, 1 deletion)
- core/src/test/scala/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriterSuite.scala (1 addition, 1 deletion)
- core/src/test/scala/org/apache/spark/shuffle/sort/SortShuffleManagerSuite.scala (4 additions, 4 deletions)
- core/src/test/scala/org/apache/spark/util/collection/ExternalSorterSuite.scala (17 additions, 17 deletions)
- project/MimaExcludes.scala (5 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala (1 addition, 1 deletion)