SPARK-2028: Expose mapPartitionsWithInputSplit in HadoopRDD
This allows users to gain access to the InputSplit which backs each partition. An alternative solution would have been a .withInputSplit() method returning a new RDD[(InputSplit, (K, V))], but that is confusing because such an RDD could not be cached or shuffled, as InputSplit is not inherently serializable.

Author: Aaron Davidson <aaron@databricks.com>

Closes #973 from aarondav/hadoop and squashes the following commits:

- 9c9112b [Aaron Davidson] Add JavaAPISuite test
- 9942cd7 [Aaron Davidson] Add Java API
- 1284a3a [Aaron Davidson] SPARK-2028: Expose mapPartitionsWithInputSplit in HadoopRDD
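For context, a minimal sketch of how the new Scala API might be used (the input path, application name, and take(5) are placeholders for illustration; the cast works because SparkContext.hadoopFile concretely returns a HadoopRDD):

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.{FileSplit, InputSplit, TextInputFormat}
import org.apache.spark.SparkContext
import org.apache.spark.rdd.HadoopRDD

object InputSplitDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local", "input-split-demo")

    // sc.hadoopFile returns an RDD whose concrete class is HadoopRDD,
    // so the cast below is safe. The path is a placeholder.
    val rdd = sc.hadoopFile("/tmp/logs", classOf[TextInputFormat],
      classOf[LongWritable], classOf[Text])

    // Tag each record with the name of the file backing its split.
    val withFileNames = rdd.asInstanceOf[HadoopRDD[LongWritable, Text]]
      .mapPartitionsWithInputSplit { (split: InputSplit, iter: Iterator[(LongWritable, Text)]) =>
        val file = split.asInstanceOf[FileSplit].getPath.getName
        // Text objects are reused by the record reader, so copy via toString.
        iter.map { case (_, line) => (file, line.toString) }
      }

    withFileNames.take(5).foreach(println)
    sc.stop()
  }
}
```

Because the function is applied per partition rather than per record, the InputSplit is only materialized on the executors, which avoids the serialization problem described above.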
Showing 7 changed files with 222 additions and 11 deletions:
- core/src/main/scala/org/apache/spark/api/java/JavaHadoopRDD.scala: 43 additions, 0 deletions
- core/src/main/scala/org/apache/spark/api/java/JavaNewHadoopRDD.scala: 43 additions, 0 deletions
- core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala: 13 additions, 8 deletions
- core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala: 32 additions, 0 deletions
- core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala: 34 additions, 0 deletions
- core/src/test/java/org/apache/spark/JavaAPISuite.java: 25 additions, 1 deletion
- core/src/test/scala/org/apache/spark/FileSuite.scala: 32 additions, 2 deletions