[SPARK-12919][SPARKR] Implement dapply() on DataFrame in SparkR.
## What changes were proposed in this pull request?

dapply() applies an R function on each partition of a DataFrame and returns a new DataFrame.

The function signature is:

    dapply(df, function(localDF) {}, schema = NULL)

- R function input: local data.frame from the partition on local node
- R function output: local data.frame

Schema specifies the Row format of the resulting DataFrame. It must match the R function's output.
If schema is not specified, each partition of the result DataFrame will be serialized in R into a single byte array. Such resulting DataFrame can be processed by successive calls to dapply().

## How was this patch tested?

SparkR unit tests.

Author: Sun Rui <rui.sun@intel.com>
Author: Sun Rui <sunrui2016@gmail.com>

Closes #12493 from sun-rui/SPARK-12919.
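
For illustration, a minimal sketch of calling dapply() with an explicit schema. It assumes a SparkR 2.x or later installation with a local Spark available; the `sparkR.session()` entry point, the built-in `faithful` dataset, and the `waiting_secs` column name are choices made for this example and are not part of the patch itself.

```r
library(SparkR)

# Assumption: SparkR 2.x+ with a local Spark installation.
sparkR.session(master = "local[1]")

# Distribute R's built-in `faithful` dataset (columns: eruptions, waiting).
df <- createDataFrame(faithful)

# Row format of the result; it must match the data.frame the function returns.
schema <- structType(
  structField("eruptions", "double"),
  structField("waiting", "double"),
  structField("waiting_secs", "double")
)

# The function receives one partition as a local data.frame
# and must return a local data.frame.
result <- dapply(df, function(localDF) {
  cbind(localDF, waiting_secs = localDF$waiting * 60)
}, schema)

# Bring the transformed rows back to the driver as a local data.frame.
out <- collect(result)
head(out)

sparkR.session.stop()
```

Because the function runs once per partition, any per-row logic should be written as vectorized operations on `localDF` rather than relying on the size or order of a partition.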
Showing 15 changed files:
- R/pkg/NAMESPACE: 1 addition, 0 deletions
- R/pkg/R/DataFrame.R: 61 additions, 0 deletions
- R/pkg/R/generics.R: 4 additions, 0 deletions
- R/pkg/inst/tests/testthat/test_sparkSQL.R: 40 additions, 0 deletions
- R/pkg/inst/worker/worker.R: 35 additions, 1 deletion
- core/src/main/scala/org/apache/spark/api/r/RRDD.scala: 1 addition, 1 deletion
- core/src/main/scala/org/apache/spark/api/r/RRunner.scala: 10 additions, 3 deletions
- core/src/main/scala/org/apache/spark/api/r/SerDe.scala: 1 addition, 1 deletion
- docs/sql-programming-guide.md: 5 additions, 0 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala: 9 additions, 4 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala: 52 additions, 2 deletions
- sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala: 18 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala: 29 additions, 3 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala: 3 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala: 68 additions, 0 deletions