-
- Downloads
[SPARK-22409] Introduce function type argument in pandas_udf
## What changes were proposed in this pull request? * Add a "function type" argument to pandas_udf. * Add a new public enum class `PandasUdfType` in pyspark.sql.functions * Refactor udf related code from pyspark.sql.functions to pyspark.sql.udf * Merge "PythonUdfType" and "PythonEvalType" into a single enum class "PythonEvalType" Example: ``` from pyspark.sql.functions import pandas_udf, PandasUDFType pandas_udf('double', PandasUDFType.SCALAR): def plus_one(v): return v + 1 ``` ## Design doc https://docs.google.com/document/d/1KlLaa-xJ3oz28xlEJqXyCAHU3dwFYkFs_ixcUXrJNTc/edit ## How was this patch tested? Added PandasUDFTests ## TODO: * [x] Implement proper enum type for `PandasUDFType` * [x] Update documentation * [x] Add more tests in PandasUDFTests Author: Li Jin <ice.xelloss@gmail.com> Closes #19630 from icexelloss/spark-22409-pandas-udf-type.
Showing
- core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala 5 additions, 3 deletions...main/scala/org/apache/spark/api/python/PythonRunner.scala
- python/pyspark/rdd.py 16 additions, 0 deletionspython/pyspark/rdd.py
- python/pyspark/serializers.py 0 additions, 7 deletionspython/pyspark/serializers.py
- python/pyspark/sql/catalog.py 4 additions, 3 deletionspython/pyspark/sql/catalog.py
- python/pyspark/sql/functions.py 77 additions, 145 deletionspython/pyspark/sql/functions.py
- python/pyspark/sql/group.py 9 additions, 40 deletionspython/pyspark/sql/group.py
- python/pyspark/sql/tests.py 168 additions, 53 deletionspython/pyspark/sql/tests.py
- python/pyspark/sql/udf.py 161 additions, 0 deletionspython/pyspark/sql/udf.py
- python/pyspark/worker.py 30 additions, 9 deletionspython/pyspark/worker.py
- sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala 5 additions, 4 deletions...scala/org/apache/spark/sql/RelationalGroupedDataset.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala 1 addition, 1 deletion...ache/spark/sql/execution/python/ArrowEvalPythonExec.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala 8 additions, 4 deletions...apache/spark/sql/execution/python/ExtractPythonUDFs.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/python/FlatMapGroupsInPandasExec.scala 1 addition, 1 deletion...park/sql/execution/python/FlatMapGroupsInPandasExec.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonUDF.scala 1 addition, 1 deletion...ala/org/apache/spark/sql/execution/python/PythonUDF.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/python/UserDefinedPythonFunction.scala 2 additions, 11 deletions...park/sql/execution/python/UserDefinedPythonFunction.scala
- sql/core/src/test/scala/org/apache/spark/sql/execution/python/BatchEvalPythonExecSuite.scala 2 additions, 2 deletions...spark/sql/execution/python/BatchEvalPythonExecSuite.scala
Loading
Please register or sign in to comment