[SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_udf and add doctests
## What changes were proposed in this pull request?

This change disables 0-parameter pandas_udfs because the API for them is overly complex and awkward, and the restriction is easy to work around by passing an index column as an input argument. It also adds doctests for pandas_udfs, which revealed bugs in handling empty partitions and in using the pandas_udf decorator.

## How was this patch tested?

Reworked the existing 0-parameter test to verify that an error is raised, added a doctest for pandas_udf, and added new tests for empty partitions and decorator usage.

Author: Bryan Cutler <cutlerb@gmail.com>

Closes #19325 from BryanCutler/arrow-pandas_udf-0-param-remove-SPARK-22106.
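The index-column workaround mentioned in the description could look roughly like the sketch below. It is an illustration, not code from this patch: the names `constant_like` and `run_with_spark`, the constant 42, and the use of `spark.range` to supply the `id` column are all assumptions. The function ignores the values of its input and only uses its length, so it behaves like a 0-parameter UDF while still receiving one column, and it returns an empty result for an empty partition.

```python
import pandas as pd


def constant_like(index: pd.Series) -> pd.Series:
    """Return a constant column with one value per input row.

    The input values are ignored; only the length of the batch matters.
    An empty batch (empty partition) yields an empty int64 Series.
    """
    return pd.Series(42, index=index.index, dtype="int64")


def run_with_spark():
    """Hypothetical usage: wrap the function as a pandas_udf and feed it an index column.

    PySpark is imported lazily so the pure-pandas part above runs without Spark.
    """
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, pandas_udf
    from pyspark.sql.types import LongType

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    const_udf = pandas_udf(constant_like, LongType())
    # spark.range(5) provides an "id" column that serves as the index input.
    return spark.range(5).select(const_udf(col("id")).alias("answer")).collect()
```

Calling `constant_like` directly on a pandas Series is a quick way to check the batch logic, including the empty-partition case, before running it through Spark.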
Changed files:
- python/pyspark/serializers.py: 3 additions, 12 deletions
- python/pyspark/sql/functions.py: 25 additions, 7 deletions
- python/pyspark/sql/tests.py: 45 additions, 14 deletions
- python/pyspark/worker.py: 10 additions, 15 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala: 6 additions, 4 deletions