[SPARK-15517][SQL][STREAMING] Add support for complete output mode in Structured Streaming
## What changes were proposed in this pull request?

Currently, Structured Streaming supports only the Append output mode. This PR adds the following:

- Added support for Complete output mode in the internal state store, analyzer, and planner.
- Added a public API in Scala and Python for users to specify the output mode.
- Added checks for unsupported combinations of output mode and DataFrame operations:
  - Plans with no aggregation should support only Append mode.
  - Plans with aggregation should support only Update and Complete modes.
  - The default output mode is Append (**Question: should we change this to automatically set to Complete mode when there is aggregation?**)
- Added support for Complete output mode in Memory Sink. Internally, Memory Sink supports Append, Complete, and Update modes, but the public API exposes only Complete and Append.

## How was this patch tested?

Unit tests in various test suites:

- StreamingAggregationSuite: tests for Complete mode.
- MemorySinkSuite: tests for checking behavior in Append and Complete modes.
- UnsupportedOperationsSuite: tests for checking unsupported combinations of DataFrame operations and output modes.
- DataFrameReaderWriterSuite: tests for checking that output mode cannot be set on static DataFrames.
- Python doc test and existing unit tests modified to call write.outputMode.

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #13286 from tdas/complete-mode.
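The output-mode rules listed in the description can be illustrated with a small, Spark-free sketch. Everything here (`OutputMode`, `check_output_mode`, `ToyMemorySink`) is a hypothetical stand-in written for illustration, not the code added by this PR. It assumes the rules exactly as stated: non-aggregating plans accept only Append, aggregating plans accept only Update and Complete, and a Complete-mode sink replaces its contents on every batch while an Append-mode sink accumulates rows. Update is not modeled in the toy sink because the description does not spell out its semantics.

```python
from enum import Enum


class OutputMode(Enum):
    """Hypothetical stand-in for the output modes named in the description."""
    APPEND = "append"
    UPDATE = "update"
    COMPLETE = "complete"


def check_output_mode(has_aggregation: bool, mode: OutputMode) -> None:
    """Raise ValueError for combinations the description lists as unsupported."""
    if has_aggregation and mode is OutputMode.APPEND:
        # Aggregating plans: only Update and Complete are allowed.
        raise ValueError("append mode is not supported for plans with aggregation")
    if not has_aggregation and mode is not OutputMode.APPEND:
        # Non-aggregating plans: only Append is allowed.
        raise ValueError(f"{mode.value} mode is not supported for plans without aggregation")


class ToyMemorySink:
    """Toy in-memory sink: Append accumulates rows, Complete replaces them."""

    def __init__(self, mode: OutputMode) -> None:
        if mode is OutputMode.UPDATE:
            raise ValueError("update mode is not modeled in this sketch")
        self.mode = mode
        self.rows = []

    def add_batch(self, batch) -> None:
        if self.mode is OutputMode.COMPLETE:
            # Each batch carries the full, updated result table.
            self.rows = list(batch)
        else:
            # Append: each batch carries only new rows.
            self.rows.extend(batch)


# Usage: a running count per key, written in Complete mode.
check_output_mode(has_aggregation=True, mode=OutputMode.COMPLETE)
sink = ToyMemorySink(OutputMode.COMPLETE)
sink.add_batch([("a", 1)])
sink.add_batch([("a", 2), ("b", 1)])
print(sink.rows)
```

In Complete mode the sink always holds the latest full aggregate, which is why that mode only makes sense when the plan contains an aggregation.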
Showing
- python/pyspark/sql/readwriter.py: 20 additions, 0 deletions
- python/pyspark/sql/tests.py: 4 additions, 3 deletions
- sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java: 54 additions, 0 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/InternalOutputModes.scala: 45 additions, 0 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala: 27 additions, 22 deletions
- sql/catalyst/src/test/java/org/apache/spark/sql/JavaOutputModeSuite.java: 12 additions, 4 deletions
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala: 38 additions, 25 deletions
- sql/core/src/main/scala/org/apache/spark/sql/ContinuousQueryManager.scala: 4 additions, 4 deletions
- sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala: 47 additions, 3 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala: 4 additions, 2 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala: 7 additions, 2 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala: 5 additions, 3 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulAggregate.scala: 51 additions, 19 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala: 0 additions, 1 deletion
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/console.scala: 3 additions, 2 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala: 27 additions, 13 deletions
- sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala: 2 additions, 1 deletion
- sql/core/src/test/scala/org/apache/spark/sql/StreamTest.scala: 8 additions, 8 deletions
- sql/core/src/test/scala/org/apache/spark/sql/streaming/ContinuousQueryManagerSuite.scala: 11 additions, 11 deletions
- sql/core/src/test/scala/org/apache/spark/sql/streaming/MemorySinkSuite.scala: 175 additions, 8 deletions