-
- Downloads
[SPARK-17829][SQL] Stable format for offset log
## What changes were proposed in this pull request? Currently we use java serialization for the WAL that stores the offsets contained in each batch. This has two main issues: It can break across spark releases (though this is not the only thing preventing us from upgrading a running query) It is unnecessarily opaque to the user. I'd propose we require offsets to provide a user readable serialization and use that instead. JSON is probably a good option. ## How was this patch tested? Tests were added for KafkaSourceOffset in [KafkaSourceOffsetSuite](external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceOffsetSuite.scala) and for LongOffset in [OffsetSuite](sql/core/src/test/scala/org/apache/spark/sql/streaming/OffsetSuite.scala) Please review https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before opening a pull request. zsxwing marmbrus Author: Tyson Condie <tcondie@gmail.com> Author: Tyson Condie <tcondie@clash.local> Closes #15626 from tcondie/spark-8360.
Showing
- external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/JsonUtils.scala 0 additions, 2 deletions.../main/scala/org/apache/spark/sql/kafka010/JsonUtils.scala
- external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala 18 additions, 1 deletion...ain/scala/org/apache/spark/sql/kafka010/KafkaSource.scala
- external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceOffset.scala 10 additions, 4 deletions...ala/org/apache/spark/sql/kafka010/KafkaSourceOffset.scala
- external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceOffsetSuite.scala 54 additions, 1 deletion...rg/apache/spark/sql/kafka010/KafkaSourceOffsetSuite.scala
- python/pyspark/sql/streaming.py 6 additions, 6 deletionspython/pyspark/sql/streaming.py
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala 10 additions, 13 deletions...rk/sql/execution/streaming/CompactibleFileStreamLog.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSinkLog.scala 0 additions, 8 deletions...che/spark/sql/execution/streaming/FileStreamSinkLog.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala 2 additions, 2 deletions...ache/spark/sql/execution/streaming/FileStreamSource.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSourceLog.scala 0 additions, 8 deletions...e/spark/sql/execution/streaming/FileStreamSourceLog.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala 13 additions, 9 deletions...pache/spark/sql/execution/streaming/HDFSMetadataLog.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/LongOffset.scala 20 additions, 1 deletion...org/apache/spark/sql/execution/streaming/LongOffset.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Offset.scala 35 additions, 1 deletion...ala/org/apache/spark/sql/execution/streaming/Offset.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala 9 additions, 6 deletions.../org/apache/spark/sql/execution/streaming/OffsetSeq.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeqLog.scala 80 additions, 0 deletions...g/apache/spark/sql/execution/streaming/OffsetSeqLog.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala 8 additions, 0 deletions...ala/org/apache/spark/sql/execution/streaming/Source.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala 5 additions, 6 deletions...pache/spark/sql/execution/streaming/StreamExecution.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamProgress.scala 2 additions, 2 deletions...apache/spark/sql/execution/streaming/StreamProgress.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala 17 additions, 15 deletions...ala/org/apache/spark/sql/execution/streaming/memory.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala 12 additions, 13 deletions...ala/org/apache/spark/sql/execution/streaming/socket.scala
- sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryException.scala 3 additions, 3 deletions.../apache/spark/sql/streaming/StreamingQueryException.scala
Loading
Please register or sign in to comment