[SPARK-15718][SQL] better error message for writing bucketed data
## What changes were proposed in this pull request?

Currently we don't support bucketing for `save` and `insertInto`.

For `save`, we just write the data out into a directory the user specified. It's not a table, so we don't keep its metadata. When we read the data back, we have no idea whether it is bucketed or not, so it doesn't make sense to use `save` to write bucketed data: we can't use the bucket information anyway. We can support this in the future, once we have features like bucket discovery, or once we save bucket information in the data directory too, so that we don't need to rely on a metastore.

For `insertInto`, the data is inserted into an existing table, so it doesn't make sense to specify bucket information: we should get the bucket information from the existing table.

This PR improves the error message for these two cases.

## How was this patch tested?

New test in `BucketedWriteSuite`.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #13452 from cloud-fan/error-msg.
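The rule described in the commit message can be sketched as a small, dependency-free model. This is only an illustration of the guard, not the code in `DataFrameWriter.scala`: the names `BucketSpec`, `SketchWriter`, `UnsupportedBucketingException`, and `assertNotBucketed`, as well as the message text, are assumptions made for this sketch.

```scala
// Simplified model of "bucketing is rejected for save and insertInto".
// All types and messages here are hypothetical, not Spark's actual classes.
final case class BucketSpec(numBuckets: Int, columns: Seq[String])

final class UnsupportedBucketingException(message: String)
  extends Exception(message)

final case class SketchWriter(bucketSpec: Option[BucketSpec] = None) {

  // Record the bucket information on a new writer instance.
  def bucketBy(numBuckets: Int, first: String, rest: String*): SketchWriter =
    copy(bucketSpec = Some(BucketSpec(numBuckets, first +: rest)))

  // Fail early, naming the operation, when bucket information was given.
  private def assertNotBucketed(operation: String): Unit =
    if (bucketSpec.isDefined)
      throw new UnsupportedBucketingException(
        s"'$operation' does not support bucketing")

  // Writing to a plain directory keeps no metadata, so bucketing is rejected.
  def save(path: String): String = {
    assertNotBucketed("save")
    s"saved to $path"
  }

  // The existing table already defines its bucketing, so a new spec is rejected.
  def insertInto(table: String): String = {
    assertNotBucketed("insertInto")
    s"inserted into $table"
  }

  // Creating a table stores metadata, so a bucket spec is accepted here.
  def saveAsTable(table: String): String = {
    val buckets = bucketSpec.map(_.numBuckets).getOrElse(0)
    s"saved as table $table with $buckets buckets"
  }
}
```

In this model, `SketchWriter().bucketBy(8, "id").saveAsTable("t")` succeeds, while calling `save` or `insertInto` on the same bucketed writer throws an exception whose message names the offending operation.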
Showing 3 changed files:
- sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala (5 additions, 5 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataFrameReaderWriterSuite.scala (2 additions, 2 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala (15 additions, 4 deletions)