-
- Downloads
[SPARK-13667][SQL] Support for specifying custom date format for date and...
[SPARK-13667][SQL] Support for specifying custom date format for date and timestamp types at CSV datasource. ## What changes were proposed in this pull request? This PR adds the support to specify custom date format for `DateType` and `TimestampType`. For `TimestampType`, this uses the given format to infer schema and also to convert the values For `DateType`, this uses the given format to convert the values. If the `dateFormat` is not given, then it works with `DateTimeUtils.stringToTime()` for backwords compatibility. When it's given, then it uses `SimpleDateFormat` for parsing data. In addition, `IntegerType`, `DoubleType` and `LongType` have a higher priority than `TimestampType` in type inference. This means even if the given format is `yyyy` or `yyyy.MM`, it will be inferred as `IntegerType` or `DoubleType`. Since it is type inference, I think it is okay to give such precedences. In addition, I renamed `csv.CSVInferSchema` to `csv.InferSchema` as JSON datasource has `json.InferSchema`. Although they have the same names, I did this because I thought the parent package name can still differentiate each. Accordingly, the suite name was also changed from `CSVInferSchemaSuite` to `InferSchemaSuite`. ## How was this patch tested? unit tests are used and `./dev/run_tests` for coding style tests. Author: hyukjinkwon <gurwls223@gmail.com> Closes #11550 from HyukjinKwon/SPARK-13667.
Showing
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala 50 additions, 32 deletions.../spark/sql/execution/datasources/csv/CSVInferSchema.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala 7 additions, 0 deletions...ache/spark/sql/execution/datasources/csv/CSVOptions.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala 2 additions, 1 deletion...che/spark/sql/execution/datasources/csv/CSVRelation.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala 1 addition, 1 deletion...e/spark/sql/execution/datasources/csv/DefaultSource.scala
- sql/core/src/test/resources/dates.csv 4 additions, 0 deletionssql/core/src/test/resources/dates.csv
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchemaSuite.scala 47 additions, 31 deletions...k/sql/execution/datasources/csv/CSVInferSchemaSuite.scala
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala 51 additions, 1 deletion...apache/spark/sql/execution/datasources/csv/CSVSuite.scala
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVTypeCastSuite.scala 11 additions, 0 deletions...park/sql/execution/datasources/csv/CSVTypeCastSuite.scala
Loading
Please register or sign in to comment