Commit 4243bb66 authored by Chia-Yung Su, committed by Michael Armbrust

[SPARK-3011][SQL] _temporary directory should be filtered out by sqlContext.parquetFile

Fix compile error on Hadoop 0.23 for pull request #1924.

Author: Chia-Yung Su <chiayung@appier.com>

Closes #1959 from joesu/bugfix-spark3011 and squashes the following commits:

be30793 [Chia-Yung Su] remove .* and _* except _metadata
8fe2398 [Chia-Yung Su] add note to explain
40ea9bd [Chia-Yung Su] fix hadoop-0.23 compile error
c7e44f2 [Chia-Yung Su] match syntax
f8fc32a [Chia-Yung Su] filter out tmp dir
parent 507a1b52
@@ -378,7 +378,7 @@ private[parquet] object ParquetTypesConverter extends Logging {
     val children = fs.listStatus(path).filterNot { status =>
       val name = status.getPath.getName
-      name(0) == '.' || name == FileOutputCommitter.SUCCEEDED_FILE_NAME
+      (name(0) == '.' || name(0) == '_') && name != ParquetFileWriter.PARQUET_METADATA_FILE
     }
     // NOTE (lian): Parquet "_metadata" file can be very slow if the file consists of lots of row
...
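To illustrate the effect of the changed predicate, here is a minimal standalone sketch that compares the old and new filter conditions on a few sample file names. It does not depend on Hadoop or Parquet: the object name, the helper names `skippedBefore` and `skippedAfter`, and the sample names are hypothetical, and the string constants are assumed to match `FileOutputCommitter.SUCCEEDED_FILE_NAME` ("_SUCCESS") and `ParquetFileWriter.PARQUET_METADATA_FILE` ("_metadata").

```scala
object HiddenFileFilterSketch {
  // Assumed value of FileOutputCommitter.SUCCEEDED_FILE_NAME.
  val SucceededFileName = "_SUCCESS"
  // Assumed value of ParquetFileWriter.PARQUET_METADATA_FILE.
  val ParquetMetadataFile = "_metadata"

  // Condition before this commit: skip dot-files and the "_SUCCESS" marker only.
  // Like the original code, this assumes `name` is non-empty.
  def skippedBefore(name: String): Boolean =
    name(0) == '.' || name == SucceededFileName

  // Condition after this commit: skip every name starting with '.' or '_',
  // except the Parquet "_metadata" file.
  def skippedAfter(name: String): Boolean =
    (name(0) == '.' || name(0) == '_') && name != ParquetMetadataFile

  def main(args: Array[String]): Unit = {
    // Hypothetical directory listing for illustration.
    val names = Seq(
      "part-r-1.parquet",
      "_temporary",
      "_SUCCESS",
      "_metadata",
      ".part-r-1.parquet.crc"
    )
    names.foreach { n =>
      println(s"$n: skippedBefore=${skippedBefore(n)}, skippedAfter=${skippedAfter(n)}")
    }
  }
}
```

Under these assumptions, the only sample name whose treatment changes is `_temporary`: the old condition kept it among the children, while the new one filters it out. `_metadata` is kept by both.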