Commit 3e48f3b9 authored by Dongjoon Hyun, committed by Shixiong Zhu

[SPARK-23434][SQL] Spark should not warn `metadata directory` for a HDFS file path

## What changes were proposed in this pull request?

In a kerberized cluster, when Spark reads a file path (e.g. `people.json`), it logs a misleading warning while looking up `people.json/_spark_metadata`. The root cause is a difference between `LocalFileSystem` and `DistributedFileSystem`: `LocalFileSystem.exists()` returns `false`, whereas `DistributedFileSystem.exists()` raises `org.apache.hadoop.security.AccessControlException`.

```scala
scala> spark.version
res0: String = 2.4.0-SNAPSHOT

scala> spark.read.json("file:///usr/hdp/current/spark-client/examples/src/main/resources/people.json").show
+----+-------+
| age|   name|
+----+-------+
|null|Michael|
|  30|   Andy|
|  19| Justin|
+----+-------+

scala> spark.read.json("hdfs:///tmp/people.json")
18/02/15 05:00:48 WARN streaming.FileStreamSink: Error while looking for metadata directory.
18/02/15 05:00:48 WARN streaming.FileStreamSink: Error while looking for metadata directory.
```

After this PR, the same read no longer logs the warning:
```scala
scala> spark.read.json("hdfs:///tmp/people.json").show
+----+-------+
| age|   name|
+----+-------+
|null|Michael|
|  30|   Andy|
|  19| Justin|
+----+-------+
```

## How was this patch tested?

Manual.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #20616 from dongjoon-hyun/SPARK-23434.
parent 83c00876
@@ -42,9 +42,11 @@ object FileStreamSink extends Logging {
   try {
     val hdfsPath = new Path(singlePath)
     val fs = hdfsPath.getFileSystem(hadoopConf)
-    val metadataPath = new Path(hdfsPath, metadataDir)
-    val res = fs.exists(metadataPath)
-    res
+    if (fs.isDirectory(hdfsPath)) {
+      fs.exists(new Path(hdfsPath, metadataDir))
+    } else {
+      false
+    }
   } catch {
     case NonFatal(e) =>
       logWarning(s"Error while looking for metadata directory.")
......
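
The effect of the change can be sketched without a cluster. The following is a toy model only, not Spark or Hadoop code: `ToyFs`, `StrictFs`, `MetadataCheck`, `oldCheck`, and `newCheck` are hypothetical names, the file system is an in-memory set of paths, and a plain `SecurityException` stands in for `AccessControlException`. It also assumes the `catch` branch returns `false` after warning, which the truncated diff above does not show.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for a Hadoop FileSystem (not a real Hadoop or Spark type).
trait ToyFs {
  def exists(path: String): Boolean
  def isDirectory(path: String): Boolean
}

// Models the behavior described for a secured HDFS: probing a child of a
// regular file raises an exception instead of returning false.
class StrictFs(files: Set[String], dirs: Set[String]) extends ToyFs {
  private def parentOf(path: String): String =
    path.take(math.max(path.lastIndexOf('/'), 0))

  def exists(path: String): Boolean = {
    if (files.contains(parentOf(path))) {
      // Stand-in for org.apache.hadoop.security.AccessControlException.
      throw new SecurityException(s"Permission denied: $path")
    }
    files.contains(path) || dirs.contains(path)
  }

  def isDirectory(path: String): Boolean = dirs.contains(path)
}

object MetadataCheck {
  val metadataDir = "_spark_metadata"

  // Before the patch: always probe <path>/_spark_metadata.
  def oldCheck(fs: ToyFs, path: String, warn: String => Unit): Boolean =
    try {
      fs.exists(s"$path/$metadataDir")
    } catch {
      case NonFatal(_) =>
        warn("Error while looking for metadata directory.")
        false // assumption: the elided part of the catch branch returns false
    }

  // After the patch: only probe when the path is a directory.
  def newCheck(fs: ToyFs, path: String, warn: String => Unit): Boolean =
    try {
      if (fs.isDirectory(path)) fs.exists(s"$path/$metadataDir") else false
    } catch {
      case NonFatal(_) =>
        warn("Error while looking for metadata directory.")
        false // assumption, as above
    }
}
```

In this model, calling `oldCheck` on a regular file such as `/tmp/people.json` triggers the warning callback, while `newCheck` returns `false` without probing the child path at all. A directory that does contain `_spark_metadata` is still detected by `newCheck`.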