Commit f44ead89, authored by aokolnychyi, committed by gatorsmile

[SPARK-21538][SQL] Attribute resolution inconsistency in the Dataset API

## What changes were proposed in this pull request?

This PR contains a tiny update that removes an attribute resolution inconsistency in the Dataset API. The following example is taken from the ticket description:

```
spark.range(1).withColumnRenamed("id", "x").sort(col("id"))  // works
spark.range(1).withColumnRenamed("id", "x").sort($"id")  // works
spark.range(1).withColumnRenamed("id", "x").sort('id) // works
spark.range(1).withColumnRenamed("id", "x").sort("id") // fails with:
// org.apache.spark.sql.AnalysisException: Cannot resolve column name "id" among (x);
```
The above `AnalysisException` happens because the last case calls `Dataset.apply()` to convert strings into columns, which triggers attribute resolution. To make the API consistent between overloaded methods, this PR defers the resolution and constructs columns directly.
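
The difference between resolving a name immediately and deferring it can be sketched with a toy model in plain Scala. This is not Spark's implementation: `ToyDataset`, `Col`, `eager`, `deferred`, and `analyze` are hypothetical names invented for illustration, and the sketch assumes that the later analysis step may also consult the columns of the underlying plan when resolving sort keys.

```scala
// Toy model only: these types are hypothetical and are not Spark classes.
sealed trait Col
final case class Unresolved(name: String) extends Col            // name stored as-is, checked later
final case class Resolved(name: String, ordinal: Int) extends Col

// output: columns visible on this dataset; childOutput: columns of the plan beneath it.
final case class ToyDataset(output: Seq[String], childOutput: Seq[String]) {

  // Eager lookup: only the current output is consulted, so a renamed-away name fails at once.
  def eager(name: String): Col = {
    val i = output.indexOf(name)
    if (i < 0) {
      val cols = output.mkString(", ")
      throw new IllegalArgumentException(s"Cannot resolve column name $name among ($cols)")
    }
    Resolved(name, i)
  }

  // Deferred lookup: wrap the name and leave the decision to a later analysis step.
  def deferred(name: String): Col = Unresolved(name)

  // Stand-in for an analysis step that (by assumption) can also see the child's columns.
  def analyze(c: Col): Resolved = c match {
    case r: Resolved => r
    case Unresolved(n) =>
      val i = (output ++ childOutput).indexOf(n)
      require(i >= 0, s"cannot resolve $n")
      Resolved(n, i)
  }
}
```

In this model, `ToyDataset(Seq("x"), Seq("id")).eager("id")` throws, while `analyze(deferred("id"))` succeeds, which mirrors the inconsistency between the string overload and the `Column` overloads described above.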

Author: aokolnychyi <anton.okolnychyi@sap.com>

Closes #18740 from aokolnychyi/spark-21538.
parent 9f5647d6
@@ -1108,7 +1108,7 @@ class Dataset[T] private[sql](
    */
   @scala.annotation.varargs
   def sort(sortCol: String, sortCols: String*): Dataset[T] = {
-    sort((sortCol +: sortCols).map(apply) : _*)
+    sort((sortCol +: sortCols).map(Column(_)) : _*)
   }
   /**
...
@@ -1304,6 +1304,19 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
       assert(rlike3.count() == 0)
     }
   }
+  test("SPARK-21538: Attribute resolution inconsistency in Dataset API") {
+    val df = spark.range(3).withColumnRenamed("id", "x")
+    val expected = Row(0) :: Row(1) :: Row (2) :: Nil
+    checkAnswer(df.sort("id"), expected)
+    checkAnswer(df.sort(col("id")), expected)
+    checkAnswer(df.sort($"id"), expected)
+    checkAnswer(df.sort('id), expected)
+    checkAnswer(df.orderBy("id"), expected)
+    checkAnswer(df.orderBy(col("id")), expected)
+    checkAnswer(df.orderBy($"id"), expected)
+    checkAnswer(df.orderBy('id), expected)
+  }
 }
 case class WithImmutableMap(id: String, map_test: scala.collection.immutable.Map[Long, String])
...