[SPARK-22548][SQL] Incorrect nested AND expression pushed down to JDBC data source
## What changes were proposed in this pull request?

Suppose we have the nested AND expression shown below, where p2 cannot be pushed down:

`(p1 AND p2) OR p3`

In the current Spark code, during data source filter translation, `(p1 AND p2)` is returned as p1 only and p2 is silently lost. This issue occurs with the JDBC data source and is similar to [SPARK-12218](https://github.com/apache/spark/pull/10362) for Parquet. When an AND is nested below another expression, we should push either both legs or nothing.

Note that:

- The current Spark code always splits a conjunctive predicate before it determines whether a predicate can be pushed down.
- Given `(p1 AND p2) AND p3`, the predicate is split into p1, p2, p3, so there is no nested AND expression.
- The current Spark logic for OR is correct: it pushes either both legs or nothing.

The same translation method is also called by Data Source V2.

## How was this patch tested?

Added new unit test cases to JDBCSuite.

gatorsmile

Author: Jia Li <jiali@us.ibm.com>

Closes #19776 from jliwork/spark-22548.
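The difference between the lossy and the all-or-nothing handling of a nested AND can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `Pred`, `Leaf`, `And`, `Or`, `translateLossy`, `translateStrict`, and `splitConjuncts` are hypothetical names invented for illustration, not Spark's classes or the code in `DataSourceStrategy.scala`.

```scala
// Hypothetical, simplified model of predicate pushdown; not Spark's actual API.
sealed trait Pred
case class Leaf(name: String, pushable: Boolean) extends Pred
case class And(left: Pred, right: Pred) extends Pred
case class Or(left: Pred, right: Pred) extends Pred

object PushdownSketch {
  // Lossy behaviour described in the PR: a nested AND keeps whichever leg translates.
  def translateLossy(p: Pred): Option[Pred] = p match {
    case Leaf(_, ok) => if (ok) Some(p) else None
    case And(l, r) =>
      (translateLossy(l), translateLossy(r)) match {
        case (Some(a), Some(b)) => Some(And(a, b))
        case (a, b)             => a.orElse(b) // one leg is silently dropped
      }
    case Or(l, r) =>
      for (a <- translateLossy(l); b <- translateLossy(r)) yield Or(a, b)
  }

  // All-or-nothing behaviour: a nested AND translates only if both legs do.
  def translateStrict(p: Pred): Option[Pred] = p match {
    case Leaf(_, ok) => if (ok) Some(p) else None
    case And(l, r) =>
      for (a <- translateStrict(l); b <- translateStrict(r)) yield And(a, b)
    case Or(l, r) =>
      for (a <- translateStrict(l); b <- translateStrict(r)) yield Or(a, b)
  }

  // Top-level conjunctions are split first, so each conjunct is judged on its own.
  def splitConjuncts(p: Pred): Seq[Pred] = p match {
    case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
    case other     => Seq(other)
  }

  def main(args: Array[String]): Unit = {
    val p1 = Leaf("p1", pushable = true)
    val p2 = Leaf("p2", pushable = false)
    val p3 = Leaf("p3", pushable = true)

    val nested = Or(And(p1, p2), p3)
    println(translateLossy(nested))  // p2 is missing from the translated filter
    println(translateStrict(nested)) // nothing is pushed down

    // A top-level AND is split, so p1 and p3 can still be pushed individually.
    println(splitConjuncts(And(And(p1, p2), p3)).flatMap(translateStrict))
  }
}
```

In this model, the lossy translation of `(p1 AND p2) OR p3` produces `p1 OR p3`, while the strict translation pushes nothing for that expression. For a top-level `(p1 AND p2) AND p3`, splitting first means p1 and p3 are still pushed and only p2 is left for evaluation outside the data source.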
Showing
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala (13 additions, 1 deletion)
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala (231 additions, 0 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala (6 additions, 3 deletions)