Skip to content
Snippets Groups Projects
Commit ecd877e8 authored by felixcheung's avatar felixcheung Committed by Shivaram Venkataraman
Browse files

[SPARK-12224][SPARKR] R support for JDBC source

Add R API for `read.jdbc`, `write.jdbc`.

Tested this quite a bit manually with different combinations of parameters. It's not clear if we could have automated tests in R for this - Scala `JDBCSuite` depends on Java H2 in-memory database.

Refactored some code into util so they could be tested.

Core's R SerDe code needs to be updated to allow access to java.util.Properties as `jobj` handle which is required by DataFrameReader/Writer's `jdbc` method. It would be possible, though more code to add a `sql/r/SQLUtils` helper function.

Tested:
```
# with postgresql
../bin/sparkR --driver-class-path /usr/share/java/postgresql-9.4.1207.jre7.jar

# read.jdbc
df <- read.jdbc(sqlContext, "jdbc:postgresql://localhost/db", "films2", user = "user", password = "12345")
df <- read.jdbc(sqlContext, "jdbc:postgresql://localhost/db", "films2", user = "user", password = 12345)

# partitionColumn and numPartitions test
df <- read.jdbc(sqlContext, "jdbc:postgresql://localhost/db", "films2", partitionColumn = "did", lowerBound = 0, upperBound = 200, numPartitions = 4, user = "user", password = 12345)
a <- SparkR:::toRDD(df)
SparkR:::getNumPartitions(a)
[1] 4
SparkR:::collectPartition(a, 2L)

# defaultParallelism test
df <- read.jdbc(sqlContext, "jdbc:postgresql://localhost/db", "films2", partitionColumn = "did", lowerBound = 0, upperBound = 200, user = "user", password = 12345)
SparkR:::getNumPartitions(a)
[1] 2

# predicates test
df <- read.jdbc(sqlContext, "jdbc:postgresql://localhost/db", "films2", predicates = list("did<=105"), user = "user", password = 12345)
count(df) == 1

# write.jdbc, default save mode "error"
irisDf <- as.DataFrame(sqlContext, iris)
write.jdbc(irisDf, "jdbc:postgresql://localhost/db", "films2", user = "user", password = "12345")
"error, already exists"

write.jdbc(irisDf, "jdbc:postgresql://localhost/db", "iris", user = "user", password = "12345")
```

Author: felixcheung <felixcheung_m@hotmail.com>

Closes #10480 from felixcheung/rreadjdbc.
parent 008a8bbe
No related branches found
No related tags found
No related merge requests found
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment