hyukjinkwon
authored
## What changes were proposed in this pull request? This PR make `sample(...)` able to omit `withReplacement` defaulting to `FALSE`. In short, the following examples are allowed: ```r > df <- createDataFrame(as.list(seq(10))) > count(sample(df, fraction=0.5, seed=3)) [1] 4 > count(sample(df, fraction=1.0)) [1] 10 ``` In addition, this PR also adds some type checking logics as below: ```r > sample(df, fraction = "a") Error in sample(df, fraction = "a") : fraction must be numeric; however, got character > sample(df, fraction = 1, seed = NULL) Error in sample(df, fraction = 1, seed = NULL) : seed must not be NULL or NA; however, got NULL > sample(df, list(1), 1.0) Error in sample(df, list(1), 1) : withReplacement must be logical; however, got list > sample(df, fraction = -1.0) ... Error in sample : illegal argument - requirement failed: Sampling fraction (-1.0) must be on interval [0, 1] without replacement ``` ## How was this patch tested? Manually tested, unit tests added in `R/pkg/tests/fulltests/test_sparkSQL.R`. Author: hyukjinkwon <gurwls223@gmail.com> Closes #19243 from HyukjinKwon/SPARK-21780.