Commit 8a911051 authored by Kai Jiang, committed by Shivaram Venkataraman

[MINOR][R][DOC] Fix R documentation generation instruction.

## What changes were proposed in this pull request?
Changes in R/README.md and R/DOCUMENTATION.md:

- Make the steps for generating SparkR documentation clearer.
- Link R/DOCUMENTATION.md from R/README.md.
- Turn on code syntax highlighting in R/README.md.

## How was this patch tested?
local test

Author: Kai Jiang <jiangkai@gmail.com>

Closes #13488 from vectorijk/R-Readme.
parent 372fa61f
# SparkR Documentation
SparkR documentation is generated from in-source comments annotated with [`roxygen2`](https://cran.r-project.org/web/packages/roxygen2/index.html). After making changes to the documentation, you can generate the man pages by running the following from an R console in the SparkR home directory:
```R
library(devtools)
devtools::document(pkg="./pkg", roclets=c("rd"))
```
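For reference, roxygen2 reads specially formatted `#'` comments placed above each function. A minimal annotated function (a hypothetical example, not taken from the SparkR sources) looks like this:
```R
#' Add one to a number
#'
#' @param x A numeric value.
#' @return The value of \code{x} plus one.
#' @export
#' @examples
#' plusOne(41)
plusOne <- function(x) {
  x + 1
}
```
Running `devtools::document()` as shown above turns each such comment block into a man page (here, `man/plusOne.Rd`).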
You can verify that your changes are correct by running
`R CMD check pkg/`
@@ -7,8 +7,7 @@ SparkR is an R package that provides a light-weight frontend to use Spark from R
Libraries of SparkR need to be created in `$SPARK_HOME/R/lib`. This can be done by running the script `$SPARK_HOME/R/install-dev.sh`.
By default the above script uses the system-wide installation of R. However, this can be changed to any user-installed location of R by setting the environment variable `R_HOME` to the full path of the base directory where R is installed, before running the `install-dev.sh` script.
Example:
```bash
# where /home/username/R is where R is installed and /home/username/R/bin contains the files R and Rscript
export R_HOME=/home/username/R
./install-dev.sh
```
@@ -20,8 +19,8 @@ export R_HOME=/home/username/R
Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn) and include the `-Psparkr` profile to build the R package. For example, to use the default Hadoop versions you can run
```bash
build/mvn -DskipTests -Psparkr package
```
#### Running sparkR
@@ -40,9 +39,8 @@ To set other options like driver memory, executor memory etc. you can pass in th
#### Using SparkR from RStudio
If you wish to use SparkR from RStudio or other R frontends you will need to set some environment variables which point SparkR to your Spark installation. For example
```R
# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/username/spark")
# This line loads SparkR from the installed directory
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
```
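With the environment set up as above, SparkR can then be used interactively from the R console. The entry-point API has changed across Spark versions; the sketch below assumes the Spark 1.x-era functions `sparkR.init` and `sparkRSQL.init`:
```R
# Start a local Spark context and the SQL entry point (Spark 1.x-era API)
sc <- sparkR.init(master = "local[2]", appName = "SparkR-RStudio")
sqlContext <- sparkRSQL.init(sc)

# Create a distributed DataFrame from a local data set and inspect it
df <- createDataFrame(sqlContext, faithful)
head(df)

sparkR.stop()
```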
@@ -59,25 +57,25 @@ Once you have made your changes, please include unit tests for them and run exis
#### Generating documentation
The SparkR documentation (Rd files and HTML files) is not part of the source repository. To generate it, run the script `R/create-docs.sh`. This script uses `devtools` and `knitr` to generate the docs, and these packages need to be installed on the machine before using the script. Also, you may need to install these [prerequisites](https://github.com/apache/spark/tree/master/docs#prerequisites). See also `R/DOCUMENTATION.md`.
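For example, one way to install both packages from an R console is sketched below (the CRAN mirror URL is an assumption, not prescribed by the script):
```R
# Install the packages that create-docs.sh depends on
install.packages(c("devtools", "knitr"), repos = "http://cran.us.r-project.org")
```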
### Examples, Unit tests
SparkR comes with several sample programs in the `examples/src/main/r` directory.
To run one of them, use `./bin/spark-submit <filename> <args>`. For example:
```bash
./bin/spark-submit examples/src/main/r/dataframe.R
```
You can also run the unit tests for SparkR. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first, and then run:
```bash
R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh
```
### Running on YARN
The `./bin/spark-submit` script can also be used to submit jobs to YARN clusters. You will need to set the YARN conf dir before doing so. For example, on CDH you can run
```bash
export YARN_CONF_DIR=/etc/hadoop/conf
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
```