  1. Aug 04, 2015
    • Burak Yavuz's avatar
      [SPARK-8313] R Spark packages support · c9a4c36d
      Burak Yavuz authored
      shivaram cafreeman Could you please help me test this out? Exposing and running `rPackageBuilder` from inside the shell works, but for some reason I can't get it to work during Spark Submit; it just keeps relaunching Spark Submit.
      
      For testing, you may use the R branch with [sbt-spark-package](https://github.com/databricks/sbt-spark-package). You can call spPackage, and then pass the jar using `--jars`.
      
      Author: Burak Yavuz <brkyvz@gmail.com>
      
      Closes #7139 from brkyvz/r-submit and squashes the following commits:
      
      0de384f [Burak Yavuz] remove unused imports 2
      d253708 [Burak Yavuz] removed unused imports
      6603d0d [Burak Yavuz] addressed comments
      4258ffe [Burak Yavuz] merged master
      ddfcc06 [Burak Yavuz] added zipping test
      3a1be7d [Burak Yavuz] don't zip
      77995df [Burak Yavuz] fix URI
      ac45527 [Burak Yavuz] added zipping of all libs
      e6bf7b0 [Burak Yavuz] add println ignores
      1bc5554 [Burak Yavuz] add assumes for tests
      9778e03 [Burak Yavuz] addressed comments
      b42b300 [Burak Yavuz] merged master
      ffd134e [Burak Yavuz] Merge branch 'master' of github.com:apache/spark into r-submit
      d867756 [Burak Yavuz] add apache header
      eff5ba1 [Burak Yavuz] ready for review
      8838edb [Burak Yavuz] Merge branch 'master' of github.com:apache/spark into r-submit
      e5b5a06 [Burak Yavuz] added doc
      bb751ce [Burak Yavuz] fix null bug
      0226768 [Burak Yavuz] fixed issues
      8810beb [Burak Yavuz] R packages support
      c9a4c36d
  2. Jul 31, 2015
    • Hossein's avatar
      [SPARK-9318] [SPARK-9320] [SPARKR] Aliases for merge and summary functions on DataFrames · 712f5b7a
      Hossein authored
      This PR adds synonyms for ```merge``` and ```summary``` in the SparkR DataFrame API.
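      A minimal usage sketch of the new aliases, assuming two SparkR DataFrames `df1` and `df2` that share a join key (the column name `key` is hypothetical, and the join expression mirrors `join()`'s signature, which the alias follows per the squashed commits below):

      ```R
      # Per this PR, summary() aliases describe() and merge() aliases join().
      stats  <- summary(df1)                          # same as describe(df1)
      joined <- merge(df1, df2, df1$key == df2$key)   # same as join(df1, df2, ...)
      showDF(stats)
      ```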
      
      cc shivaram
      
      Author: Hossein <hossein@databricks.com>
      
      Closes #7806 from falaki/SPARK-9320 and squashes the following commits:
      
      72600f7 [Hossein] Updated docs
      92a6e75 [Hossein] Fixed merge generic signature issue
      4c2b051 [Hossein] Fixing naming with mllib summary
      0f3a64c [Hossein] Added ... to generic for merge
      30fbaf8 [Hossein] Merged master
      ae1a4cf [Hossein] Merge branch 'master' into SPARK-9320
      e8eb86f [Hossein] Add a generic for merge
      fc01f2d [Hossein] Added unit test
      8d92012 [Hossein] Added merge as an alias for join
      5b8bedc [Hossein] Added unit test
      632693d [Hossein] Added summary as an alias for describe for DataFrame
      712f5b7a
    • Hossein's avatar
      [SPARK-9324] [SPARK-9322] [SPARK-9321] [SPARKR] Some aliases for R-like functions in DataFrames · 710c2b5d
      Hossein authored
      Adds the following aliases (a usage sketch follows the list):
      * unique (distinct)
      * rbind (unionAll): accepts many DataFrames
      * nrow (count)
      * ncol
      * dim
      * names (columns): along with the replacement function to change names
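      A minimal sketch of the aliases in use, assuming a two-column SparkR DataFrame `df` (the replacement column names are hypothetical):

      ```R
      nrow(df)                    # alias for count(df)
      ncol(df)                    # number of columns
      dim(df)                     # c(nrow, ncol)
      names(df)                   # alias for columns(df)
      names(df) <- c("a", "b")    # replacement form renames the columns
      u    <- unique(df)          # alias for distinct(df)
      both <- rbind(df, df)       # alias for unionAll(); accepts several DataFrames
      ```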
      
      Author: Hossein <hossein@databricks.com>
      
      Closes #7764 from falaki/sparkR-alias and squashes the following commits:
      
      56016f5 [Hossein] Updated R documentation
      5e4a4d0 [Hossein] Removed extra code
      f51cbef [Hossein] Merge branch 'master' into sparkR-alias
      c1b88bd [Hossein] Moved setGeneric and other comments applied
      d9307f8 [Hossein] Added tests
      b5aa988 [Hossein] Added dim, ncol, nrow, names, rbind, and unique functions to DataFrames
      710c2b5d
    • Shivaram Venkataraman's avatar
      [SPARK-9510] [SPARKR] Remaining SparkR style fixes · 82f47b81
      Shivaram Venkataraman authored
      With the change in this patch, I get no more warnings from `./dev/lint-r` on my machine.
      
      Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
      
      Closes #7834 from shivaram/sparkr-style-fixes and squashes the following commits:
      
      716cd8e [Shivaram Venkataraman] Remaining SparkR style fixes
      82f47b81
    • Yu ISHIKAWA's avatar
      [SPARK-9053] [SPARKR] Fix spaces around parens, infix operators etc. · fc0e57e5
      Yu ISHIKAWA authored
      ### JIRA
      [[SPARK-9053] Fix spaces around parens, infix operators etc. - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9053)
      
      ### The Result of `lint-r`
      [The result of lint-r at revision a4c83cb1](https://gist.github.com/yu-iskw/d253d7f8ef351f86443d)
      
      Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #7584 from yu-iskw/SPARK-9053 and squashes the following commits:
      
      613170f [Yu ISHIKAWA] Ignore a warning about a space before a left parentheses
      ede61e1 [Yu ISHIKAWA] Ignores two warnings about a space before a left parentheses. TODO: After updating `lintr`, we will remove the ignores
      de3e0db [Yu ISHIKAWA] Add '## nolint start' & '## nolint end' statement to ignore infix space warnings
      e233ea8 [Yu ISHIKAWA] [SPARK-9053][SparkR] Fix spaces around parens, infix operators etc.
      fc0e57e5
  3. Jul 30, 2015
    • Hossein's avatar
      [SPARK-8742] [SPARKR] Improve SparkR error messages for DataFrame API · 157840d1
      Hossein authored
      This patch improves SparkR error message reporting, especially in the DataFrame API. When there is a user error (e.g., a malformed SQL query), the message of the underlying cause is sent back through the RPC, and the R client reads it and returns it to the user.
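      For illustration only, a hedged sketch of the kind of user error this improves, assuming a `sqlContext` created with `sparkRSQL.init()` (the query and table name are made up):

      ```R
      # Before this patch a malformed query surfaced a generic backend error;
      # with this change the cause message from the JVM reaches the R user.
      result <- tryCatch(
        sql(sqlContext, "SELCT * FROM nonexistent_table"),   # deliberate typo in SQL
        error = function(e) conditionMessage(e)              # now carries the real cause
      )
      ```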
      
      cc shivaram
      
      Author: Hossein <hossein@databricks.com>
      
      Closes #7742 from falaki/SPARK-8742 and squashes the following commits:
      
      4f643c9 [Hossein] Not logging exceptions in RBackendHandler
      4a8005c [Hossein] Returning stack track of causing exception from RBackendHandler
      5cf17f0 [Hossein] Adding unit test for error messages from SQLContext
      2af75d5 [Hossein] Reading error message in case of failure and stoping with that message
      f479c99 [Hossein] Wrting exception cause message in JVM
      157840d1
    • Eric Liang's avatar
      [SPARK-9463] [ML] Expose model coefficients with names in SparkR RFormula · e7905a93
      Eric Liang authored
      Preview:
      
      ```
      > summary(m)
                  features coefficients
      1        (Intercept)    1.6765001
      2       Sepal_Length    0.3498801
      3 Species.versicolor   -0.9833885
      4  Species.virginica   -1.0075104
      
      ```
      
      Design doc from umbrella task: https://docs.google.com/document/d/10NZNSEurN2EdWM31uFYsgayIPfCFHiuIu3pCWrUmP_c/edit
      
      cc mengxr
      
      Author: Eric Liang <ekl@databricks.com>
      
      Closes #7771 from ericl/summary and squashes the following commits:
      
      ccd54c3 [Eric Liang] second pass
      a5ca93b [Eric Liang] comments
      2772111 [Eric Liang] clean up
      70483ef [Eric Liang] fix test
      7c247d4 [Eric Liang] Merge branch 'master' into summary
      3c55024 [Eric Liang] working
      8c539aa [Eric Liang] first pass
      e7905a93
    • Josh Rosen's avatar
      [SPARK-8850] [SQL] Enable Unsafe mode by default · 520ec0ff
      Josh Rosen authored
      This pull request enables Unsafe mode by default in Spark SQL. In order to do this, we had to fix a number of small issues:
      
      **List of fixed blockers**:
      
      - [x] Make some default buffer sizes configurable so that HiveCompatibilitySuite can run properly (#7741).
      - [x] Memory leak on grouped aggregation of empty input (fixed by #7560).
      - [x] Update planner to also check whether codegen is enabled before planning unsafe operators.
      - [x] Investigate failing HiveThriftBinaryServerSuite test.  This turns out to be caused by a ClassCastException that occurs when Exchange tries to apply an interpreted RowOrdering to an UnsafeRow when range partitioning an RDD.  This could be fixed by #7408, but a shorter-term fix is to just skip the Unsafe exchange path when RangePartitioner is used.
      - [x] Memory leak exceptions masking exceptions that actually caused tasks to fail (will be fixed by #7603).
      - [x]  ~~https://issues.apache.org/jira/browse/SPARK-9162, to implement code generation for ScalaUDF.  This is necessary for `UDFSuite` to pass.  For now, I've just ignored this test in order to try to find other problems while we wait for a fix.~~ This is no longer necessary as of #7682.
      - [x] Memory leaks from Limit after UnsafeExternalSort cause the memory leak detector to fail tests. This is a huge problem in the HiveCompatibilitySuite (fixed by f4ac642a4e5b2a7931c5e04e086bb10e263b1db6).
      - [x] Tests in `AggregationQuerySuite` are failing due to NaN-handling issues in UnsafeRow, which were fixed in #7736.
      - [x] `org.apache.spark.sql.ColumnExpressionSuite.rand` needs to be updated so that the planner check also matches `TungstenProject`.
      - [x] After having lowered the buffer sizes to 4MB so that most of HiveCompatibilitySuite runs:
        - [x] Wrong answer in `join_1to1` (fixed by #7680)
        - [x] Wrong answer in `join_nulls` (fixed by #7680)
        - [x] Managed memory OOM / leak in `lateral_view`
        - [x] Seems to hang indefinitely in `partcols1`.  This might be a deadlock in script transformation or a bug in error-handling code? The hang was fixed by #7710.
        - [x] Error while freeing memory in `partcols1`: will be fixed by #7734.
      - [x] After fixing the `partcols1` hang, it appears that a number of later tests have issues as well.
      - [x] Fix thread-safety bug in codegen fallback expression evaluation (#7759).
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #7564 from JoshRosen/unsafe-by-default and squashes the following commits:
      
      83c0c56 [Josh Rosen] Merge remote-tracking branch 'origin/master' into unsafe-by-default
      f4cc859 [Josh Rosen] Merge remote-tracking branch 'origin/master' into unsafe-by-default
      963f567 [Josh Rosen] Reduce buffer size for R tests
      d6986de [Josh Rosen] Lower page size in PySpark tests
      013b9da [Josh Rosen] Also match TungstenProject in checkNumProjects
      5d0b2d3 [Josh Rosen] Add task completion callback to avoid leak in limit after sort
      ea250da [Josh Rosen] Disable unsafe Exchange path when RangePartitioning is used
      715517b [Josh Rosen] Enable Unsafe by default
      520ec0ff
    • Yuu ISHIKAWA's avatar
      [SPARK-9248] [SPARKR] Closing curly-braces should always be on their own line · 7492a33f
      Yuu ISHIKAWA authored
      ### JIRA
      [[SPARK-9248] Closing curly-braces should always be on their own line - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9248)
      
      ## The result of `dev/lint-r`
      [The result of `dev/lint-r` for SPARK-9248 at revision 6175d6cf](https://gist.github.com/yu-iskw/96cadcea4ce664c41f81)
      
      Author: Yuu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #7795 from yu-iskw/SPARK-9248 and squashes the following commits:
      
      c8eccd3 [Yuu ISHIKAWA] [SPARK-9248][SparkR] Closing curly-braces should always be on their own line
      7492a33f
  4. Jul 28, 2015
    • Eric Liang's avatar
      [SPARK-9391] [ML] Support minus, dot, and intercept operators in SparkR RFormula · 8d5bb528
      Eric Liang authored
      Adds '.', '-', and intercept parsing to RFormula. Also splits RFormulaParser into a separate file.
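      A hedged sketch of the formulas these operators enable, as they would be passed to SparkR's glm() (which uses RFormula underneath; the DataFrame `df` and column names are illustrative):

      ```R
      # '.' means all remaining columns; '-' removes a term; '- 1' drops the intercept.
      m1 <- glm(Sepal_Length ~ ., data = df, family = "gaussian")
      m2 <- glm(Sepal_Length ~ . - Species, data = df, family = "gaussian")
      m3 <- glm(Sepal_Length ~ Sepal_Width - 1, data = df, family = "gaussian")
      ```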
      
      Umbrella design doc here: https://docs.google.com/document/d/10NZNSEurN2EdWM31uFYsgayIPfCFHiuIu3pCWrUmP_c/edit?usp=sharing
      
      mengxr
      
      Author: Eric Liang <ekl@databricks.com>
      
      Closes #7707 from ericl/string-features-2 and squashes the following commits:
      
      8588625 [Eric Liang] exclude complex types for .
      8106ffe [Eric Liang] comments
      a9350bb [Eric Liang] s/var/val
      9c50d4d [Eric Liang] Merge branch 'string-features' into string-features-2
      581afb2 [Eric Liang] Merge branch 'master' into string-features
      08ae539 [Eric Liang] Merge branch 'string-features' into string-features-2
      f99131a [Eric Liang] comments
      cecec43 [Eric Liang] Merge branch 'string-features' into string-features-2
      0bf3c26 [Eric Liang] update docs
      4592df2 [Eric Liang] intercept supports
      7412a2e [Eric Liang] Fri Jul 24 14:56:51 PDT 2015
      3cf848e [Eric Liang] fix the parser
      0556c2b [Eric Liang] Merge branch 'string-features' into string-features-2
      c302a2c [Eric Liang] fix tests
      9d1ac82 [Eric Liang] Merge remote-tracking branch 'upstream/master' into string-features
      e713da3 [Eric Liang] comments
      cd231a9 [Eric Liang] Wed Jul 22 17:18:44 PDT 2015
      4d79193 [Eric Liang] revert to seq + distinct
      169a085 [Eric Liang] tweak functional test
      a230a47 [Eric Liang] Merge branch 'master' into string-features
      72bd6f3 [Eric Liang] fix merge
      d841cec [Eric Liang] Merge branch 'master' into string-features
      5b2c4a2 [Eric Liang] Mon Jul 20 18:45:33 PDT 2015
      b01c7c5 [Eric Liang] add test
      8a637db [Eric Liang] encoder wip
      a1d03f4 [Eric Liang] refactor into estimator
      8d5bb528
    • trestletech's avatar
      Use vector-friendly comparison for packages argument. · 61432340
      trestletech authored
      Otherwise, `sparkR.init()` with multiple `sparkPackages` results in this warning:
      
      ```
      Warning message:
      In if (packages != "") { :
        the condition has length > 1 and only the first element will be used
      ```
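      For illustration, a minimal sketch of a vector-safe check; the exact expression used in the patch may differ:

      ```R
      packages <- c("com.databricks:spark-csv_2.10:1.0.3", "another:package:0.1")

      # Scalar-style test: the condition has length 2 and produces the warning above.
      if (packages != "") { invisible(NULL) }

      # Vector-friendly test: the condition is always length 1.
      if (length(packages) > 0 && !identical(packages, "")) { invisible(NULL) }
      ```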
      
      Author: trestletech <jeff.allen@trestletechnology.net>
      
      Closes #7701 from trestletech/compare-packages and squashes the following commits:
      
      72c8b36 [trestletech] Correct function name.
      c52db0e [trestletech] Added test for multiple packages.
      3aab1a7 [trestletech] Use vector-friendly comparison for packages argument.
      61432340
  5. Jul 27, 2015
    • Eric Liang's avatar
      [SPARK-9230] [ML] Support StringType features in RFormula · 8ddfa52c
      Eric Liang authored
      This adds StringType feature support via OneHotEncoder. As part of this task it was necessary to change RFormula to an Estimator, so that factor levels could be determined from the training dataset.
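      A hedged sketch of what string-typed features look like from the SparkR side once RFormula one-hot encodes them; the DataFrame `df` and column names are illustrative, following the iris-style preview in the SPARK-9463 entry above:

      ```R
      # Species is a string column; RFormula now expands it into dummy variables,
      # learning the factor levels when the estimator is fit on the training data.
      model <- glm(Sepal_Length ~ Sepal_Width + Species, data = df, family = "gaussian")
      preds <- predict(model, newData = df)   # encoded terms appear as Species.versicolor, etc.
      ```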
      
      I am not sure if I am using uids correctly here; it would be good to get reviewer feedback on that.
      cc mengxr
      
      Umbrella design doc: https://docs.google.com/document/d/10NZNSEurN2EdWM31uFYsgayIPfCFHiuIu3pCWrUmP_c/edit#
      
      Author: Eric Liang <ekl@databricks.com>
      
      Closes #7574 from ericl/string-features and squashes the following commits:
      
      f99131a [Eric Liang] comments
      0bf3c26 [Eric Liang] update docs
      c302a2c [Eric Liang] fix tests
      9d1ac82 [Eric Liang] Merge remote-tracking branch 'upstream/master' into string-features
      e713da3 [Eric Liang] comments
      4d79193 [Eric Liang] revert to seq + distinct
      169a085 [Eric Liang] tweak functional test
      a230a47 [Eric Liang] Merge branch 'master' into string-features
      72bd6f3 [Eric Liang] fix merge
      d841cec [Eric Liang] Merge branch 'master' into string-features
      5b2c4a2 [Eric Liang] Mon Jul 20 18:45:33 PDT 2015
      b01c7c5 [Eric Liang] add test
      8a637db [Eric Liang] encoder wip
      a1d03f4 [Eric Liang] refactor into estimator
      8ddfa52c
  6. Jul 24, 2015
  7. Jul 23, 2015
  8. Jul 22, 2015
    • Xiangrui Meng's avatar
      [SPARK-8364] [SPARKR] Add crosstab to SparkR DataFrames · 2f5cbd86
      Xiangrui Meng authored
      Add `crosstab` to SparkR DataFrames, which takes two column names and returns a local R data.frame. This is similar to `table` in R. However, `table` in SparkR is used for loading SQL tables as DataFrames. The return type is data.frame instead of table so that `crosstab` stays compatible with Scala/Python.
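      A minimal usage sketch, assuming a SparkR DataFrame `df` with columns `cyl` and `gear` (the names are illustrative):

      ```R
      # Returns a local R data.frame: one row per distinct value of the first
      # column, one column per distinct value of the second.
      ct <- crosstab(df, "cyl", "gear")
      class(ct)   # "data.frame", not a DataFrame
      ct
      ```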
      
      I couldn't run the R tests successfully on my local machine; many unit tests failed, so let's try Jenkins.
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #7318 from mengxr/SPARK-8364 and squashes the following commits:
      
      d75e894 [Xiangrui Meng] fix tests
      53f6ddd [Xiangrui Meng] fix tests
      f1348d6 [Xiangrui Meng] update test
      47cb088 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-8364
      5621262 [Xiangrui Meng] first version without test
      2f5cbd86
  9. Jul 20, 2015
    • Eric Liang's avatar
      [SPARK-9201] [ML] Initial integration of MLlib + SparkR using RFormula · 1cbdd899
      Eric Liang authored
      This exposes the SparkR:::glm() and SparkR:::predict() APIs. It was necessary to change RFormula to silently drop the label column if it is missing from the input dataset, which is something of a hack but needed to integrate with the Pipeline API.
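      A hedged sketch of the two exposed entry points, assuming a SparkR DataFrame `df` with numeric columns `Sepal_Length` and `Sepal_Width` (names follow the iris-style preview in the SPARK-9463 entry above):

      ```R
      # glm() builds an ML pipeline that uses RFormula under the hood.
      model <- glm(Sepal_Length ~ Sepal_Width, data = df, family = "gaussian")
      preds <- predict(model, newData = df)               # fitted values land in "prediction"
      head(select(preds, "Sepal_Length", "prediction"))
      ```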
      
      The umbrella design doc for MLlib + SparkR integration can be viewed here: https://docs.google.com/document/d/10NZNSEurN2EdWM31uFYsgayIPfCFHiuIu3pCWrUmP_c/edit
      
      mengxr
      
      Author: Eric Liang <ekl@databricks.com>
      
      Closes #7483 from ericl/spark-8774 and squashes the following commits:
      
      3dfac0c [Eric Liang] update
      17ef516 [Eric Liang] more comments
      1753a0f [Eric Liang] make glm generic
      b0f50f8 [Eric Liang] equivalence test
      550d56d [Eric Liang] export methods
      c015697 [Eric Liang] second pass
      117949a [Eric Liang] comments
      5afbc67 [Eric Liang] test label columns
      6b7f15f [Eric Liang] Fri Jul 17 14:20:22 PDT 2015
      3a63ae5 [Eric Liang] Fri Jul 17 13:41:52 PDT 2015
      ce61367 [Eric Liang] Fri Jul 17 13:41:17 PDT 2015
      0299c59 [Eric Liang] Fri Jul 17 13:40:32 PDT 2015
      e37603f [Eric Liang] Fri Jul 17 12:15:03 PDT 2015
      d417d0c [Eric Liang] Merge remote-tracking branch 'upstream/master' into spark-8774
      29a2ce7 [Eric Liang] Merge branch 'spark-8774-1' into spark-8774
      d1959d2 [Eric Liang] clarify comment
      2db68aa [Eric Liang] second round of comments
      dc3c943 [Eric Liang] address comments
      5765ec6 [Eric Liang] fix style checks
      1f361b0 [Eric Liang] doc
      d33211b [Eric Liang] r support
      fb0826b [Eric Liang] [SPARK-8774] Add R model formula with basic support as a transformer
      1cbdd899
    • Yu ISHIKAWA's avatar
      [SPARK-9052] [SPARKR] Fix comments after curly braces · 2bdf9914
      Yu ISHIKAWA authored
      [[SPARK-9052] Fix comments after curly braces - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9052)
      
      This is the full result of lintr at revision 01155162.
      [[SPARK-9052] the result of lint-r at the revision:01155162](https://gist.github.com/yu-iskw/e7246041b173a3f29482)
      
      This is the difference between the results before and after.
      https://gist.github.com/yu-iskw/e7246041b173a3f29482/revisions
      
      Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #7440 from yu-iskw/SPARK-9052 and squashes the following commits:
      
      015d738 [Yu ISHIKAWA] Fix the indentations and move the placement of commna
      5cc30fe [Yu ISHIKAWA] Fix the indentation in a condition
      4ead0e5 [Yu ISHIKAWA] [SPARK-9052][SparkR] Fix comments after curly braces
      2bdf9914
  10. Jul 17, 2015
  11. Jul 16, 2015
  12. Jul 15, 2015
    • Liang-Chi Hsieh's avatar
      [SPARK-8840] [SPARKR] Add float coercion on SparkR · 6f690259
      Liang-Chi Hsieh authored
      JIRA: https://issues.apache.org/jira/browse/SPARK-8840
      
      Currently the type coercion rules don't include float type. This PR simply adds it.
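      A hedged sketch of where the coercion matters: an R double written into a float field of a user-supplied schema (the field names and a pre-existing `sqlContext` are assumptions):

      ```R
      # R has no native float type, so doubles are implicitly narrowed to float
      # when the schema declares a float column.
      schema <- structType(structField("height", "float"), structField("name", "string"))
      rows   <- list(list(176.5, "a"), list(164.2, "b"))
      df     <- createDataFrame(sqlContext, rows, schema)
      printSchema(df)   # height: float (nullable = true)
      ```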
      
      Author: Liang-Chi Hsieh <viirya@appier.com>
      
      Closes #7280 from viirya/add_r_float_coercion and squashes the following commits:
      
      c86dc0e [Liang-Chi Hsieh] For comments.
      dbf0c1b [Liang-Chi Hsieh] Implicitly convert Double to Float based on provided schema.
      733015a [Liang-Chi Hsieh] Add test case for DataFrame with float type.
      30c2a40 [Liang-Chi Hsieh] Update test case.
      52b5294 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_r_float_coercion
      6f9159d [Liang-Chi Hsieh] Add another test case.
      8db3244 [Liang-Chi Hsieh] schema also needs to support float. add test case.
      0dcc992 [Liang-Chi Hsieh] Add float coercion on SparkR.
      6f690259
    • Sun Rui's avatar
      [SPARK-8808] [SPARKR] Fix assignments in SparkR. · f650a005
      Sun Rui authored
      Author: Sun Rui <rui.sun@intel.com>
      
      Closes #7395 from sun-rui/SPARK-8808 and squashes the following commits:
      
      ce603bc [Sun Rui] Use '<-' instead of '='.
      88590b1 [Sun Rui] Use '<-' instead of '='.
      f650a005
  13. Jul 13, 2015
    • Sun Rui's avatar
      [SPARK-6797] [SPARKR] Add support for YARN cluster mode. · 7f487c8b
      Sun Rui authored
      This PR enables SparkR to dynamically ship the SparkR binary package to the AM node in YARN cluster mode, thus it is no longer required that the SparkR package be installed on each worker node.
      
      This PR uses the JDK jar tool to package the SparkR package, because jar is thought to be available on both Linux/Windows platforms where JDK has been installed.
      
      This PR does not address the R worker involved in RDD API. Will address it in a separate JIRA issue.
      
      This PR does not address SBT build. SparkR installation and packaging by SBT will be addressed in a separate JIRA issue.
      
      R/install-dev.bat is not tested. shivaram, could you help test it?
      
      Author: Sun Rui <rui.sun@intel.com>
      
      Closes #6743 from sun-rui/SPARK-6797 and squashes the following commits:
      
      ca63c86 [Sun Rui] Adjust MimaExcludes after rebase.
      7313374 [Sun Rui] Fix unit test errors.
      72695fb [Sun Rui] Fix unit test failures.
      193882f [Sun Rui] Fix Mima test error.
      fe25a33 [Sun Rui] Fix Mima test error.
      35ecfa3 [Sun Rui] Fix comments.
      c38a005 [Sun Rui] Unzipped SparkR binary package is still required for standalone and Mesos modes.
      b05340c [Sun Rui] Fix scala style.
      2ca5048 [Sun Rui] Fix comments.
      1acefd1 [Sun Rui] Fix scala style.
      0aa1e97 [Sun Rui] Fix scala style.
      41d4f17 [Sun Rui] Add support for locating SparkR package for R workers required by RDD APIs.
      49ff948 [Sun Rui] Invoke jar.exe with full path in install-dev.bat.
      7b916c5 [Sun Rui] Use 'rem' consistently.
      3bed438 [Sun Rui] Add a comment.
      681afb0 [Sun Rui] Fix a bug that RRunner does not handle client deployment modes.
      cedfbe2 [Sun Rui] [SPARK-6797][SPARKR] Add support for YARN cluster mode.
      7f487c8b
  14. Jul 09, 2015
  15. Jul 06, 2015
    • Dirceu Semighini Filho's avatar
      Small update in the readme file · 57c72fcc
      Dirceu Semighini Filho authored
      Just changes the flag from `-PsparkR` to `-Psparkr`.
      
      Author: Dirceu Semighini Filho <dirceu.semighini@gmail.com>
      
      Closes #7242 from dirceusemighini/patch-1 and squashes the following commits:
      
      fad5991 [Dirceu Semighini Filho] Small update in the readme file
      57c72fcc
  16. Jul 05, 2015
  17. Jul 02, 2015
    • Ilya Ganelin's avatar
      [SPARK-3071] Increase default driver memory · 3697232b
      Ilya Ganelin authored
      I've updated the default values in comments, documentation, and in the command-line builder to be 1g, based on comments in the JIRA. I've also updated most usages to point at a single variable defined in the Utils.scala and JavaUtils.java files. This wasn't possible in all cases (R, shell scripts, etc.), but most usages in code now point at the same place.
      
      Please let me know if I've missed anything.
      
      Will the spark-shell use the value within the command line builder during instantiation?
      
      Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
      
      Closes #7132 from ilganeli/SPARK-3071 and squashes the following commits:
      
      4074164 [Ilya Ganelin] String fix
      271610b [Ilya Ganelin] Merge branch 'SPARK-3071' of github.com:ilganeli/spark into SPARK-3071
      273b6e9 [Ilya Ganelin] Test fix
      fd67721 [Ilya Ganelin] Update JavaUtils.java
      26cc177 [Ilya Ganelin] test fix
      e5db35d [Ilya Ganelin] Fixed test failure
      39732a1 [Ilya Ganelin] merge fix
      a6f7deb [Ilya Ganelin] Created default value for DRIVER MEM in Utils that's now used in almost all locations instead of setting manually in each
      09ad698 [Ilya Ganelin] Update SubmitRestProtocolSuite.scala
      19b6f25 [Ilya Ganelin] Missed one doc update
      2698a3d [Ilya Ganelin] Updated default value for driver memory
      3697232b
  18. Jul 01, 2015
    • Sun Rui's avatar
      [SPARK-7714] [SPARKR] SparkR tests should use more specific expectations than expect_true · 69c5dee2
      Sun Rui authored
      1. Update the pattern 'expect_true(a == b)' to 'expect_equal(a, b)'.
      2. Update the pattern 'expect_true(inherits(a, b))' to 'expect_is(a, b)'.
      3. Update the pattern 'expect_true(identical(a, b))' to 'expect_identical(a, b)'.
      
      Author: Sun Rui <rui.sun@intel.com>
      
      Closes #7152 from sun-rui/SPARK-7714 and squashes the following commits:
      
      8ad2440 [Sun Rui] Fix test case errors.
      8fe9f0c [Sun Rui] Update the pattern 'expect_true(identical(a, b))' to 'expect_identical(a, b)'.
      f1b8005 [Sun Rui] Update the pattern 'expect_true(inherits(a, b))' to 'expect_is(a, b)'.
      f631e94 [Sun Rui] Update the pattern 'expect_true(a == b)' to 'expect_equal(a, b)'.
      69c5dee2
  19. Jun 30, 2015
  20. Jun 26, 2015
    • cafreeman's avatar
      [SPARK-8607] SparkR -- jars not being added to application classpath correctly · 9d118177
      cafreeman authored
      Add `getStaticClass` method in SparkR's `RBackendHandler`
      
      This is a fix for the problem referenced in [SPARK-5185](https://issues.apache.org/jira/browse/SPARK-5185).
      
      cc shivaram
      
      Author: cafreeman <cfreeman@alteryx.com>
      
      Closes #7001 from cafreeman/branch-1.4 and squashes the following commits:
      
      8f81194 [cafreeman] Add missing license
      31aedcf [cafreeman] Refactor test to call an external R script
      2c22073 [cafreeman] Merge branch 'branch-1.4' of github.com:apache/spark into branch-1.4
      0bea809 [cafreeman] Fixed relative path issue and added smaller JAR
      ee25e60 [cafreeman] Merge branch 'branch-1.4' of github.com:apache/spark into branch-1.4
      9a5c362 [cafreeman] test for including JAR when launching sparkContext
      9101223 [cafreeman] Merge branch 'branch-1.4' of github.com:apache/spark into branch-1.4
      5a80844 [cafreeman] Fix style nits
      7c6bd0c [cafreeman] [SPARK-8607] SparkR
      
      (cherry picked from commit 2579948b)
      Signed-off-by: default avatarShivaram Venkataraman <shivaram@cs.berkeley.edu>
      9d118177
    • cafreeman's avatar
      [SPARK-8662] SparkR Update SparkSQL Test · a56516fc
      cafreeman authored
      Test `infer_type` using a more fine-grained approach rather than comparing environments. Since `all.equal`'s behavior changed in R 3.2, the test could no longer pass.
      
      JIRA here:
      https://issues.apache.org/jira/browse/SPARK-8662
      
      
      
      Author: cafreeman <cfreeman@alteryx.com>
      
      Closes #7045 from cafreeman/R32_Test and squashes the following commits:
      
      b97cc52 [cafreeman] Add `checkStructField` utility
      3381e5c [cafreeman] Update SparkSQL Test
      
      (cherry picked from commit 78b31a2a)
      Signed-off-by: default avatarShivaram Venkataraman <shivaram@cs.berkeley.edu>
      a56516fc
  21. Jun 25, 2015
  22. Jun 24, 2015
    • Holden Karau's avatar
      [SPARK-8506] Add packages to R context created through init. · 43e66192
      Holden Karau authored
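      The title refers to the `sparkPackages` argument of `sparkR.init()` added here (see also the Jul 28 entry above). A minimal sketch, with an illustrative spark-csv coordinate whose version string is hypothetical:

      ```R
      # sparkPackages lets the R context declare Spark package dependencies at
      # init time; commit b60dd63 below adds a spark-csv example.
      sc <- sparkR.init(master = "local[2]",
                        sparkPackages = "com.databricks:spark-csv_2.10:1.0.3")
      sqlContext <- sparkRSQL.init(sc)
      ```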
      Author: Holden Karau <holden@pigscanfly.ca>
      
      Closes #6928 from holdenk/SPARK-8506-sparkr-does-not-provide-an-easy-way-to-depend-on-spark-packages-when-performing-init-from-inside-of-r and squashes the following commits:
      
      b60dd63 [Holden Karau] Add an example with the spark-csv package
      fa8bc92 [Holden Karau] typo: sparm -> spark
      865a90c [Holden Karau] strip spaces for comparision
      c7a4471 [Holden Karau] Add some documentation
      c1a9233 [Holden Karau] refactor for testing
      c818556 [Holden Karau] Add pakages to R
      43e66192
  23. Jun 23, 2015
    • Alok  Singh's avatar
      [SPARK-8111] [SPARKR] SparkR shell should display Spark logo and version banner on startup. · f2fb0285
      Alok Singh authored
      The Spark version is taken from the environment variable SPARK_VERSION.
      
      Author: Alok  Singh <singhal@Aloks-MacBook-Pro.local>
      Author: Alok  Singh <singhal@aloks-mbp.usca.ibm.com>
      
      Closes #6944 from aloknsingh/aloknsingh_spark_jiras and squashes the following commits:
      
      ed607bd [Alok  Singh] [SPARK-8111][SparkR] As per suggestion, 1) using the version from sparkContext rather than the Sys.env. 2) change "Welcome to SparkR!" to "Welcome to" followed by Spark logo and version
      acd5b85 [Alok  Singh] fix the jira SPARK-8111 to add the spark version and logo. Currently spark version is taken from the environment variable SPARK_VERSION
      f2fb0285
    • Yu ISHIKAWA's avatar
      [SPARK-8431] [SPARKR] Add in operator to DataFrame Column in SparkR · d4f63351
      Yu ISHIKAWA authored
      [[SPARK-8431] Add in operator to DataFrame Column in SparkR - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-8431)
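      A minimal usage sketch, assuming a SparkR DataFrame `df` with a numeric column `age` (the names and values are illustrative):

      ```R
      # %in% builds a Column predicate, usable in filter()/where().
      adults <- filter(df, df$age %in% c(19, 21, 25))
      count(adults)
      ```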
      
      Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #6941 from yu-iskw/SPARK-8431 and squashes the following commits:
      
      1f64423 [Yu ISHIKAWA] Modify the comment
      f4309a7 [Yu ISHIKAWA] Make a `setMethod` for `%in%` be independent
      6e37936 [Yu ISHIKAWA] Modify a variable name
      c196173 [Yu ISHIKAWA] [SPARK-8431][SparkR] Add in operator to DataFrame Column in SparkR
      d4f63351
  24. Jun 22, 2015
  25. Jun 20, 2015
    • Yu ISHIKAWA's avatar
      [SPARK-8495] [SPARKR] Add a `.lintr` file to validate the SparkR files and the `lint-r` script · 004f5737
      Yu ISHIKAWA authored
      Thanks to Shivaram Venkataraman for the support. This is a prototype script to validate the R files.
      
      Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #6922 from yu-iskw/SPARK-6813 and squashes the following commits:
      
      c1ffe6b [Yu ISHIKAWA] Modify to save result to a log file and add a rule to validate
      5520806 [Yu ISHIKAWA] Exclude the .lintr file not to check Apache lincence
      8f94680 [Yu ISHIKAWA] [SPARK-8495][SparkR] Add a `.lintr` file to validate the SparkR files and the `lint-r` script
      004f5737
  26. Jun 19, 2015
    • Hossein's avatar
      [SPARK-8452] [SPARKR] expose jobGroup API in SparkR · 1fa29c2d
      Hossein authored
      This pull request adds following methods to SparkR:
      
      ```R
      setJobGroup()
      cancelJobGroup()
      clearJobGroup()
      ```
      For each method, the Spark context is passed as the first argument. There does not seem to be a good way to test these in R.
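      A hedged usage sketch, assuming `sc` from sparkR.init(); the group id and description strings are made up, and the trailing flag is interruptOnCancel:

      ```R
      setJobGroup(sc, "etl-jobs", "nightly ETL", TRUE)   # tag subsequently submitted jobs
      # ... trigger some actions ...
      cancelJobGroup(sc, "etl-jobs")                     # cancel everything in the group
      clearJobGroup(sc)                                  # stop tagging new jobs
      ```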
      
      cc shivaram and davies
      
      Author: Hossein <hossein@databricks.com>
      
      Closes #6889 from falaki/SPARK-8452 and squashes the following commits:
      
      9ce9f1e [Hossein] Added basic tests to verify methods can be called and won't throw errors
      c706af9 [Hossein] Added examples
      a2c19af [Hossein] taking spark context as first argument
      343ca77 [Hossein] Added setJobGroup, cancelJobGroup and clearJobGroup to SparkR
      1fa29c2d
  27. Jun 15, 2015
    • andrewor14's avatar
      [SPARK-8350] [R] Log R unit test output to "unit-tests.log" · 56d4e8a2
      andrewor14 authored
      Right now it's logged to "R-unit-tests.log". Jenkins currently only archives files named "unit-tests.log", and this is what all other modules (e.g. SQL, network, REPL) use.
      1. We should be consistent
      2. I don't want to reconfigure Jenkins to accept a different file
      
      shivaram
      
      Author: andrewor14 <andrew@databricks.com>
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #6807 from andrewor14/r-logs and squashes the following commits:
      
      96005d2 [andrewor14] Nest unit-tests.log further until R
      407c46c [andrewor14] Add target to log path
      d7b68ae [Andrew Or] Log R unit test output to "unit-tests.log"
      56d4e8a2
  28. Jun 08, 2015