  1. Jul 25, 2014
    • [SPARK-2410][SQL] Merging Hive Thrift/JDBC server · 06dc0d2c
      Cheng Lian authored
      JIRA issue:
      
      - Main: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410)
      - Related: [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678)
      
      Cherry picked the Hive Thrift/JDBC server from [branch-1.0-jdbc](https://github.com/apache/spark/tree/branch-1.0-jdbc).
      
      (Thanks chenghao-intel for his initial contribution of the Spark SQL CLI.)
      
      TODO
      
      - [x] Use `spark-submit` to launch the server, the CLI and beeline
      - [x] Migration guideline draft for Shark users
      
      ----
      
      I hit a bug in `SparkSubmitArguments` while working on this PR: any application option that `SparkSubmitArguments` also recognizes is consumed as a `SparkSubmit` option instead of being passed to the application. For example:
      
      ```bash
      $ spark-submit --class org.apache.hive.beeline.BeeLine spark-internal --help
      ```
      
      This prints the usage information for `SparkSubmit` rather than for `BeeLine`.
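      
      As a rough illustration of the expected behavior, the sketch below splits a command line at the primary resource, so that everything after it reaches the application untouched. This is not Spark's parsing code: the `split_args` helper is hypothetical, and it assumes `--class` is the only launcher option that takes a value.
      
      ```bash
      #!/usr/bin/env bash
      # Hypothetical helper, not Spark code: splits a command line at the first
      # non-option argument (the "primary resource", e.g. spark-internal).
      # Assumption: --class is the only launcher option that takes a value.
      split_args() {
        local launcher=() app=() resource=""
        while [ "$#" -gt 0 ]; do
          case "$1" in
            --class)
              # Guard against a missing value so that `shift 2` cannot fail.
              [ "$#" -ge 2 ] || { echo "missing value for --class" >&2; return 1; }
              launcher+=("$1" "$2"); shift 2 ;;
            --*)
              launcher+=("$1"); shift ;;
            *)
              resource="$1"; shift
              app=("$@")   # everything after the resource belongs to the application
              break ;;
          esac
        done
        echo "launcher: ${launcher[*]}"
        echo "resource: $resource"
        echo "app: ${app[*]}"
      }
      
      split_args --class org.apache.hive.beeline.BeeLine spark-internal --help
      ```
      
      Run as written, this reports `--help` on the `app:` line rather than treating it as a launcher option.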
      
      ~~Fixed this bug here since the `spark-internal` related stuff also touches `SparkSubmitArguments` and I'd like to avoid conflict.~~
      
      **UPDATE** The bug mentioned above is now tracked by [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678). The fix has been reverted from this PR because it involves subtler considerations and deserves a separate PR.
      
      Author: Cheng Lian <lian.cs.zju@gmail.com>
      
      Closes #1399 from liancheng/thriftserver and squashes the following commits:
      
      090beea [Cheng Lian] Revert changes related to SPARK-2678, decided to move them to another PR
      21c6cf4 [Cheng Lian] Updated Spark SQL programming guide docs
      fe0af31 [Cheng Lian] Reordered spark-submit options in spark-shell[.cmd]
      199e3fb [Cheng Lian] Disabled MIMA for hive-thriftserver
      1083e9d [Cheng Lian] Fixed failed test suites
      7db82a1 [Cheng Lian] Fixed spark-submit application options handling logic
      9cc0f06 [Cheng Lian] Starts beeline with spark-submit
      cfcf461 [Cheng Lian] Updated documents and build scripts for the newly added hive-thriftserver profile
      061880f [Cheng Lian] Addressed all comments by @pwendell
      7755062 [Cheng Lian] Adapts test suites to spark-submit settings
      40bafef [Cheng Lian] Fixed more license header issues
      e214aab [Cheng Lian] Added missing license headers
      b8905ba [Cheng Lian] Fixed minor issues in spark-sql and start-thriftserver.sh
      f975d22 [Cheng Lian] Updated docs for Hive compatibility and Shark migration guide draft
      3ad4e75 [Cheng Lian] Starts spark-sql shell with spark-submit
      a5310d1 [Cheng Lian] Make HiveThriftServer2 play well with spark-submit
      61f39f4 [Cheng Lian] Starts Hive Thrift server via spark-submit
      2c4c539 [Cheng Lian] Cherry picked the Hive Thrift server
  2. Jul 17, 2014
    • SPARK-2526: Simplify options in make-distribution.sh · d0ea4968
      Patrick Wendell authored
      Right now we have a bunch of parallel logic in make-distribution.sh
      that's just extra work to maintain. We should just pass through
      Maven profiles in this case and keep the script simple. See
      the JIRA for more details.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #1445 from pwendell/make-distribution.sh and squashes the following commits:
      
      f1294ea [Patrick Wendell] Simplify options in make-distribution.sh.
  3. Jul 04, 2014
  4. May 15, 2014
  5. May 12, 2014
  6. Apr 29, 2014
  7. Apr 28, 2014
  8. Apr 24, 2014
  9. Apr 23, 2014
    • SPARK-1119 and other build improvements · cd4ed293
      Patrick Wendell authored
      1. Makes assembly and examples jar naming consistent in maven/sbt.
      2. Updates make-distribution.sh to use Maven and fixes some bugs.
      3. Updates the create-release script to call make-distribution script.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #502 from pwendell/make-distribution and squashes the following commits:
      
      1a97f0d [Patrick Wendell] SPARK-1119 and other build improvements
  10. Apr 21, 2014
  11. Apr 06, 2014
    • SPARK-1314: Use SPARK_HIVE to determine if we include Hive in packaging · 41065584
      Aaron Davidson authored
      Previously, we decided whether to include the datanucleus jars based on the existence of a spark-hive-assembly jar, which was incidentally built whenever "sbt assembly" was run. This meant that a typical and previously supported pathway would start using Hive jars.
      
      This patch has the following features/bug fixes:
      
      - Use of SPARK_HIVE (default false) to determine if we should include Hive in the assembly jar.
      - Analogous feature in Maven with -Phive (previously, there was no support for adding Hive to any of our jars produced by Maven)
      - assemble-deps fixed since we no longer use a different ASSEMBLY_DIR
      - avoid adding log message in compute-classpath.sh to the classpath :)
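      
      The first and last points can be illustrated with a minimal sketch. It is not the actual compute-classpath.sh: the `compute_classpath` function and the jar names are placeholders. The caller captures stdout as the classpath, so any log message has to go to stderr, and `SPARK_HIVE` falls back to `false` when unset.
      
      ```bash
      #!/usr/bin/env bash
      # Hypothetical sketch, not the real compute-classpath.sh.
      # stdout is captured by the caller as the classpath value, so
      # diagnostics must be written to stderr instead.
      compute_classpath() {
        local classpath="core.jar"             # placeholder jar name (assumption)
        if [ "${SPARK_HIVE:-false}" = "true" ]; then
          echo "Including Hive jars" >&2       # log on stderr, never on stdout
          classpath="$classpath:hive.jar"      # placeholder jar name (assumption)
        fi
        echo "$classpath"
      }
      
      CLASSPATH_VALUE="$(SPARK_HIVE=true compute_classpath)"
      echo "$CLASSPATH_VALUE"
      ```
      
      Because the log line goes to stderr, the captured value contains only the jar list.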
      
      Still TODO before mergeable:
      - We need to download the datanucleus jars outside of sbt. Perhaps spark-class could download them when SPARK_HIVE is set, similar to how sbt downloads itself.
      - Spark SQL documentation updates.
      
      Author: Aaron Davidson <aaron@databricks.com>
      
      Closes #237 from aarondav/master and squashes the following commits:
      
      5dc4329 [Aaron Davidson] Typo fixes
      dd4f298 [Aaron Davidson] Doc update
      dd1a365 [Aaron Davidson] Eliminate need for SPARK_HIVE at runtime by d/ling datanucleus from Maven
      a9269b5 [Aaron Davidson] [WIP] Use SPARK_HIVE to determine if we include Hive in packaging
  12. Mar 11, 2014
    • SPARK-1167: Remove metrics-ganglia from default build due to LGPL issues... · 16788a65
      Patrick Wendell authored
      This patch removes Ganglia integration from the default build. It
      allows users willing to link against LGPL code to use Ganglia
      by adding build flags or linking against a new Spark artifact called
      spark-ganglia-lgpl.
      
      This brings Spark in line with the Apache policy on LGPL code
      enumerated here:
      
      https://www.apache.org/legal/3party.html#options-optional
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #108 from pwendell/ganglia and squashes the following commits:
      
      326712a [Patrick Wendell] Responding to review feedback
      5f28ee4 [Patrick Wendell] SPARK-1167: Remove metrics-ganglia from default build due to LGPL issues.
  13. Mar 02, 2014
  14. Mar 01, 2014
  15. Feb 09, 2014
    • Merge pull request #565 from pwendell/dev-scripts. Closes #565. · f892da87
      Patrick Wendell authored
      SPARK-1066: Add developer scripts to repository.
      
      These are some developer scripts I've been maintaining in a separate public repo. This patch adds them to the Spark repository so they can evolve here and are clearly accessible to all committers.
      
      I may do some small additional clean-up in this PR, but wanted to put them here in case others want to review. There are a few types of scripts here:
      
      1. A tool to merge pull requests.
      2. A script for packaging releases.
      3. A script for auditing release candidates.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      == Merge branch commits ==
      
      commit 5d5d331d01f6fd59c2eb830f652955119b012173
      Author: Patrick Wendell <pwendell@gmail.com>
      Date:   Sat Feb 8 22:11:47 2014 -0800
      
          SPARK-1066: Add developer scripts to repository.