  1. Jul 22, 2014
    • [YARN] SPARK-2577: File upload to viewfs is broken due to mount point resolution · 02e45729
      Gera Shegalov authored
      Opting for option 2 as defined in SPARK-2577, i.e., retrieve and pass the correct file system object to addResource.
      
      Author: Gera Shegalov <gera@twitter.com>
      
      Closes #1483 from gerashegalov/master and squashes the following commits:
      
      90c9087 [Gera Shegalov] [YARN] SPARK-2577: File upload to viewfs is broken due to mount point resolution
  2. Jul 21, 2014
    • SPARK-1707. Remove unnecessary 3 second sleep in YarnClusterScheduler · f89cf65d
      Sandy Ryza authored
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #634 from sryza/sandy-spark-1707 and squashes the following commits:
      
      2f6e358 [Sandy Ryza] Default min registered executors ratio to .8 for YARN
      354c630 [Sandy Ryza] Remove outdated comments
      c744ef3 [Sandy Ryza] Take out waitForInitialAllocations
      2a4329b [Sandy Ryza] SPARK-1707. Remove unnecessary 3 second sleep in YarnClusterScheduler
  3. Jul 15, 2014
    • SPARK-1291: Link the spark UI to RM ui in yarn-client mode · 72ea56da
      witgo authored
      Author: witgo <witgo@qq.com>
      
      Closes #1112 from witgo/SPARK-1291 and squashes the following commits:
      
      6022bcd [witgo] review commit
      1fbb925 [witgo] add addAmIpFilter to yarn alpha
      210299c [witgo] review commit
      1b92a07 [witgo] review commit
      6896586 [witgo] Add comments to addWebUIFilter
      3e9630b [witgo] review commit
      142ee29 [witgo] review commit
      1fe7710 [witgo] Link the spark UI to RM ui in yarn-client mode
  4. Jul 14, 2014
    • [SPARK-1946] Submit tasks after (configured ratio) executors have been registered · 3dd8af7a
      li-zhihui authored
      Because submitting tasks and registering executors are asynchronous, in most situations tasks in early stages run without preferred locality.
      
      A simple workaround is to sleep a few seconds in the application so that executors have enough time to register.
      
      This PR adds two configuration properties to make the TaskScheduler submit tasks only after a given fraction of executors has been registered.
      
      # Submit tasks only after (registered executors / total executors) has reached this ratio; default value is 0
      spark.scheduler.minRegisteredExecutorsRatio = 0.8

      # Regardless of whether minRegisteredExecutorsRatio has been reached, submit tasks once maxRegisteredWaitingTime (in milliseconds) has elapsed; default value is 30000
      spark.scheduler.maxRegisteredExecutorsWaitingTime = 5000
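      A minimal sketch of the waiting logic these properties drive (the name follows the waitBackendReady method mentioned in the squashed commits below; the parameters and polling interval are illustrative, not Spark's exact implementation):

      ```scala
      // Block task submission until enough executors have registered or the
      // maximum waiting time expires. Assumes total > 0.
      def waitBackendReady(registered: () => Int,
                           total: Int,
                           minRatio: Double = 0.0,
                           maxWaitMs: Long = 30000L): Unit = {
        val deadline = System.currentTimeMillis() + maxWaitMs
        while (registered().toDouble / total < minRatio &&
               System.currentTimeMillis() < deadline) {
          Thread.sleep(100) // poll until the ratio is reached or we time out
        }
      }
      ```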
      
      Author: li-zhihui <zhihui.li@intel.com>
      
      Closes #900 from li-zhihui/master and squashes the following commits:
      
      b9f8326 [li-zhihui] Add logs & edit docs
      1ac08b1 [li-zhihui] Add new configs to user docs
      22ead12 [li-zhihui] Move waitBackendReady to postStartHook
      c6f0522 [li-zhihui] Bug fix: numExecutors wasn't set & use constant DEFAULT_NUMBER_EXECUTORS
      4d6d847 [li-zhihui] Move waitBackendReady to TaskSchedulerImpl.start & some code refactor
      0ecee9a [li-zhihui] Move waitBackendReady from DAGScheduler.submitStage to TaskSchedulerImpl.submitTasks
      4261454 [li-zhihui] Add docs for new configs & code style
      ce0868a [li-zhihui] Code style, rename configuration property name of minRegisteredRatio & maxRegisteredWaitingTime
      6cfb9ec [li-zhihui] Code style, revert default minRegisteredRatio of yarn to 0, driver get --num-executors in yarn/alpha
      812c33c [li-zhihui] Fix driver lost --num-executors option in yarn-cluster mode
      e7b6272 [li-zhihui] support yarn-cluster
      37f7dc2 [li-zhihui] support yarn mode(percentage style)
      3f8c941 [li-zhihui] submit stage after (configured ratio of) executors have been registered
  5. Jul 10, 2014
    • [SPARK-1776] Have Spark's SBT build read dependencies from Maven. · 628932b8
      Prashant Sharma authored
      The patch introduces the new way of working while also retaining the existing ways of doing things.
      
      For example, the build instruction for YARN in Maven is
      `mvn -Pyarn -Phadoop-2.2 clean package -DskipTests`
      while in sbt it can become
      `MAVEN_PROFILES="yarn, hadoop-2.2" sbt/sbt clean assembly`
      It also supports
      `sbt/sbt -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 clean assembly`
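      As a rough illustration of how the env-var form could be consumed (this parsing is a hedged sketch, not the actual sbt build code):

      ```scala
      // Split MAVEN_PROFILES ("yarn, hadoop-2.2") into a profile list,
      // defaulting to no extra profiles when the variable is unset.
      val mavenProfiles: Seq[String] =
        sys.env.get("MAVEN_PROFILES")
          .map(_.split(",").map(_.trim).filter(_.nonEmpty).toSeq)
          .getOrElse(Seq.empty)
      ```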
      
      Author: Prashant Sharma <prashant.s@imaginea.com>
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #772 from ScrapCodes/sbt-maven and squashes the following commits:
      
      a8ac951 [Prashant Sharma] Updated sbt version.
      62b09bb [Prashant Sharma] Improvements.
      fa6221d [Prashant Sharma] Excluding sql from mima
      4b8875e [Prashant Sharma] Sbt assembly no longer builds tools by default.
      72651ca [Prashant Sharma] Addresses code review comments.
      acab73d [Prashant Sharma] Revert "Small fix to run-examples script."
      ac4312c [Prashant Sharma] Revert "minor fix"
      6af91ac [Prashant Sharma] Ported oldDeps back. + fixes issues with prev commit.
      65cf06c [Prashant Sharma] Servlet API jars mess up with the other servlet jars on the class path.
      446768e [Prashant Sharma] minor fix
      89b9777 [Prashant Sharma] Merge conflicts
      d0a02f2 [Prashant Sharma] Bumped up pom versions, Since the build now depends on pom it is better updated there. + general cleanups.
      dccc8ac [Prashant Sharma] updated mima to check against 1.0
      a49c61b [Prashant Sharma] Fix for tools jar
      a2f5ae1 [Prashant Sharma] Fixes a bug in dependencies.
      cf88758 [Prashant Sharma] cleanup
      9439ea3 [Prashant Sharma] Small fix to run-examples script.
      96cea1f [Prashant Sharma] SPARK-1776 Have Spark's SBT build read dependencies from Maven.
      36efa62 [Patrick Wendell] Set project name in pom files and added eclipse/intellij plugins.
      4973dbd [Patrick Wendell] Example build using pom reader.
  6. Jun 30, 2014
    • [SPARK-2318] When exiting on a signal, print the signal name first. · 5fccb567
      Reynold Xin authored
      Author: Reynold Xin <rxin@apache.org>
      
      Closes #1260 from rxin/signalhandler1 and squashes the following commits:
      
      8e73552 [Reynold Xin] Uh add Logging back in ApplicationMaster.
      0402ba8 [Reynold Xin] Synchronize SignalLogger.register.
      dc70705 [Reynold Xin] Added SignalLogger to YARN ApplicationMaster.
      79a21b4 [Reynold Xin] Added license header.
      0da052c [Reynold Xin] Added the SignalLogger itself.
      e587d2e [Reynold Xin] [SPARK-2318] When exiting on a signal, print the signal name first.
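      A minimal sketch of the idea behind the SignalLogger added here, assuming the JVM-internal sun.misc API (the handled signals and exit-code convention are illustrative):

      ```scala
      import sun.misc.{Signal, SignalHandler}

      // Log the signal name before exiting, so "killed by SIGTERM" is visible in logs.
      def registerSignalLogger(log: String => Unit): Unit =
        Seq("TERM", "HUP", "INT").foreach { name =>
          Signal.handle(new Signal(name), new SignalHandler {
            override def handle(sig: Signal): Unit = {
              log(s"RECEIVED SIGNAL ${sig.getNumber}: SIG${sig.getName}")
              System.exit(128 + sig.getNumber) // conventional exit code for a signal
            }
          })
        }
      ```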
  7. Jun 26, 2014
    • Remove use of spark.worker.instances · 48a82a82
      Kay Ousterhout authored
      spark.worker.instances was added as part of this commit: https://github.com/apache/spark/commit/1617816090e7b20124a512a43860a21232ebf511
      
      My understanding is that SPARK_WORKER_INSTANCES is supported for backwards compatibility,
      but spark.worker.instances is never used (SparkSubmit.scala sets spark.executor.instances),
      so it should not have been added.
      
      @sryza @pwendell @tgravescs LMK if I'm understanding this correctly
      
      Author: Kay Ousterhout <kayousterhout@gmail.com>
      
      Closes #1214 from kayousterhout/yarn_config and squashes the following commits:
      
      3d7c491 [Kay Ousterhout] Remove use of spark.worker.instances
  8. Jun 23, 2014
    • [SPARK-1395] Fix "local:" URI support in Yarn mode (again). · e380767d
      Marcelo Vanzin authored
      Recent changes ignored the fact that paths may be defined with "local:"
      URIs, which means they need to be explicitly added to the classpath
      everywhere a remote process is started. This change fixes that by:
      
      - Using the correct methods to add paths to the classpath
      - Creating SparkConf settings for the Spark jar itself and for the
        user's jar
      - Propagating those two settings to the remote processes where needed
      
      This ensures that both in client and in cluster mode, the driver has
      the necessary info to build the executor's classpath and have things
      still work when they contain "local:" references.
      
      The change also fixes some confusion in ClientBase about whether
      to use SparkConf or system properties to propagate config options to
      the driver and executors, by standardizing on using data held by
      SparkConf.
      
      On the cleanup front, I removed the hacky way that log4j configuration
      was being propagated to handle the "local:" case. It's much more cleanly
      (and generically) handled by using spark-submit arguments (--files to
      upload a config file, or setting spark.executor.extraJavaOptions to pass
      JVM arguments and use a local file).
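      A hedged example of the spark.yarn.jar setting documented here (the path is illustrative): a "local:" URI marks a file already present on every node, so it is added to the classpath directly instead of being uploaded.

      ```scala
      import org.apache.spark.SparkConf

      val conf = new SparkConf()
        .set("spark.yarn.jar", "local:/opt/spark/lib/spark-assembly.jar")
      ```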
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #560 from vanzin/yarn-local-2 and squashes the following commits:
      
      4e7f066 [Marcelo Vanzin] Correctly propagate SPARK_JAVA_OPTS to driver/executor.
      6a454ea [Marcelo Vanzin] Use constants for PWD in test.
      6dd5943 [Marcelo Vanzin] Fix propagation of config options to driver / executor.
      b2e377f [Marcelo Vanzin] Review feedback.
      93c3f85 [Marcelo Vanzin] Fix ClassCastException in test.
      e5c682d [Marcelo Vanzin] Fix cluster mode, restore SPARK_LOG4J_CONF.
      1dfbb40 [Marcelo Vanzin] Add documentation for spark.yarn.jar.
      bbdce05 [Marcelo Vanzin] [SPARK-1395] Fix "local:" URI support in Yarn mode (again).
  9. Jun 19, 2014
    • [SPARK-2051] In yarn.ClientBase spark.yarn.dist.* do not work · bce0897b
      witgo authored
      Author: witgo <witgo@qq.com>
      
      Closes #969 from witgo/yarn_ClientBase and squashes the following commits:
      
      8117765 [witgo] review commit
      3bdbc52 [witgo] Merge branch 'master' of https://github.com/apache/spark into yarn_ClientBase
      5261b6c [witgo] fix sys.props.get("SPARK_YARN_DIST_FILES")
      e3c1107 [witgo] update docs
      b6a9aa1 [witgo] merge master
      c8b4554 [witgo] review commit
      2f48789 [witgo] Merge branch 'master' of https://github.com/apache/spark into yarn_ClientBase
      8d7b82f [witgo] Merge branch 'master' of https://github.com/apache/spark into yarn_ClientBase
      1048549 [witgo] remove Utils.resolveURIs
      871f1db [witgo] add spark.yarn.dist.* documentation
      41bce59 [witgo] review commit
      35d6fa0 [witgo] move to ClientArguments
      55d72fc [witgo] Merge branch 'master' of https://github.com/apache/spark into yarn_ClientBase
      9cdff16 [witgo] review commit
      8bc2f4b [witgo] review commit
      20e667c [witgo] Merge branch 'master' into yarn_ClientBase
      0961151 [witgo] merge master
      ce609fc [witgo] Merge branch 'master' into yarn_ClientBase
      8362489 [witgo] yarn.ClientBase spark.yarn.dist.* do not work
  10. Jun 16, 2014
    • [SPARK-1930] The Container is running beyond physical memory limits, so as to be killed · cdf2b045
      witgo authored
      Author: witgo <witgo@qq.com>
      
      Closes #894 from witgo/SPARK-1930 and squashes the following commits:
      
      564307e [witgo] Update the running-on-yarn.md
      3747515 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1930
      172647b [witgo] add memoryOverhead docs
      a0ff545 [witgo] leaving only two configs
      a17bda2 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1930
      478ca15 [witgo] Merge branch 'master' into SPARK-1930
      d1244a1 [witgo] Merge branch 'master' into SPARK-1930
      8b967ae [witgo] Merge branch 'master' into SPARK-1930
      655a820 [witgo] review commit
      71859a7 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1930
      e3c531d [witgo] review commit
      e16f190 [witgo] different memoryOverhead
      ffa7569 [witgo] review commit
      5c9581f [witgo] Merge branch 'master' into SPARK-1930
      9a6bcf2 [witgo] review commit
      8fae45a [witgo] fix NullPointerException
      e0dcc16 [witgo] Adding  configuration items
      b6a989c [witgo] Fix container memory beyond limit, were killed
  11. Jun 12, 2014
    • [SPARK-1516] Throw exception in yarn client instead of run system.exit directly. · f95ac686
      John Zhao authored
      All the changes are in the package "org.apache.spark.deploy.yarn":
          1) Throw exceptions in ClientArguments and ClientBase instead of exiting directly.
          2) In Client's main method, if an exception is caught, exit with code 1; otherwise exit with code 0.
      
      After the fix, users who integrate the Spark YARN client into their applications will no longer have the whole application terminated when an argument is wrong or a run finishes.
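      A sketch of the pattern, based on the description in the squashed commit below (the exact signatures are illustrative): library code throws a ClientException carrying an exit code, and only the CLI entry point calls System.exit.

      ```scala
      class ClientException(message: String, val exitCode: Int = 1)
        extends Exception(message)

      object Client {
        private def run(args: Array[String]): Unit = {
          if (args.isEmpty) throw new ClientException("missing application arguments")
          // ... validate arguments and submit the application ...
        }

        def main(args: Array[String]): Unit =
          try { run(args); System.exit(0) }
          catch {
            case e: ClientException =>
              System.err.println(e.getMessage)
              System.exit(e.exitCode)
          }
      }
      ```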
      
      Author: John Zhao <jzhao@alpinenow.com>
      
      Closes #490 from codeboyyong/jira_1516_systemexit_inyarnclient and squashes the following commits:
      
      138cb48 [John Zhao] [SPARK-1516] Throw exception in yarn client instead of running system.exit directly. All the changes are in the package "org.apache.spark.deploy.yarn": 1) Add a ClientException with an exitCode 2) Throw exceptions in ClientArguments and ClientBase instead of exiting directly 3) in Client's main method, catch the exception and exit with its exitCode.
    • [SPARK-2080] Yarn: report HS URL in client mode, correct user in cluster mode. · ecde5b83
      Marcelo Vanzin authored
      Yarn client mode was not setting the app's tracking URL to the
      History Server's URL when configured by the user. Now client mode
      behaves the same as cluster mode.
      
      In SparkContext.scala, the "user.name" system property had precedence
      over the SPARK_USER environment variable. This means that SPARK_USER
      was never used, since "user.name" is always set by the JVM. In Yarn
      cluster mode, this means the application always reported itself as
      being run by user "yarn" (or whatever user was running the Yarn NM).
      One could argue that the correct fix would be to use UGI.getCurrentUser()
      here, but at least for Yarn that will match what SPARK_USER is set
      to.
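      The precedence fix boils down to consulting the environment variable first (a one-line sketch, not the commit's verbatim code):

      ```scala
      // SPARK_USER wins; "user.name" is always set by the JVM, so it is only a fallback.
      val sparkUser = sys.env.getOrElse("SPARK_USER", sys.props("user.name"))
      ```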
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      This patch had conflicts when merged, resolved by
      Committer: Thomas Graves <tgraves@apache.org>
      
      Closes #1002 from vanzin/yarn-client-url and squashes the following commits:
      
      4046e04 [Marcelo Vanzin] Set HS link in yarn-alpha also.
      4c692d9 [Marcelo Vanzin] Yarn: report HS URL in client mode, correct user in cluster mode.
  12. Jun 11, 2014
    • SPARK-1639. Tidy up some Spark on YARN code · 2a4225dd
      Sandy Ryza authored
      This contains a bunch of small tidyings of the Spark on YARN code.
      
      I focused on the yarn stable code.  @tgravescs, let me know if you'd like me to make these for the alpha code as well.
      
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #561 from sryza/sandy-spark-1639 and squashes the following commits:
      
      72b6a02 [Sandy Ryza] Fix comment and set name on driver thread
      c2190b2 [Sandy Ryza] SPARK-1639. Tidy up some Spark on YARN code
  13. Jun 09, 2014
    • [SPARK-1522] : YARN ClientBase throws a NPE if there is no YARN Application CP · e2734476
      Bernardo Gomez Palacio authored
      The current implementation of ClientBase.getDefaultYarnApplicationClasspath inspects
      the MRJobConfig class for the field DEFAULT_YARN_APPLICATION_CLASSPATH when it should
      really be looking into YarnConfiguration. If the application configuration has no
      yarn.application.classpath defined, an NPE will be thrown.
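      A simplified sketch of the corrected lookup (illustrative only; a robust fix also has to tolerate YARN versions where the field is absent):

      ```scala
      import scala.util.Try
      import org.apache.hadoop.yarn.conf.YarnConfiguration

      // Read the default application classpath from YarnConfiguration, not
      // MRJobConfig, and fall back to empty rather than throwing an NPE.
      val defaultYarnCp: Seq[String] =
        Try(YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH.toSeq)
          .getOrElse(Seq.empty)
      ```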
      
      Additional Changes include:
      * Test Suite for ClientBase added
      
      [ticket: SPARK-1522] : https://issues.apache.org/jira/browse/SPARK-1522
      
      Author      : bernardo.gomezpalacio@gmail.com
      Testing     : SPARK_HADOOP_VERSION=2.3.0 SPARK_YARN=true ./sbt/sbt test
      
      Author: Bernardo Gomez Palacio <bernardo.gomezpalacio@gmail.com>
      
      Closes #433 from berngp/feature/SPARK-1522 and squashes the following commits:
      
      2c2e118 [Bernardo Gomez Palacio] [SPARK-1522]: YARN ClientBase throws a NPE if there is no YARN Application specific CP
  14. May 22, 2014
    • [SPARK-1870] Make spark-submit --jars work in yarn-cluster mode. · dba31402
      Xiangrui Meng authored
      Sends secondary jars to the distributed cache of all containers and adds the cached jars to the classpath before executors start. Tested on a YARN cluster (CDH-5.0).
      
      `spark-submit --jars` also works in standalone server and `yarn-client`. Thanks to @andrewor14 for testing!
      
      I removed "Doesn't work for drivers in standalone mode with "cluster" deploy mode." from `spark-submit`'s help message, though we haven't tested mesos yet.
      
      CC: @dbtsai @sryza
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #848 from mengxr/yarn-classpath and squashes the following commits:
      
      23e7df4 [Xiangrui Meng] rename spark.jar to __spark__.jar and app.jar to __app__.jar to avoid conflicts; append $CWD/ and $CWD/* to the classpath; remove unused methods
      a40f6ed [Xiangrui Meng] standalone -> cluster
      65e04ad [Xiangrui Meng] update spark-submit help message and add a comment for yarn-client
      11e5354 [Xiangrui Meng] minor changes
      3e7e1c4 [Xiangrui Meng] use sparkConf instead of hadoop conf
      dc3c825 [Xiangrui Meng] add secondary jars to classpath in yarn
  15. May 21, 2014
    • [Typo] Stoped -> Stopped · ba5d4a99
      Andrew Or authored
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #847 from andrewor14/yarn-typo and squashes the following commits:
      
      c1906af [Andrew Or] Stoped -> Stopped
  16. May 10, 2014
    • [SPARK-1774] Respect SparkSubmit --jars on YARN (client) · 83e0424d
      Andrew Or authored
      SparkSubmit ignores `--jars` for YARN client. This is a bug.
      
      This PR also automatically adds the application jar to `spark.jar`. Previously, when running as yarn-client, you had to specify the jar additionally through `--files` (because `--jars` didn't work). Now you don't have to specify it explicitly through either.
      
      Tested on a YARN cluster.
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #710 from andrewor14/yarn-jars and squashes the following commits:
      
      35d1928 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-jars
      c27bf6c [Andrew Or] For yarn-cluster and python, do not add primaryResource to spark.jar
      c92c5bf [Andrew Or] Minor cleanups
      269f9f3 [Andrew Or] Fix format
      013d840 [Andrew Or] Fix tests
      1407474 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-jars
      3bb75e8 [Andrew Or] Allow SparkSubmit --jars to take effect in yarn-client mode
  17. May 07, 2014
    • SPARK-1569 Spark on Yarn, authentication broken by pr299 · 4bec84b6
      Thomas Graves authored
      Pass the configs as java options, since the executor needs to know before it registers whether to create the connection using authentication or not. We could see about passing only the authentication configs, but for now I just had it pass them all.

      I also updated it to use a list to construct the command, making it the same as ClientBase and avoiding any issues with spaces.
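      A hedged sketch of what "pass the configs as java options" can look like (the helper name is illustrative; the squashed commits below narrow the filter to spark.auth* settings):

      ```scala
      import org.apache.spark.SparkConf

      // Render matching settings as -D JVM options for the executor launch
      // command, so they are in effect before the executor registers.
      def authJavaOpts(conf: SparkConf): Seq[String] =
        conf.getAll.toSeq.collect {
          case (k, v) if k.startsWith("spark.auth") => s"-D$k=$v"
        }
      ```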
      
      Author: Thomas Graves <tgraves@apache.org>
      
      Closes #649 from tgravescs/SPARK-1569 and squashes the following commits:
      
      0178ab8 [Thomas Graves] add akka settings
      22a8735 [Thomas Graves] Change to only pass spark.auth* configs
      8ccc1d4 [Thomas Graves] SPARK-1569 Spark on Yarn, authentication broken
  18. May 06, 2014
    • SPARK-1474: Spark on yarn assembly doesn't include AmIpFilter · 1e829905
      Thomas Graves authored
      We use org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter in Spark on YARN, but it is not included in the assembly jar.

      I tested this on a YARN cluster by removing the YARN jars from the classpath, and Spark runs fine now.
      
      Author: Thomas Graves <tgraves@apache.org>
      
      Closes #406 from tgravescs/SPARK-1474 and squashes the following commits:
      
      1548bf9 [Thomas Graves] SPARK-1474: Spark on yarn assembly doesn't include org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
  19. May 04, 2014
    • The default version of yarn is equal to the hadoop version · fb054322
      witgo authored
      This is a part of [PR 590](https://github.com/apache/spark/pull/590)
      
      Author: witgo <witgo@qq.com>
      
      Closes #626 from witgo/yarn_version and squashes the following commits:
      
      c390631 [witgo] restore  the yarn dependency declarations
      f8a4ad8 [witgo] revert remove the dependency of avro in yarn-alpha
      2df6cf5 [witgo] review commit
      a1d876a [witgo] review commit
      20e7e3e [witgo] review commit
      c76763b [witgo] The default value of yarn.version is equal to hadoop.version
  20. May 03, 2014
    • [WIP] SPARK-1676: Cache Hadoop UGIs by default to prevent FileSystem leak · 3d0a02df
      Thomas Graves authored
      Move the doAs in Executor higher up so that we only have one UGI and aren't leaking filesystems.
      Fix Spark on YARN to work when the cluster is running as user "yarn" but the clients are launched as the actual user and want to read/write to HDFS as that user.
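      A minimal sketch of the doAs pattern described above (the user lookup and placement are illustrative): one UGI is created for the submitting user, and the executor's work runs inside it instead of each task wrapping itself in its own UGI.

      ```scala
      import java.security.PrivilegedExceptionAction
      import org.apache.hadoop.security.UserGroupInformation

      val ugi = UserGroupInformation.createRemoteUser(
        sys.env.getOrElse("SPARK_USER", "yarn"))
      ugi.doAs(new PrivilegedExceptionAction[Unit] {
        override def run(): Unit = {
          // ... start the executor loop; FileSystem caches now key off this one UGI ...
        }
      })
      ```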
      
      Note this hasn't been fully tested yet.  Need to test in standalone mode.
      
      Putting this up for people to look at and possibly test.  I don't have access to a mesos cluster.
      
      This is alternative to https://github.com/apache/spark/pull/607
      
      Author: Thomas Graves <tgraves@apache.org>
      
      Closes #621 from tgravescs/SPARK-1676 and squashes the following commits:
      
      244d55a [Thomas Graves] fix line length
      44163d4 [Thomas Graves] Rework
      9398853 [Thomas Graves] change to have doAs in executor higher up.
  21. Apr 29, 2014
    • SPARK-1588. Restore SPARK_YARN_USER_ENV and SPARK_JAVA_OPTS for YARN. · bf8d0aa2
      Sandy Ryza authored
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #586 from sryza/sandy-spark-1588 and squashes the following commits:
      
      35eb38e [Sandy Ryza] Scalify
      b361684 [Sandy Ryza] SPARK-1588.  Restore SPARK_YARN_USER_ENV and SPARK_JAVA_OPTS for YARN.
    • Improved build configuration · 030f2c21
      witgo authored
      1. Fix SPARK-1441: compile error in Spark core with Hadoop 0.23.x
      2. Fix SPARK-1491: the Maven hadoop-provided profile fails to build
      3. Fix inconsistent version dependencies for org.scala-lang:* and org.apache.avro:*
      4. Reformat sql/catalyst/pom.xml, sql/hive/pom.xml, and sql/core/pom.xml (four-space indentation changed to two spaces)
      
      Author: witgo <witgo@qq.com>
      
      Closes #480 from witgo/format_pom and squashes the following commits:
      
      03f652f [witgo] review commit
      b452680 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
      bee920d [witgo] revert fix SPARK-1629: Spark Core missing commons-lang dependence
      7382a07 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
      6902c91 [witgo] fix SPARK-1629: Spark Core missing commons-lang dependence
      0da4bc3 [witgo] merge master
      d1718ed [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
      e345919 [witgo] add avro dependency to yarn-alpha
      77fad08 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
      62d0862 [witgo] Fix org.scala-lang: * inconsistent versions dependency
      1a162d7 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
      934f24d [witgo] review commit
      cf46edc [witgo] exclude jruby
      06e7328 [witgo] Merge branch 'SparkBuild' into format_pom
      99464d2 [witgo] fix maven hadoop-provided profile fails to build
      0c6c1fc [witgo] Fix compile spark core error with hadoop 0.23.x
      6851bec [witgo] Maintain consistent SparkBuild.scala, pom.xml
  22. Apr 28, 2014
    • SPARK-1652: Remove incorrect deprecation warning in spark-submit · 9f7a0951
      Patrick Wendell authored
      This is a straightforward fix.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      This patch had conflicts when merged, resolved by
      Committer: Patrick Wendell <pwendell@gmail.com>
      
      Closes #578 from pwendell/spark-submit-yarn and squashes the following commits:
      
      96027c7 [Patrick Wendell] Test fixes
      b5be173 [Patrick Wendell] Review feedback
      4ac9cac [Patrick Wendell] SPARK-1652: spark-submit for yarn prints warnings even though calling as expected
  23. Apr 25, 2014
    • SPARK-1607. HOTFIX: Fix syntax adapting Int result to Short · df6d8142
      Sean Owen authored
      Sorry folks. This should make the change for SPARK-1607 compile again. Verified this time with the yarn build enabled.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #556 from srowen/SPARK-1607.2 and squashes the following commits:
      
      e3fe7a3 [Sean Owen] Fix syntax adapting Int result to Short
    • SPARK-1607. Replace octal literals, removed in Scala 2.11, with hex literals · 6e101f11
      Sean Owen authored
      Octal literals like "0700" are deprecated in Scala 2.10, generating a warning. They have been removed entirely in 2.11. See https://issues.scala-lang.org/browse/SI-7618
      
      This change simply replaces two uses of octals with hex literals, which seemed the next-best representation since they express a bit mask (a file permission in particular).
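      For illustration (the values are generic, not necessarily the commit's): 0700 in octal is 0x1C0 in hex, and the final squashed commit below switches to Integer.parseInt with radix 8 instead.

      ```scala
      val permsHex   = 0x1C0                      // hex literal, still valid in Scala 2.11
      val permsOctal = Integer.parseInt("700", 8) // equivalent octal-as-string form
      assert(permsHex == permsOctal)              // both are decimal 448
      ```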
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #529 from srowen/SPARK-1607 and squashes the following commits:
      
      1ee0e67 [Sean Owen] Use Integer.parseInt(...,8) for octal literal instead of hex equivalent
      0102f3d [Sean Owen] Replace octal literals, removed in Scala 2.11, with hex literals
  24. Apr 24, 2014
    • SPARK-1586 Windows build fixes · 968c0187
      Mridul Muralidharan authored
      Unfortunately, this is not exhaustive; in particular, Hive tests still fail due to path issues.
      
      Author: Mridul Muralidharan <mridulm80@apache.org>
      
      This patch had conflicts when merged, resolved by
      Committer: Matei Zaharia <matei@databricks.com>
      
      Closes #505 from mridulm/windows_fixes and squashes the following commits:
      
      ef12283 [Mridul Muralidharan] Move to org.apache.commons.lang3 for StringEscapeUtils. Earlier version was buggy apparently
      cdae406 [Mridul Muralidharan] Remove leaked changes from > 2G fix branch
      3267f4b [Mridul Muralidharan] Fix build failures
      35b277a [Mridul Muralidharan] Fix Scalastyle failures
      bc69d14 [Mridul Muralidharan] Change from hardcoded path separator
      10c4d78 [Mridul Muralidharan] Use explicit encoding while using getBytes
      1337abd [Mridul Muralidharan] fix classpath while running in windows
    • Fix Scala Style · a03ac222
      Sandeep authored
      Any comments are welcome
      
      Author: Sandeep <sandeep@techaddict.me>
      
      Closes #531 from techaddict/stylefix-1 and squashes the following commits:
      
      7492730 [Sandeep] Pass 4
      98b2428 [Sandeep] fix rxin suggestions
      b5e2e6f [Sandeep] Pass 3
      05932d7 [Sandeep] fix if else styling 2
      08690e5 [Sandeep] fix if else styling
  25. Apr 22, 2014
    • Assorted clean-up for Spark-on-YARN. · 995fdc96
      Patrick Wendell authored
      In particular when HADOOP_CONF_DIR is not specified.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #488 from pwendell/hadoop-cleanup and squashes the following commits:
      
      fe95f13 [Patrick Wendell] Changes based on Andrew's feedback
      18d09c1 [Patrick Wendell] Review comments from Andrew
      17929cc [Patrick Wendell] Assorted clean-up for Spark-on-YARN.
    • Fix compilation on Hadoop 2.4.x. · 0ea0b1a2
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #483 from vanzin/yarn-2.4 and squashes the following commits:
      
      0fc57d8 [Marcelo Vanzin] Fix compilation on Hadoop 2.4.x.
  26. Apr 21, 2014
    • Clean up and simplify Spark configuration · fb98488f
      Patrick Wendell authored
      Over time, as we've added more deployment modes, user-facing configuration options in Spark have gotten a bit unwieldy. Going forward we'll advise all users to run `spark-submit` to launch applications. This is a WIP patch, but it makes the following improvements:
      
      1. Improved `spark-env.sh.template` which was missing a lot of things users now set in that file.
      2. Removes the shipping of SPARK_CLASSPATH, SPARK_JAVA_OPTS, and SPARK_LIBRARY_PATH to the executors on the cluster. This was an ugly hack. Instead it introduces config variables spark.executor.extraJavaOpts, spark.executor.extraLibraryPath, and spark.executor.extraClassPath.
      3. Adds ability to set these same variables for the driver using `spark-submit`.
      4. Allows you to load system properties from a `spark-defaults.conf` file when running `spark-submit`. This will allow setting both SparkConf options and other system properties utilized by `spark-submit`.
      5. Made `SPARK_LOCAL_IP` an environment variable rather than a SparkConf property. This is more consistent with it being set on each node.
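      A hedged example of the config variables from item 2 above, set programmatically (the values are illustrative; the same keys can go in `spark-defaults.conf`):

      ```scala
      import org.apache.spark.SparkConf

      val conf = new SparkConf()
        .set("spark.executor.extraJavaOpts", "-XX:+UseParallelGC")
        .set("spark.executor.extraClassPath", "/opt/extra/jars/*")
        .set("spark.executor.extraLibraryPath", "/opt/native/lib")
      ```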
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #299 from pwendell/config-cleanup and squashes the following commits:
      
      127f301 [Patrick Wendell] Improvements to testing
      a006464 [Patrick Wendell] Moving properties file template.
      b4b496c [Patrick Wendell] spark-defaults.properties -> spark-defaults.conf
      0086939 [Patrick Wendell] Minor style fixes
      af09e3e [Patrick Wendell] Mention config file in docs and clean-up docs
      b16e6a2 [Patrick Wendell] Cleanup of spark-submit script and Scala quick start guide
      af0adf7 [Patrick Wendell] Automatically add user jar
      a56b125 [Patrick Wendell] Responses to Tom's review
      d50c388 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into config-cleanup
      a762901 [Patrick Wendell] Fixing test failures
      ffa00fe [Patrick Wendell] Review feedback
      fda0301 [Patrick Wendell] Note
      308f1f6 [Patrick Wendell] Properly escape quotes and other clean-up for YARN
      e83cd8f [Patrick Wendell] Changes to allow re-use of test applications
      be42f35 [Patrick Wendell] Handle case where SPARK_HOME is not set
      c2a2909 [Patrick Wendell] Test compile fixes
      4ee6f9d [Patrick Wendell] Making YARN doc changes consistent
      afc9ed8 [Patrick Wendell] Cleaning up line limits and two compile errors.
      b08893b [Patrick Wendell] Additional improvements.
      ace4ead [Patrick Wendell] Responses to review feedback.
      b72d183 [Patrick Wendell] Review feedback for spark env file
      46555c1 [Patrick Wendell] Review feedback and import clean-ups
      437aed1 [Patrick Wendell] Small fix
      761ebcd [Patrick Wendell] Library path and classpath for drivers
      7cc70e4 [Patrick Wendell] Clean up terminology inside of spark-env script
      5b0ba8e [Patrick Wendell] Don't ship executor envs
      84cc5e5 [Patrick Wendell] Small clean-up
      1f75238 [Patrick Wendell] SPARK_JAVA_OPTS --> SPARK_MASTER_OPTS for master settings
      4982331 [Patrick Wendell] Remove SPARK_LIBRARY_PATH
      6eaf7d0 [Patrick Wendell] executorJavaOpts
      0faa3b6 [Patrick Wendell] Stash of adding config options in submit script and YARN
      ac2d65e [Patrick Wendell] Change spark.local.dir -> SPARK_LOCAL_DIRS
  27. Apr 17, 2014
    • SPARK-1408 Modify Spark on Yarn to point to the history server when app finishes · 0058b5d2
      Thomas Graves authored
      
      Note this is dependent on https://github.com/apache/spark/pull/204 to have a working history server, but there are no code dependencies.
      
      This also fixes SPARK-1288 (yarn stable finishApplicationMaster incomplete). Since I was in there, I made the diagnostic message get passed properly.
      
      Author: Thomas Graves <tgraves@apache.org>
      
      Closes #362 from tgravescs/SPARK-1408 and squashes the following commits:
      
      ec89705 [Thomas Graves] Fix typo.
      446122d [Thomas Graves] Make config yarn specific
      f5d5373 [Thomas Graves] SPARK-1408 Modify Spark on Yarn to point to the history server when app finishes
    • [SPARK-1395] Allow "local:" URIs to work on Yarn. · 69047506
      Marcelo Vanzin authored
      This only works for the three paths defined in the environment
      (SPARK_JAR, SPARK_YARN_APP_JAR and SPARK_LOG4J_CONF).
      
      Tested by running SparkPi with local: and file: URIs against a Yarn cluster (no "upload" shows up in the logs in the local case).
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #303 from vanzin/yarn-local and squashes the following commits:
      
      82219c1 [Marcelo Vanzin] [SPARK-1395] Allow "local:" URIs to work on Yarn.
  28. Apr 16, 2014
    • SPARK-1465: Spark compilation is broken with the latest hadoop-2.4.0 release · 725925cf
      xuan authored
      YARN-1824 changed the APIs (addToEnvironment, setEnvFromInputString) in Apps, which causes the Spark build to break when built against version 2.4.0. To fix this, Spark gets its own functions providing that functionality, so compilation doesn't break against 2.3 and other 2.x versions.
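      A hedged sketch of such a version-portable replacement (the helper name is illustrative): append to an environment variable using the platform path separator, without calling the YARN API that changed in 2.4.0.

      ```scala
      import scala.collection.mutable

      // Append value to env(key), separating entries with the platform's path separator.
      def addToEnv(env: mutable.Map[String, String], key: String, value: String): Unit = {
        val sep = java.io.File.pathSeparator
        env(key) = env.get(key).map(old => s"$old$sep$value").getOrElse(value)
      }
      ```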
      
      Author: xuan <xuan@MacBook-Pro.local>
      Author: xuan <xuan@macbook-pro.home>
      
      Closes #396 from xgong/master and squashes the following commits:
      
      42b5984 [xuan] Remove two extra imports
      bc0926f [xuan] Remove usage of org.apache.hadoop.util.Shell
      be89fa7 [xuan] fix Spark compilation is broken with the latest hadoop-2.4.0 release