  1. May 12, 2014
    • SPARK-1798. Tests should clean up temp files · 7120a297
      Sean Owen authored
      Three issues related to temp files that tests generate – these should be touched up for hygiene but are not urgent.
      
      Modules have a log4j.properties that directs the unit-test.log output file to a path like `[module]/target/unit-test.log`. But this ends up creating `[module]/[module]/target/unit-test.log` instead of the former.
      
      The `work/` directory is not deleted by `mvn clean`, either in the parent or in the modules. Neither is the `checkpoint/` directory created under the various external modules.
      
      Many tests create a temp directory, which is not usually deleted. This can be largely resolved by calling `deleteOnExit()` at creation and trying to call `Utils.deleteRecursively` consistently to clean up, sometimes in an `@After` method.
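      
      A minimal sketch of that pattern, assuming JUnit and Spark's existing `Utils.deleteRecursively` helper (suite and method names hypothetical):
      
      ```
      import java.io.File
      import org.junit.{After, Before}
      import org.apache.spark.util.Utils

      class TempDirHygieneSuite {
        private var tempDir: File = _

        @Before def setUp(): Unit = {
          tempDir = File.createTempFile("spark", "test")
          tempDir.delete()          // swap the temp file for a directory at the same path
          tempDir.mkdirs()
          tempDir.deleteOnExit()    // backstop in case the test aborts before @After runs
        }

        @After def tearDown(): Unit = {
          Utils.deleteRecursively(tempDir)  // eager cleanup after each test
        }
      }
      ```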
      
      _If anyone seconds the motion, I can create a more significant change that introduces a new test trait along the lines of `LocalSparkContext`, which provides management of temp directories for subclasses to take advantage of._
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #732 from srowen/SPARK-1798 and squashes the following commits:
      
      5af578e [Sean Owen] Try to consistently delete test temp dirs and files, and set deleteOnExit() for each
      b21b356 [Sean Owen] Remove work/ and checkpoint/ dirs with mvn clean
      bdd0f41 [Sean Owen] Remove duplicate module dir in log4j.properties output path for tests
  2. May 06, 2014
    • [SPARK-1549] Add Python support to spark-submit · 951a5d93
      Matei Zaharia authored
      This PR updates spark-submit to allow submitting Python scripts (currently only with deploy-mode=client, but that's all that was supported before) and updates the PySpark code to properly find various paths, etc. One significant change is that we assume we can always find the Python files either from the Spark assembly JAR (which will happen with the Maven assembly build in make-distribution.sh) or from SPARK_HOME (which will exist in local mode even if you use sbt assembly, and should be enough for testing). This means we no longer need a weird hack to modify the environment for YARN.
      
      This patch also updates the Python worker manager to run python with -u, which means unbuffered output (sent to our logs right away instead of some time after it was written); this should simplify debugging.
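      
      As a rough illustration only (module path illustrative, not the actual worker-factory code), the launch then looks like:
      
      ```
      // "-u" forces Python to run unbuffered, so worker output reaches the
      // logs immediately instead of sitting in a stdio buffer.
      val builder = new ProcessBuilder("python", "-u", "pyspark/worker.py")
      builder.redirectErrorStream(true)  // fold stderr into stdout for logging
      val workerProcess = builder.start()
      ```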
      
      In addition, it fixes https://issues.apache.org/jira/browse/SPARK-1709 by setting the main class from a JAR's Main-Class manifest attribute when the user does not specify one, and fixes a few help strings and style issues in spark-submit.
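      
      A hedged sketch of that manifest lookup (helper name hypothetical; error handling omitted):
      
      ```
      import java.util.jar.JarFile

      // Returns the Main-Class manifest attribute of a JAR, if present.
      def mainClassFromManifest(jarPath: String): Option[String] = {
        val jar = new JarFile(jarPath)
        try {
          Option(jar.getManifest).flatMap { m =>
            Option(m.getMainAttributes.getValue("Main-Class"))
          }
        } finally {
          jar.close()
        }
      }
      ```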
      
      In the future we may want to make the `pyspark` shell use spark-submit as well, but it seems unnecessary for 1.0.
      
      Author: Matei Zaharia <matei@databricks.com>
      
      Closes #664 from mateiz/py-submit and squashes the following commits:
      
      15e9669 [Matei Zaharia] Fix some uses of path.separator property
      051278c [Matei Zaharia] Small style fixes
      0afe886 [Matei Zaharia] Add license headers
      4650412 [Matei Zaharia] Add pyFiles to PYTHONPATH in executors, remove old YARN stuff, add tests
      15f8e1e [Matei Zaharia] Set PYTHONPATH in PythonWorkerFactory in case it wasn't set from outside
      47c0655 [Matei Zaharia] More work to make spark-submit work with Python:
      d4375bd [Matei Zaharia] Clean up description of spark-submit args a bit and add Python ones
  3. Apr 29, 2014
    • Improved build configuration · 030f2c21
      witgo authored
      1. Fix SPARK-1441: Spark core fails to compile with Hadoop 0.23.x.
      2. Fix SPARK-1491: the Maven hadoop-provided profile fails to build.
      3. Fix inconsistent dependency versions for org.scala-lang:* and org.apache.avro:*.
      4. Reformat sql/catalyst/pom.xml, sql/hive/pom.xml, and sql/core/pom.xml (four-space indentation changed to two spaces).
      
      Author: witgo <witgo@qq.com>
      
      Closes #480 from witgo/format_pom and squashes the following commits:
      
      03f652f [witgo] review commit
      b452680 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
      bee920d [witgo] revert fix SPARK-1629: Spark Core missing commons-lang dependence
      7382a07 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
      6902c91 [witgo] fix SPARK-1629: Spark Core missing commons-lang dependence
      0da4bc3 [witgo] merge master
      d1718ed [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
      e345919 [witgo] add avro dependency to yarn-alpha
      77fad08 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
      62d0862 [witgo] Fix org.scala-lang: * inconsistent versions dependency
      1a162d7 [witgo] Merge branch 'master' of https://github.com/apache/spark into format_pom
      934f24d [witgo] review commit
      cf46edc [witgo] exclude jruby
      06e7328 [witgo] Merge branch 'SparkBuild' into format_pom
      99464d2 [witgo] fix maven hadoop-provided profile fails to build
      0c6c1fc [witgo] Fix compile spark core error with hadoop 0.23.x
      6851bec [witgo] Maintain consistent SparkBuild.scala, pom.xml
  4. Apr 25, 2014
    • SPARK-1619 Launch spark-shell with spark-submit · dc3b640a
      Patrick Wendell authored
      This simplifies the shell a bunch and passes all arguments through to spark-submit.
      
      There is a tiny incompatibility with 0.9.1: you can no longer use `-c`, only `--cores`. However, spark-submit gives a good error message in this case; I don't think many people used `-c`, and it's a trivial change for users.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #542 from pwendell/spark-shell and squashes the following commits:
      
      9eb3e6f [Patrick Wendell] Updating Spark docs
      b552459 [Patrick Wendell] Andrew's feedback
      97720fa [Patrick Wendell] Review feedback
      aa2900b [Patrick Wendell] SPARK-1619 Launch spark-shell with spark-submit
  5. Apr 24, 2014
    • SPARK-1586 Windows build fixes · 968c0187
      Mridul Muralidharan authored
      Unfortunately, this is not exhaustive; in particular, the Hive tests still fail due to path issues.
      
      Author: Mridul Muralidharan <mridulm80@apache.org>
      
      This patch had conflicts when merged, resolved by
      Committer: Matei Zaharia <matei@databricks.com>
      
      Closes #505 from mridulm/windows_fixes and squashes the following commits:
      
      ef12283 [Mridul Muralidharan] Move to org.apache.commons.lang3 for StringEscapeUtils. Earlier version was buggy apparently
      cdae406 [Mridul Muralidharan] Remove leaked changes from > 2G fix branch
      3267f4b [Mridul Muralidharan] Fix build failures
      35b277a [Mridul Muralidharan] Fix Scalastyle failures
      bc69d14 [Mridul Muralidharan] Change from hardcoded path separator
      10c4d78 [Mridul Muralidharan] Use explicit encoding while using getBytes
      1337abd [Mridul Muralidharan] fix classpath while running in windows
    • Fix Scala Style · a03ac222
      Sandeep authored
      Any comments are welcome
      
      Author: Sandeep <sandeep@techaddict.me>
      
      Closes #531 from techaddict/stylefix-1 and squashes the following commits:
      
      7492730 [Sandeep] Pass 4
      98b2428 [Sandeep] fix rxin suggestions
      b5e2e6f [Sandeep] Pass 3
      05932d7 [Sandeep] fix if else styling 2
      08690e5 [Sandeep] fix if else styling
  6. Apr 19, 2014
    • REPL cleanup. · 3a390bfd
      Michael Armbrust authored
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #451 from marmbrus/replCleanup and squashes the following commits:
      
      088526a [Michael Armbrust] REPL cleanup.
  7. Apr 13, 2014
    • SPARK-1480: Clean up use of classloaders · 4bc07eeb
      Patrick Wendell authored
      The Spark codebase is a bit fast-and-loose when accessing classloaders and this has caused a few bugs to surface in master.
      
      This patch defines some utility methods for accessing classloaders. This makes the intention when accessing a classloader much more explicit in the code and fixes a few cases where the wrong one was chosen.
      
      case (a) -> We want the classloader that loaded Spark
      case (b) -> We want the context class loader, or if not present, we want (a)
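      
      A minimal Scala sketch of such utility methods, following the two cases above (names illustrative):
      
      ```
      object ClassLoaderUtils {
        // Case (a): the classloader that loaded the Spark classes themselves.
        def getSparkClassLoader: ClassLoader = getClass.getClassLoader

        // Case (b): the thread's context classloader, falling back to (a) if unset.
        def getContextOrSparkClassLoader: ClassLoader =
          Option(Thread.currentThread().getContextClassLoader)
            .getOrElse(getSparkClassLoader)
      }
      ```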
      
      This patch provides a better fix for SPARK-1403 (https://issues.apache.org/jira/browse/SPARK-1403) than the current work around, which it reverts. It also fixes a previously unreported bug that the `./spark-submit` script did not work for running with `local` master. It didn't work because the executor classloader did not properly delegate to the context class loader (if it is defined) and in local mode the context class loader is set by the `./spark-submit` script. A unit test is added for that case.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #398 from pwendell/class-loaders and squashes the following commits:
      
      b4a1a58 [Patrick Wendell] Minor clean up
      14f1272 [Patrick Wendell] SPARK-1480: Clean up use of classloaders
  8. Apr 10, 2014
    • Remove Unnecessary Whitespaces · 930b70f0
      Sandeep authored
      Stacked these together in one commit; otherwise they show up chunk by chunk in different commits.
      
      Author: Sandeep <sandeep@techaddict.me>
      
      Closes #380 from techaddict/white_space and squashes the following commits:
      
      b58f294 [Sandeep] Remove Unnecessary Whitespaces
    • [SPARK-1276] Add a HistoryServer to render persisted UI · 79820fe8
      Andrew Or authored
      The new feature of event logging, introduced in #42, allows the user to persist the details of his/her Spark application to storage, and later replay these events to reconstruct an after-the-fact SparkUI.
      Currently, however, a persisted UI can only be rendered through the standalone Master. This greatly limits the use case of this new feature as many people also run Spark on Yarn / Mesos.
      
      This PR introduces a new entity called the HistoryServer, which, given a log directory, keeps track of all completed applications independently of a Spark Master. Unlike the Master, the HistoryServer need not be running while the application is still running. It is relatively lightweight in that it only maintains static information about applications and performs no scheduling.
      
      To quickly test it out, generate event logs with `spark.eventLog.enabled=true` and run `sbin/start-history-server.sh <log-dir-path>`. Your HistoryServer awaits on port 18080.
      
      Comments and feedback are most welcome.
      
      ---
      
      A few other changes introduced in this PR include refactoring the WebUI interface, which is beginning to have a lot of duplicate code now that we have added more functionality to it. Two new SparkListenerEvents have been introduced (SparkListenerApplicationStart/End) to keep track of application name and start/finish times. This PR also clarifies the semantics of the ReplayListenerBus introduced in #42.
      
      A potential TODO in the future (not part of this PR) is to render live applications in addition to just completed applications. This is useful when applications fail, a condition that our current HistoryServer does not handle unless the user manually signals application completion (by creating the APPLICATION_COMPLETION file). Handling live applications becomes significantly more challenging, however, because it is now necessary to render the same SparkUI multiple times. To avoid reading the entire log every time, which is inefficient, we must handle reading the log from where we previously left off, but this becomes fairly complicated because we must deal with the arbitrary behavior of each input stream.
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #204 from andrewor14/master and squashes the following commits:
      
      7b7234c [Andrew Or] Finished -> Completed
      b158d98 [Andrew Or] Address Patrick's comments
      69d1b41 [Andrew Or] Do not block on posting SparkListenerApplicationEnd
      19d5dd0 [Andrew Or] Merge github.com:apache/spark
      f7f5bf0 [Andrew Or] Make history server's web UI port a Spark configuration
      2dfb494 [Andrew Or] Decouple checking for application completion from replaying
      d02dbaa [Andrew Or] Expose Spark version and include it in event logs
      2282300 [Andrew Or] Add documentation for the HistoryServer
      567474a [Andrew Or] Merge github.com:apache/spark
      6edf052 [Andrew Or] Merge github.com:apache/spark
      19e1fb4 [Andrew Or] Address Thomas' comments
      248cb3d [Andrew Or] Limit number of live applications + add configurability
      a3598de [Andrew Or] Do not close file system with ReplayBus + fix bind address
      bc46fc8 [Andrew Or] Merge github.com:apache/spark
      e2f4ff9 [Andrew Or] Merge github.com:apache/spark
      050419e [Andrew Or] Merge github.com:apache/spark
      81b568b [Andrew Or] Fix strange error messages...
      0670743 [Andrew Or] Decouple page rendering from loading files from disk
      1b2f391 [Andrew Or] Minor changes
      a9eae7e [Andrew Or] Merge branch 'master' of github.com:apache/spark
      d5154da [Andrew Or] Styling and comments
      5dbfbb4 [Andrew Or] Merge branch 'master' of github.com:apache/spark
      60bc6d5 [Andrew Or] First complete implementation of HistoryServer (only for finished apps)
      7584418 [Andrew Or] Report application start/end times to HistoryServer
      8aac163 [Andrew Or] Add basic application table
      c086bd5 [Andrew Or] Add HistoryServer and scripts ++ Refactor WebUI interface
  9. Apr 09, 2014
    • Spark-939: allow user jars to take precedence over spark jars · fa0524fd
      Holden Karau authored
      I still need to do a small bit of refactoring (mostly the one Java file, which I'll switch back to a Scala file and use in both of the class loaders), but comments on other things I should do would be great.
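      
      In general, "user jars first" means a child-first classloader along these lines (a sketch only; class name hypothetical and details differ from the actual patch):
      
      ```
      import java.net.{URL, URLClassLoader}

      class UserFirstClassLoader(userJars: Array[URL], sparkLoader: ClassLoader)
          extends URLClassLoader(userJars, null) {  // null parent: consult user jars first
        override def loadClass(name: String): Class[_] =
          try {
            super.loadClass(name)          // resolve from the user's jars if possible
          } catch {
            case _: ClassNotFoundException =>
              sparkLoader.loadClass(name)  // otherwise fall back to Spark's classes
          }
      }
      ```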
      
      Author: Holden Karau <holden@pigscanfly.ca>
      
      Closes #217 from holdenk/spark-939-allow-user-jars-to-take-precedence-over-spark-jars and squashes the following commits:
      
      cf0cac9 [Holden Karau] Fix the executorclassloader
      1955232 [Holden Karau] Fix long line in TestUtils
      8f89965 [Holden Karau] Fix tests for new class name
      7546549 [Holden Karau] CR feedback, merge some of the testutils methods down, rename the classloader
      644719f [Holden Karau] User the class generator for the repl class loader tests too
      f0b7114 [Holden Karau] Fix the core/src/test/scala/org/apache/spark/executor/ExecutorURLClassLoaderSuite.scala tests
      204b199 [Holden Karau] Fix the generated classes
      9f68f10 [Holden Karau] Start rewriting the ExecutorURLClassLoaderSuite to not use the hard coded classes
      858aba2 [Holden Karau] Remove a bunch of test junk
      261aaee [Holden Karau] simplify executorurlclassloader a bit
      7a7bf5f [Holden Karau] CR feedback
      d4ae848 [Holden Karau] rewrite component into scala
      aa95083 [Holden Karau] CR feedback
      7752594 [Holden Karau] re-add https comment
      a0ef85a [Holden Karau] Fix style issues
      125ea7f [Holden Karau] Easier to just remove those files, we don't need them
      bb8d179 [Holden Karau] Fix issues with the repl class loader
      241b03d [Holden Karau] fix my rat excludes
      a343350 [Holden Karau] Update rat-excludes and remove a useless file
      d90d217 [Holden Karau] Fix fall back with custom class loader and add a test for it
      4919bf9 [Holden Karau] Fix parent calling class loader issue
      8a67302 [Holden Karau] Test are good
      9e2d236 [Holden Karau] It works comrade
      691ee00 [Holden Karau] It works ish
      dc4fe44 [Holden Karau] Does not depend on being in my home directory
      47046ff [Holden Karau] Remove bad import
      22d83cb [Holden Karau] Add a test suite for the executor url class loader suite
      7ef4628 [Holden Karau] Clean up
      792d961 [Holden Karau] Almost works
      16aecd1 [Holden Karau] Doesn't quite work
      8d2241e [Holden Karau] Add a FakeClass for testing ClassLoader precedence options
      648b559 [Holden Karau] Both class loaders compile. Now for testing
      e1d9f71 [Holden Karau] One loader workers.
  10. Apr 07, 2014
    • SPARK-1099: Introduce local[*] mode to infer number of cores · 0307db0f
      Aaron Davidson authored
      This is the default mode for running spark-shell and pyspark, intended to allow users running spark for the first time to see the performance benefits of using multiple cores, while not breaking backwards compatibility for users who use "local" mode and expect exactly 1 core.
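      
      A sketch of how a master string could map to a thread count under this scheme (helper name illustrative):
      
      ```
      val LocalN = """local\[([0-9]+)\]""".r

      def coresFor(master: String): Int = master match {
        case "local"    => 1                                          // unchanged: exactly 1 core
        case "local[*]" => Runtime.getRuntime.availableProcessors()   // infer from the machine
        case LocalN(n)  => n.toInt                                    // explicit count
      }
      ```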
      
      Author: Aaron Davidson <aaron@databricks.com>
      
      Closes #182 from aarondav/110 and squashes the following commits:
      
      a88294c [Aaron Davidson] Rebased changes for new spark-shell
      a9f393e [Aaron Davidson] SPARK-1099: Introduce local[*] mode to infer number of cores
  11. Mar 28, 2014
    • SPARK-1096, a space after comment start style checker. · 60abc252
      Prashant Sharma authored
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #124 from ScrapCodes/SPARK-1096/scalastyle-comment-check and squashes the following commits:
      
      214135a [Prashant Sharma] Review feedback.
      5eba88c [Prashant Sharma] Fixed style checks for ///+ comments.
      e54b2f8 [Prashant Sharma] improved message, work around.
      83e7144 [Prashant Sharma] removed dependency on scalastyle in plugin, since the scalastyle sbt plugin already depends on the right version. In case we update the plugin, we will have to adjust our spark-style project to depend on the right scalastyle version.
      810a1d6 [Prashant Sharma] SPARK-1096, a space after comment style checker.
      ba33193 [Prashant Sharma] scala style as a project
    • [SPARK-1210] Prevent ContextClassLoader of Actor from becoming ClassLoader of Executor · 3d89043b
      Takuya UESHIN authored
      
      Constructor of `org.apache.spark.executor.Executor` should not set the context class loader of the current thread, which is the backend Actor's thread.
      
      Run the following code in local-mode REPL.
      
      ```
      scala> case class Foo(i: Int)
      scala> val ret = sc.parallelize((1 to 100).map(Foo), 10).collect
      ```
      
      This causes errors as follows:
      
      ```
      ERROR actor.OneForOneStrategy: [L$line5.$read$$iwC$$iwC$$iwC$$iwC$Foo;
      java.lang.ArrayStoreException: [L$line5.$read$$iwC$$iwC$$iwC$$iwC$Foo;
           at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
           at org.apache.spark.SparkContext$$anonfun$runJob$3.apply(SparkContext.scala:870)
           at org.apache.spark.SparkContext$$anonfun$runJob$3.apply(SparkContext.scala:870)
           at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:56)
           at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:859)
           at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:616)
           at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207)
           at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
           at akka.actor.ActorCell.invoke(ActorCell.scala:456)
           at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
           at akka.dispatch.Mailbox.run(Mailbox.scala:219)
           at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
           at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
           at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
           at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
           at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      ```
      
      This is because the class loader used to deserialize the resulting `Foo` instances might be different from the backend Actor's, and the Actor's class loader should be the same as the Driver's.
      
      Author: Takuya UESHIN <ueshin@happy-camper.st>
      
      Closes #15 from ueshin/wip/wrongcontextclassloader and squashes the following commits:
      
      d79e8c0 [Takuya UESHIN] Change a parent class loader of ExecutorURLClassLoader.
      c6c09b6 [Takuya UESHIN] Add a test to collect objects of class defined in repl.
      43e0feb [Takuya UESHIN] Prevent ContextClassLoader of Actor from becoming ClassLoader of Executor.
  12. Mar 26, 2014
    • SPARK-1325. The maven build error for Spark Tools · 1fa48d94
      Sean Owen authored
      This is just a slight variation on https://github.com/apache/spark/pull/234 and an alternative suggestion for SPARK-1325. `scala-actors` is not necessary. `SparkBuild.scala` should be updated to reflect the direct dependency on `scala-reflect` and `scala-compiler`. And the `repl` build, which has the same dependencies, should also be consistent between Maven / SBT.
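      
      In sbt form, the direct dependencies in question would look roughly like this (Scala version illustrative):
      
      ```
      libraryDependencies ++= Seq(
        "org.scala-lang" % "scala-reflect"  % "2.10.4",
        "org.scala-lang" % "scala-compiler" % "2.10.4"
      )
      ```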
      
      Author: Sean Owen <sowen@cloudera.com>
      Author: witgo <witgo@qq.com>
      
      Closes #240 from srowen/SPARK-1325 and squashes the following commits:
      
      25bd7db [Sean Owen] Add necessary dependencies scala-reflect and scala-compiler to tools. Update repl dependencies, which are similar, to be consistent between Maven / SBT in this regard too.
  13. Mar 09, 2014
    • SPARK-782 Clean up for ASM dependency. · b9be1609
      Patrick Wendell authored
      This makes two changes.
      
      1) Spark uses the shaded version of asm that is (conveniently) published
         with Kryo.
      2) Existing exclude rules around asm are updated to reflect the new groupId
         of `org.ow2.asm`. The groupId change had made all of the old rules fail
         against newer Hadoop versions that pull in new asm versions.
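      
      A hedged sbt sketch of point 2 (Hadoop version illustrative): excludes now have to name the reorganized groupId alongside the legacy one.
      
      ```
      libraryDependencies += ("org.apache.hadoop" % "hadoop-client" % "2.2.0")
        .excludeAll(
          ExclusionRule(organization = "asm"),         // pre-3.x coordinates
          ExclusionRule(organization = "org.ow2.asm")  // reorganized groupId
        )
      ```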
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #100 from pwendell/asm and squashes the following commits:
      
      9235f3f [Patrick Wendell] SPARK-782 Clean up for ASM dependency.
  14. Mar 08, 2014
    • SPARK-1193. Fix indentation in pom.xmls · a99fb374
      Sandy Ryza authored
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #91 from sryza/sandy-spark-1193 and squashes the following commits:
      
      a878124 [Sandy Ryza] SPARK-1193. Fix indentation in pom.xmls
  15. Mar 06, 2014
    • SPARK-1189: Add Security to Spark - Akka, Http, ConnectionManager, UI use servlets · 7edbea41
      Thomas Graves authored
      Resubmit of pull request; was https://github.com/apache/incubator-spark/pull/332.
      
      Author: Thomas Graves <tgraves@apache.org>
      
      Closes #33 from tgravescs/security-branch-0.9-with-client-rebase and squashes the following commits:
      
      dfe3918 [Thomas Graves] Fix merge conflict since startUserClass now using runAsUser
      05eebed [Thomas Graves] Fix dependency lost in upmerge
      d1040ec [Thomas Graves] Fix up various imports
      05ff5e0 [Thomas Graves] Fix up imports after upmerging to master
      ac046b3 [Thomas Graves] Merge remote-tracking branch 'upstream/master' into security-branch-0.9-with-client-rebase
      13733e1 [Thomas Graves] Pass securityManager and SparkConf around where we can. Switch to use sparkConf for reading config wherever possible. Added ConnectionManagerSuite unit tests.
      4a57acc [Thomas Graves] Change UI createHandler routines to createServlet since they now return servlets
      2f77147 [Thomas Graves] Rework from comments
      50dd9f2 [Thomas Graves] fix header in SecurityManager
      ecbfb65 [Thomas Graves] Fix spacing and formatting
      b514bec [Thomas Graves] Fix reference to config
      ed3d1c1 [Thomas Graves] Add security.md
      6f7ddf3 [Thomas Graves] Convert SaslClient and SaslServer to scala, change spark.authenticate.ui to spark.ui.acls.enable, and fix up various other things from review comments
      2d9e23e [Thomas Graves] Merge remote-tracking branch 'upstream/master' into security-branch-0.9-with-client-rebase_rework
      5721c5a [Thomas Graves] update AkkaUtilsSuite test for the actorSelection changes, fix typos based on comments, and remove extra lines I missed in rebase from AkkaUtils
      f351763 [Thomas Graves] Add Security to Spark - Akka, Http, ConnectionManager, UI to use servlets
  16. Mar 02, 2014
    • SPARK-1121: Include avro for yarn-alpha builds · c3f5e075
      Patrick Wendell authored
      This lets us explicitly include Avro based on a profile for 0.23.X
      builds. It makes me sad how convoluted it is to express this logic
      in Maven. @tgraves and @sryza curious if this works for you.
      
      I'm also considering just reverting to how it was before. The only
      real problem was that Spark advertised a dependency on Avro
      even though it only really depends transitively on Avro through
      other deps.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #49 from pwendell/avro-build-fix and squashes the following commits:
      
      8d6ee92 [Patrick Wendell] SPARK-1121: Add avro to yarn-alpha profile
    • Remove remaining references to incubation · 1fd2bfd3
      Patrick Wendell authored
      This removes some loose ends not caught by the other (incubating -> tlp) patches. @markhamstra this updates the version as you mentioned earlier.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #51 from pwendell/tlp and squashes the following commits:
      
      d553b1b [Patrick Wendell] Remove remaining references to incubation
  17. Feb 23, 2014
    • SPARK-1071: Tidy logging strategy and use of log4j · c0ef3afa
      Sean Owen authored
      Prompted by a recent thread on the mailing list, I tried and failed to see if Spark can be made independent of log4j. There are a few cases where control of the underlying logging is pretty useful, and to do that, you have to bind to a specific logger.
      
      Instead I propose some tidying that leaves Spark's use of log4j, but gets rid of warnings and should still enable downstream users to switch. The idea is to pipe everything (except log4j) through SLF4J, and have Spark use SLF4J directly when logging, and where Spark needs to output info (REPL and tests), bind from SLF4J to log4j.
      
      This leaves the same behavior in Spark. It means that downstream users who want to use something other than log4j should:
      
      - Exclude dependencies on log4j, slf4j-log4j12 from Spark
      - Include dependency on log4j-over-slf4j
      - Include dependency on another logger X, and another slf4j-X
      - Recreate any log configuration that Spark provides, where needed, in the other logger's config
      
      That sounds about right.
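      
      A hedged sbt sketch of those steps (versions illustrative; logback standing in for "another logger X"):
      
      ```
      libraryDependencies += ("org.apache.spark" %% "spark-core" % "1.0.0")
        .excludeAll(
          ExclusionRule("log4j", "log4j"),
          ExclusionRule("org.slf4j", "slf4j-log4j12")
        )
      libraryDependencies ++= Seq(
        "org.slf4j" % "log4j-over-slf4j" % "1.7.5",     // route log4j calls into SLF4J
        "ch.qos.logback" % "logback-classic" % "1.1.2"  // the replacement backend
      )
      ```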
      
      Here are the key changes:
      
      - Include the jcl-over-slf4j shim everywhere by depending on it in core.
      - Exclude dependencies on commons-logging from third-party libraries.
      - Include the jul-to-slf4j shim everywhere by depending on it in core.
      - Exclude slf4j-* dependencies from third-party libraries to prevent collision or warnings
      - Added missing slf4j-log4j12 binding to GraphX, Bagel module tests
      
      And minor/incidental changes:
      
      - Update to SLF4J 1.7.5, which happily matches Hadoop 2’s version and is a recommended update over 1.7.2
      - (Remove a duplicate HBase dependency declaration in SparkBuild.scala)
      - (Remove a duplicate mockito dependency declaration that was causing warnings and bugging me)
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #570 from srowen/SPARK-1071 and squashes the following commits:
      
      52eac9f [Sean Owen] Add slf4j-over-log4j12 dependency to core (non-test) and remove it from things that depend on core.
      77a7fa9 [Sean Owen] SPARK-1071: Tidy logging strategy and use of log4j
  18. Feb 17, 2014
    • [SPARK-1090] improvement on spark_shell (help information, configure memory) · e0d49ad2
      CodingCat authored
      https://spark-project.atlassian.net/browse/SPARK-1090
      
      spark-shell should print help information about its parameters and should allow the user to configure executor memory; there is currently no documentation about how to set --cores/-c in spark-shell.
      
      Users should also be able to set executor memory through command-line options.
      
      In this PR I also check the format of the options passed by the user
      
      Author: CodingCat <zhunansjtu@gmail.com>
      
      Closes #599 from CodingCat/spark_shell_improve and squashes the following commits:
      
      de5aa38 [CodingCat] add parameter to set driver memory
      915cbf8 [CodingCat] improvement on spark_shell (help information, configure memory)
  19. Feb 09, 2014
    • Merge pull request #557 from ScrapCodes/style. Closes #557. · b69f8b2a
      Patrick Wendell authored
      SPARK-1058, Fix Style Errors and Add Scala Style to Spark Build.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      Author: Prashant Sharma <scrapcodes@gmail.com>
      
      == Merge branch commits ==
      
      commit 1a8bd1c059b842cb95cc246aaea74a79fec684f4
      Author: Prashant Sharma <scrapcodes@gmail.com>
      Date:   Sun Feb 9 17:39:07 2014 +0530
      
          scala style fixes
      
      commit f91709887a8e0b608c5c2b282db19b8a44d53a43
      Author: Patrick Wendell <pwendell@gmail.com>
      Date:   Fri Jan 24 11:22:53 2014 -0800
      
          Adding scalastyle snapshot
  20. Feb 08, 2014
    • Merge pull request #542 from markhamstra/versionBump. Closes #542. · c2341c92
      Mark Hamstra authored
      Version number to 1.0.0-SNAPSHOT
      
      Since 0.9.0-incubating is done and out the door, we shouldn't be building 0.9.0-incubating-SNAPSHOT anymore.
      
      @pwendell
      
      Author: Mark Hamstra <markhamstra@gmail.com>
      
      == Merge branch commits ==
      
      commit 1b00a8a7c1a7f251b4bb3774b84b9e64758eaa71
      Author: Mark Hamstra <markhamstra@gmail.com>
      Date:   Wed Feb 5 09:30:32 2014 -0800
      
          Version number to 1.0.0-SNAPSHOT
  21. Jan 10, 2014
    • Revert GraphX changes to SparkILoopInit · 0ca18b8b
      Ankur Dave authored
      The changes were to support a custom banner in spark-shell for use by
      graphx-shell, but once GraphX is merged into Spark, a separate shell
      will be unnecessary.
  22. Jan 03, 2014
    • Added ‘-i’ command line option to spark REPL. · 0b6db8c1
      Luca Rosellini authored
      We had to create a new implementation of both scala.tools.nsc.CompilerCommand and scala.tools.nsc.Settings, because using scala.tools.nsc.GenericRunnerSettings would bring in other options (-howtorun, -save and -execute) which don’t make sense in Spark.
      Any new Spark-specific command line option can now be added to the org.apache.spark.repl.SparkRunnerSettings class.
      
      Since the behavior of loading a script from the command line should be the same as loading it with the `:load` command inside the shell, the script should be loaded when the SparkContext is available; that's why we had to move the call to `loadfiles(settings)` _after_ the call to postInitialization(). This still doesn't work if `isAsync = true`.
    • fixed review comments · 94f2fffa
      Prashant Sharma authored
  23. Dec 30, 2013
    • SPARK-1008: Logging improvements · cffe1c1d
      Patrick Wendell authored
      1. Adds a default log4j file that gets loaded if users haven't specified a log4j file.
      2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings
         after building with SBT (and I've seen similar warnings on the mailing list).
  24. Dec 28, 2013
    • Various fixes to configuration code · 642029e7
      Matei Zaharia authored
      - Got rid of global SparkContext.globalConf
      - Pass SparkConf to serializers and compression codecs
      - Made SparkConf public instead of private[spark]
      - Improved API of SparkContext and SparkConf
      - Switched executor environment vars to be passed through SparkConf
      - Fixed some places that were still using system properties
      - Fixed some tests, though others are still failing
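      
      A sketch of the reworked, now-public SparkConf API summarized above (values illustrative):
      
      ```
      import org.apache.spark.{SparkConf, SparkContext}

      // Configuration is built explicitly and handed to the SparkContext,
      // instead of flowing through global state or system properties.
      val conf = new SparkConf()
        .setMaster("local[2]")
        .setAppName("conf-example")
        .set("spark.executor.memory", "1g")
        .setExecutorEnv("MY_ENV_VAR", "value")  // executor env vars ride on SparkConf now
      val sc = new SparkContext(conf)
      ```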
      
      This still fails several tests in core, repl and streaming, likely due
      to properties not being set or cleared correctly (some of the tests run
      fine in isolation).