  1. Jun 02, 2016
    • [SPARK-15322][SQL][FOLLOWUP] Use the new long accumulator for old int accumulators. · 252417fa
      hyukjinkwon authored
      ## What changes were proposed in this pull request?
      
      This PR corrects the remaining cases that still use the old accumulators (see the sketch after the list below).
      
      This does not change some old accumulator usages below:
      
      - `ImplicitSuite.scala` - Tests dedicated to the old accumulator, for implicits with `AccumulatorParam`
      
      - `AccumulatorSuite.scala` - Tests dedicated to the old accumulator
      
      - `JavaSparkContext.scala` - Supports the old accumulators in the Java API
      
      - `debug.package.scala` - Usage with `HashSet[String]`. There is currently no new-accumulator implementation for this; an anonymous class could be written for it, but that does not seem worth the extra code.
      
      - `SQLMetricsSuite.scala` - Uses the old accumulator to check type boxing. The new accumulator does not appear to require type boxing in this case, whereas the old one does (due to its use of generics).
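      
      A minimal sketch of the migration, assuming an active `sc` as in spark-shell (`longAccumulator` is the new typed API; `accumulator(0)` is the deprecated form being replaced):
      
      ```scala
      // Old API: Accumulator[Int] backed by an implicit AccumulatorParam (deprecated).
      val oldCount = sc.accumulator(0)
      sc.parallelize(1 to 10).foreach(_ => oldCount += 1)
      
      // New API: the typed LongAccumulator that the remaining cases switch to.
      val newCount = sc.longAccumulator("count")
      sc.parallelize(1 to 10).foreach(_ => newCount.add(1))
      println(newCount.value)  // 10
      ```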
      
      ## How was this patch tested?
      
      Existing tests cover this.
      
      Author: hyukjinkwon <gurwls223@gmail.com>
      
      Closes #13434 from HyukjinKwon/accum.
  2. Apr 25, 2016
    • [SPARK-14828][SQL] Start SparkSession in REPL instead of SQLContext · 34336b62
      Andrew Or authored
      ## What changes were proposed in this pull request?
      
      ```
      Spark context available as 'sc' (master = local[*], app id = local-1461283768192).
      Spark session available as 'spark'.
      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /___/ .__/\_,_/_/ /_/\_\   version 2.0.0-SNAPSHOT
            /_/
      
      Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51)
      Type in expressions to have them evaluated.
      Type :help for more information.
      
      scala> sql("SHOW TABLES").collect()
      16/04/21 17:09:39 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
      16/04/21 17:09:39 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
      res0: Array[org.apache.spark.sql.Row] = Array([src,false])
      
      scala> sql("SHOW TABLES").collect()
      res1: Array[org.apache.spark.sql.Row] = Array([src,false])
      
      scala> spark.createDataFrame(Seq((1, 1), (2, 2), (3, 3)))
      res2: org.apache.spark.sql.DataFrame = [_1: int, _2: int]
      ```
      
      Hive support is loaded lazily.
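      
      As a hedged illustration (assuming the 2.0 shell's defaults shown above, where both values are predefined), the new session owns the context, so the two entry points stay consistent:
      
      ```scala
      spark                      // org.apache.spark.sql.SparkSession
      spark.sparkContext eq sc   // true: 'sc' is the context owned by 'spark'
      spark.sql("SHOW TABLES")   // same as the bare sql(...) imported from the session
      ```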
      
      ## How was this patch tested?
      
      Manual.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #12589 from andrewor14/spark-session-repl.
  3. Apr 22, 2016
    • [SPARK-10001] Consolidate Signaling and SignalLogger. · c089c6f4
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      This is a follow-up to #12557, with the following changes:
      
      1. Fixes some of the style issues.
      2. Merges Signaling and SignalLogger into a new class called SignalUtils. It was pretty confusing to have Signaling and Signal in one file, and also to have two classes named Signaling, with one calling the other.
      3. Made logging registration idempotent (see the sketch below).
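      
      A minimal sketch of the idempotency idea (hypothetical names, not Spark's actual SignalUtils):
      
      ```scala
      import java.util.concurrent.atomic.AtomicBoolean
      
      object IdempotentRegistration {
        private val registered = new AtomicBoolean(false)
      
        // compareAndSet flips the flag exactly once, so repeated calls are no-ops.
        def registerLogger(install: () => Unit): Unit = {
          if (registered.compareAndSet(false, true)) {
            install()
          }
        }
      }
      ```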
      
      ## How was this patch tested?
      N/A.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #12605 from rxin/SPARK-10001.
    • [SPARK-10001] [CORE] Interrupt tasks in repl with Ctrl+C · 80127935
      Jakob Odersky authored
      ## What changes were proposed in this pull request?
      
      Improve signal handling to allow interrupting running tasks from the REPL (with Ctrl+C).
      If no tasks are running or Ctrl+C is pressed twice, the signal is forwarded to the default handler resulting in the usual termination of the application.
      
      This PR is a rewrite of #8216 (and therefore closes it), as per piaozhexiu's request.
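      
      A rough sketch of the best-effort mechanism using `sun.misc.Signal` (the method name and the `cancelRunningJobs` callback are illustrative, not Spark's exact code):
      
      ```scala
      import sun.misc.{Signal, SignalHandler}
      
      // Signal.handle returns the previously installed handler, so we can chain to it
      // when there is nothing to cancel (e.g. on a second Ctrl+C), which restores the
      // usual termination behavior.
      def trapSigint(cancelRunningJobs: () => Boolean): Unit = {
        var previous: SignalHandler = null
        previous = Signal.handle(new Signal("INT"), new SignalHandler {
          override def handle(sig: Signal): Unit = {
            if (!cancelRunningJobs()) previous.handle(sig)
          }
        })
      }
      ```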
      
      ## How was this patch tested?
      Signal handling is not easily testable, so no unit tests were added. Nevertheless, the new functionality is implemented in a best-effort manner, soft-failing when signals aren't available on a given OS.
      
      Author: Jakob Odersky <jakob@odersky.com>
      
      Closes #12557 from jodersky/SPARK-10001-sigint.
  4. Apr 20, 2016
    • [SPARK-14725][CORE] Remove HttpServer class · 90cbc82f
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      This proposal removes the `HttpServer` class. With internal file/jar/class transmission moved to the RPC layer, there is no longer any code using `HttpServer`, so this PR removes it.
      
      ## How was this patch tested?
      
      Unit tests were verified locally.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #12526 from jerryshao/SPARK-14725.
  5. Apr 14, 2016
    • [SPARK-14558][CORE] In ClosureCleaner, clean the outer pointer if it's a REPL line object · 1d04c86f
      Wenchen Fan authored
      ## What changes were proposed in this pull request?
      
      When we clean a closure, if its outermost parent is not a closure, we don't clone and clean it, as cloning a user's objects is dangerous. However, if it's a REPL line object, which may carry a lot of unnecessary references (like the Hadoop conf, the Spark conf, etc.), we should clean it, since it's not a user object.
      
      This PR improves the check for user objects to exclude REPL line objects (a toy illustration follows).
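      
      A toy illustration of the problem (hypothetical class, not Spark source): every shell statement is compiled into a synthetic wrapper object, so a closure over a shell-defined value captures the whole wrapper.
      
      ```scala
      // Stand-in for a REPL line object such as $line3.$read.
      class LineWrapperSketch {
        val unrelatedBaggage = new Array[Byte](1 << 20)  // e.g. confs the line happens to hold
        val factor = 2                                   // the only field the closure needs
        // The returned function captures `this`, dragging unrelatedBaggage along
        // unless the ClosureCleaner nulls out what isn't actually referenced.
        def makeClosure: Int => Int = x => x * factor
      }
      ```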
      
      ## How was this patch tested?
      
      existing tests.
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #12327 from cloud-fan/closure.
  6. Apr 09, 2016
    • [SPARK-14451][SQL] Move encoder definition into Aggregator interface · 520dde48
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      When we first introduced Aggregators, we required the user of an Aggregator to (implicitly) specify the encoders. It actually makes more sense to have the encoders specified by the implementation of the Aggregator, since each implementation knows best how to encode its own data type.
      
      Note that this simplifies the Java API because Java users no longer need to explicitly specify encoders for aggregators.
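      
      For illustration, a minimal aggregator under the new interface looks roughly like this (the encoders are now members of the implementation itself):
      
      ```scala
      import org.apache.spark.sql.{Encoder, Encoders}
      import org.apache.spark.sql.expressions.Aggregator
      
      object SumLong extends Aggregator[Long, Long, Long] {
        def zero: Long = 0L
        def reduce(buffer: Long, value: Long): Long = buffer + value
        def merge(b1: Long, b2: Long): Long = b1 + b2
        def finish(reduction: Long): Long = reduction
        // The implementation, not the caller, supplies the encoders.
        def bufferEncoder: Encoder[Long] = Encoders.scalaLong
        def outputEncoder: Encoder[Long] = Encoders.scalaLong
      }
      ```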
      
      ## How was this patch tested?
      Updated unit tests.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #12231 from rxin/SPARK-14451.
  7. Apr 06, 2016
    • [SPARK-14134][CORE] Change the package name used for shading classes. · 21d5ca12
      Marcelo Vanzin authored
      The current package name uses a dash, which is a little weird but seemed
      to work. That is, until a new test tried to mock a class that references
      one of those shaded types, and then things started failing.
      
      Most changes are just noise to fix the logging configs.
      
      For reference, SPARK-8815 also raised this issue, although at the time it
      did not cause any issues in Spark, so it was not addressed.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #11941 from vanzin/SPARK-14134.
    • [SPARK-14446][TESTS] Fix ReplSuite for Scala 2.10. · 4901086f
      Marcelo Vanzin authored
      Just use the same test code as the 2.11 version, which seems to pass.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #12223 from vanzin/SPARK-14446.
  8. Apr 02, 2016
    • [MINOR][DOCS] Use multi-line JavaDoc comments in Scala code. · 4a6e78ab
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This PR converts all Scala-style multiline comments into Java-style multiline comments in the Scala code.
      (All comment-only changes, over 77 files: +786 lines, −747 lines.)
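      
      For reference, the two comment styles differ only in how continuation lines are indented (a minimal illustration):
      
      ```scala
      /** Scala-style (before): the continuation '*' is indented two spaces,
        * aligned under the second '*' of the opener.
        */
      
      /** Java-style (after): the continuation '*' is indented one space,
       * aligned under the first '*' of the opener.
       */
      ```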
      
      ## How was this patch tested?
      
      Manual.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #12130 from dongjoon-hyun/use_multiine_javadoc_comments.
  9. Mar 28, 2016
    • [SPARK-14102][CORE] Block `reset` command in SparkShell · b66aa900
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      Spark Shell provides an easy way to use Spark in a Scala environment. This PR adds the `reset` command to a blocked list and also cleans the code up according to the Scala coding style.
      ```scala
      scala> sc
      res0: org.apache.spark.SparkContext = org.apache.spark.SparkContext@718fad24
      scala> :reset
      scala> sc
      <console>:11: error: not found: value sc
             sc
             ^
      ```
      If we block `reset`, Spark Shell behaves as follows.
      ```scala
      scala> :reset
      reset: no such command.  Type :help for help.
      scala> :re
      re is ambiguous: did you mean :replay or :require?
      ```
      
      ## How was this patch tested?
      
      Manual. Run `bin/spark-shell` and type `:reset`.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11920 from dongjoon-hyun/SPARK-14102.
  10. Mar 21, 2016
    • [SPARK-13456][SQL] fix creating encoders for case classes defined in Spark shell · 43ebf7a9
      Wenchen Fan authored
      ## What changes were proposed in this pull request?
      
      Case classes defined in the REPL are wrapped by line classes, and we had a trick for the Scala 2.10 REPL to automatically register the wrapper classes with `OuterScope` so that we could use them when creating encoders.
      However, this trick no longer works after the upgrade to Scala 2.11, and unfortunately the tests only existed for Scala 2.10, which kept this bug hidden until now.
      
      This PR moves the encoder tests to the Scala 2.11 `ReplSuite`, and fixes the bug with a different approach (the previous trick can't be ported to the Scala 2.11 REPL): make `OuterScope` smart enough to detect classes defined in the REPL and load the singletons of the line wrapper classes automatically.
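      
      With the fix, the canonical failing scenario works in the shell, along these lines (a sketch assuming a 2.0 shell where `spark` and its implicits are predefined):
      
      ```scala
      // Point is compiled into a synthetic line wrapper class by the REPL.
      case class Point(x: Int, y: Int)
      
      // Creating the encoder used to fail because the wrapper's outer instance
      // could not be found; OuterScope now resolves it automatically.
      val ds = spark.createDataset(Seq(Point(1, 2), Point(3, 4)))
      ds.map(p => p.x + p.y).collect()  // Array(3, 7)
      ```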
      
      ## How was this patch tested?
      
      the migrated encoder tests in `ReplSuite`
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #11410 from cloud-fan/repl.
  11. Mar 14, 2016
    • [SPARK-13626][CORE] Avoid duplicate config deprecation warnings. · 8301fadd
      Marcelo Vanzin authored
      Three different things were needed to get rid of spurious warnings:
      - silence deprecation warnings when cloning configuration
      - change the way SparkHadoopUtil instantiates SparkConf to silence
        warnings
      - avoid creating new SparkConf instances where it's not needed.
      
      On top of that, I changed the way that Logging.scala detects the repl;
      now it uses a method that is overridden in the repl's Main class, and
      the hack in Utils.scala is no longer needed. This makes the 2.11 repl
      behave like the 2.10 one and sets the default log level to WARN, which
      is a lot better. Previously, this wasn't working because the 2.11 repl
      triggers log initialization earlier than the 2.10 one.
      
      I also removed and simplified some other code in the 2.11 repl's Main
      to avoid replicating logic that already exists elsewhere in Spark.
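      
      A sketch of the override-based detection (the names here are illustrative, not the exact Spark internals):
      
      ```scala
      trait ReplAwareLogging {
        // The base definition is a plain method returning false; the repl's
        // Main overrides it, so no string-matching hack is needed.
        protected def isReplMain: Boolean = false
        protected def defaultRootLogLevel: String = if (isReplMain) "WARN" else "INFO"
      }
      ```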
      
      Tested the 2.11 repl in local and yarn modes.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #11510 from vanzin/SPARK-13626.
  12. Mar 10, 2016
    • [SPARK-3854][BUILD] Scala style: require spaces before `{`. · 91fed8e9
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      Since the opening curly brace, '{', has many usages as discussed in [SPARK-3854](https://issues.apache.org/jira/browse/SPARK-3854), this PR adds a ScalaStyle rule that forbids the '){' pattern for the majority case shown below, and fixes the code accordingly. Enforcing this in ScalaStyle from now on will improve Scala code quality and reduce review time.
      ```
      // Correct:
      if (true) {
        println("Wow!")
      }
      
      // Incorrect:
      if (true){
         println("Wow!")
      }
      ```
      IntelliJ also shows new warnings based on this.
      
      ## How was this patch tested?
      
      Pass the Jenkins ScalaStyle test.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11637 from dongjoon-hyun/SPARK-3854.
  13. Mar 03, 2016
    • [MINOR] Fix typos in comments and testcase name of code · 941b270b
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This PR fixes typos in code comments and in a test case name.
      
      ## How was this patch tested?
      
      Manual.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11481 from dongjoon-hyun/minor_fix_typos_in_code.
    • [SPARK-13583][CORE][STREAMING] Remove unused imports and add checkstyle rule · b5f02d67
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      After SPARK-6990, `dev/lint-java` keeps the Java code healthy and helps PR review by saving a lot of time.
      This issue removes unused imports from the Java/Scala code and adds an `UnusedImports` checkstyle rule to help developers.
      
      ## How was this patch tested?
      ```
      ./dev/lint-java
      ./build/sbt compile
      ```
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11438 from dongjoon-hyun/SPARK-13583.
  14. Feb 09, 2016
    • [SPARK-13086][SHELL] Use the Scala REPL settings, to enable things like `-i file`. · e30121af
      Iulian Dragos authored
      Now:
      
      ```
      $ bin/spark-shell -i test.scala
      NOTE: SPARK_PREPEND_CLASSES is set, placing locally compiled Spark classes ahead of assembly.
      Setting default log level to "WARN".
      To adjust logging level use sc.setLogLevel(newLevel).
      16/01/29 17:37:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      16/01/29 17:37:39 INFO Main: Created spark context..
      Spark context available as sc (master = local[*], app id = local-1454085459000).
      16/01/29 17:37:39 INFO Main: Created sql context..
      SQL context available as sqlContext.
      Loading test.scala...
      hello
      
      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /___/ .__/\_,_/_/ /_/\_\   version 2.0.0-SNAPSHOT
            /_/
      
      Using Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
      Type in expressions to have them evaluated.
      Type :help for more information.
      ```
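      
      For reference, the preloaded file in the transcript could be as small as this (a hypothetical `test.scala` matching the "hello" output above):
      
      ```scala
      // test.scala, loaded at startup via: bin/spark-shell -i test.scala
      println("hello")
      ```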
      
      Author: Iulian Dragos <jaguarul@gmail.com>
      
      Closes #10984 from dragos/issue/repl-eval-file.
  15. Jan 30, 2016
    • [SPARK-6363][BUILD] Make Scala 2.11 the default Scala version · 289373b2
      Josh Rosen authored
      This patch changes Spark's build to make Scala 2.11 the default Scala version. To be clear, this does not mean that Spark will stop supporting Scala 2.10: users will still be able to compile Spark for Scala 2.10 by following the instructions on the "Building Spark" page; however, it does mean that Scala 2.11 will be the default Scala version used by our CI builds (including pull request builds).
      
      The Scala 2.11 compiler is faster than 2.10, so I think we'll be able to look forward to a slight speedup in our CI builds (it looks like it's about 2X faster for the Maven compile-only builds, for instance).
      
      After this patch is merged, I'll update Jenkins to add new compile-only jobs to ensure that Scala 2.10 compilation doesn't break.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #10608 from JoshRosen/SPARK-6363.
  16. Dec 24, 2015
    • [SPARK-12311][CORE] Restore previous value of "os.arch" property in test suites after forcing to set specific value to "os.arch" property · 39204661
      Kazuaki Ishizaki authored
      
      Restore the original value of the "os.arch" property after each test.
      
      Since some tests force a specific value for the "os.arch" property, we need to restore the original value afterwards.
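      
      A minimal sketch of the save-and-restore pattern (a hypothetical helper, not the suites' exact code):
      
      ```scala
      def withSystemProperty[T](key: String, value: String)(body: => T): T = {
        val original = System.getProperty(key)
        System.setProperty(key, value)
        try body
        finally {
          // Restore exactly what was there before, including "not set".
          if (original == null) System.clearProperty(key)
          else System.setProperty(key, original)
        }
      }
      
      // usage: withSystemProperty("os.arch", "arm") { runArchSpecificTest() }
      ```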
      
      Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
      
      Closes #10289 from kiszk/SPARK-12311.
  17. Dec 18, 2015
    • [SPARK-12350][CORE] Don't log errors when requested stream is not found. · 27828182
      Marcelo Vanzin authored
      If a client requests a non-existent stream, just send a failure message
      back, without logging any error on the server side (since it's not a
      server error).
      
      On the executor side, avoid error logs by translating any errors during
      transfer to a `ClassNotFoundException`, so that loading the class is
      retried on the parent class loader. This can mask IO errors during
      transmission, but the most common cause is that the class is not
      served by the remote end.
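      
      A sketch of the executor-side translation (illustrative names, not the exact patch):
      
      ```scala
      // Any failure while fetching class bytes is rethrown as ClassNotFoundException,
      // so class loading falls through to the parent class loader instead of
      // surfacing a transfer error in the logs.
      def fetchClassBytes(name: String, fetch: String => Array[Byte]): Array[Byte] =
        try fetch(name)
        catch {
          case e: Exception => throw new ClassNotFoundException(name, e)
        }
      ```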
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #10337 from vanzin/SPARK-12350.
  18. Dec 10, 2015
    • [SPARK-11563][CORE][REPL] Use RpcEnv to transfer REPL-generated classes. · 4a46b885
      Marcelo Vanzin authored
      This avoids bringing up yet another HTTP server on the driver, and
      instead reuses the file server already managed by the driver's
      RpcEnv. As a bonus, the repl now inherits the security features of
      the network library.
      
      There's also a small change to create the directory for storing classes
      under the root temp dir for the application (instead of directly
      under java.io.tmpdir).
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #9923 from vanzin/SPARK-11563.
    • [SPARK-11832][CORE] Process arguments in spark-shell for Scala 2.11 · db516524
      Jakob Odersky authored
      Process arguments passed to the spark-shell. Fixes running the spark-shell from within a build environment.
      
      Author: Jakob Odersky <jodersky@gmail.com>
      
      Closes #9824 from jodersky/shell-2.11.
  19. Nov 24, 2015
    • [SPARK-11929][CORE] Make the repl log4j configuration override the root logger. · e6dd2374
      Marcelo Vanzin authored
      In the default Spark distribution, there are currently two separate
      log4j config files, with different default values for the root logger,
      so that when running the shell you have a different default log level.
      This makes the shell more usable, since the logs don't overwhelm the
      output.
      
      But if you install a custom log4j.properties, you lose that, because
      then it's going to be used no matter whether you're running a regular
      app or the shell.
      
      With this change, the overriding of the log level is done differently;
      the log level configured for the repl's main class (org.apache.spark.repl.Main)
      is used to define the root logger's level when running the shell, defaulting
      to WARN if it's not set explicitly.
      
      On a somewhat related change, the shell output about the "sc" variable
      was changed a bit to contain a little more useful information about
      the application, since when the root logger's log level is WARN, that
      information is never shown to the user.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #9816 from vanzin/shell-logging.
    • [SPARK-11818][REPL] Fix ExecutorClassLoader to lookup resources from parent class loader · be9dd155
      Jungtaek Lim authored
      
      Without the patch, two additional tests of ExecutorClassLoaderSuite fail:
      
      - "resource from parent"
      - "resources from parent"
      
      A detailed explanation is here: https://issues.apache.org/jira/browse/SPARK-11818?focusedCommentId=15011202&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15011202
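      
      A sketch of the delegation the fix introduces (simplified, not the actual ExecutorClassLoader code):
      
      ```scala
      import java.net.URL
      
      // Resource lookups are handed straight to the parent loader, since the
      // executor-side loader itself only serves repl-generated classes.
      class DelegatingLoaderSketch(parent: ClassLoader) extends ClassLoader(parent) {
        override def getResource(name: String): URL = parent.getResource(name)
        override def getResources(name: String): java.util.Enumeration[URL] =
          parent.getResources(name)
      }
      ```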
      
      Author: Jungtaek Lim <kabhwan@gmail.com>
      
      Closes #9812 from HeartSaVioR/SPARK-11818.