  1. Jan 28, 2014
    • Merge pull request #523 from JoshRosen/SPARK-1043 · f8c742ce
      Josh Rosen authored
      Switch from MUTF8 to UTF8 in PySpark serializers.
      
      This fixes SPARK-1043, a bug introduced in 0.9.0 where PySpark couldn't serialize strings > 64kB.
      
      This fix was written by @tyro89 and @bouk in #512. This commit squashes and rebases their pull request in order to fix some merge conflicts.
    • Switch from MUTF8 to UTF8 in PySpark serializers. · 1381fc72
      Josh Rosen authored
      This fixes SPARK-1043, a bug introduced in 0.9.0
      where PySpark couldn't serialize strings > 64kB.
      
      This fix was written by @tyro89 and @bouk in #512.
      This commit squashes and rebases their pull request
      in order to fix some merge conflicts.
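The 64 kB limit mentioned above follows from how Java's `DataOutputStream.writeUTF` frames a string: it writes a 2-byte unsigned length before the modified UTF-8 bytes, so anything longer than 65535 encoded bytes is rejected. The sketch below reproduces that failure and contrasts it with a 4-byte length prefix followed by plain UTF-8 bytes. The object and method names are hypothetical, and the 4-byte framing is an assumption made for illustration, not a copy of the code in the commit.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream, UTFDataFormatException}
import java.nio.charset.StandardCharsets

object Utf8LengthPrefixSketch {
  // writeUTF stores the encoded length in an unsigned 16-bit prefix, so it
  // throws for any string whose modified UTF-8 form exceeds 65535 bytes.
  def writeModifiedUtf8(s: String): Either[String, Array[Byte]] = {
    val buffer = new ByteArrayOutputStream()
    val out = new DataOutputStream(buffer)
    try {
      out.writeUTF(s)
      Right(buffer.toByteArray)
    } catch {
      case e: UTFDataFormatException => Left(String.valueOf(e.getMessage))
    }
  }

  // Assumed alternative framing: a 4-byte length, then standard UTF-8 bytes.
  def writeLengthPrefixedUtf8(s: String): Array[Byte] = {
    val bytes = s.getBytes(StandardCharsets.UTF_8)
    val buffer = new ByteArrayOutputStream()
    val out = new DataOutputStream(buffer)
    out.writeInt(bytes.length)
    out.write(bytes)
    out.flush()
    buffer.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val big = "a" * 70000
    println(writeModifiedUtf8(big).isLeft)        // true: over the 16-bit limit
    println(writeLengthPrefixedUtf8(big).length)  // 70004: 4-byte prefix + payload
  }
}
```

A reader on the other side of such a stream would need to read the 4-byte length first and then decode exactly that many bytes as UTF-8.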
  2. Jan 27, 2014
    • Merge pull request #466 from liyinan926/file-overwrite-new · 84670f27
      Reynold Xin authored
      Allow files added through SparkContext.addFile() to be overwritten
      
      This is useful for the cases when a file needs to be refreshed and downloaded by the executors periodically. For example, a possible use case is: the driver periodically renews a Hadoop delegation token and writes it to a token file. The token file needs to be downloaded by the executors whenever it gets renewed. However, the current implementation throws an exception when the target file exists and its contents do not match those of the new source. This PR adds an option to allow files to be overwritten to support use cases similar to the above.
    • Merge pull request #516 from sarutak/master · 3d5c03e2
      Reynold Xin authored
      modified SparkPluginBuild.scala to use https protocol for accessing gith...
      
      Spark cannot be built behind a proxy, even when sbt is run with the -Dhttp(s).proxyHost, -Dhttp(s).proxyPort, -Dhttp(s).proxyUser, and -Dhttp(s).proxyPassword options.
      The cause is that junit_xml_listener.git is cloned over the git protocol.
      The build succeeds after modifying SparkPluginBuild.scala.
      
      I reported this issue to JIRA.
      https://spark-project.atlassian.net/browse/SPARK-1046
    • Merge pull request #490 from hsaputra/modify_checkoption_with_isdefined · f16c21e2
      Reynold Xin authored
      Replace the check for None Option with isDefined and isEmpty in Scala code
      
      Propose to replace the Scala check for Option "!= None" with Option.isDefined and "=== None" with Option.isEmpty.
      
      I think using a method call where possible, rather than an operator function plus an argument, will make the Scala code easier to read and understand.
      
      Compiles and passes tests.
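As a small illustration of the style change described above, the sketch below puts the comparison form next to the method form on the same `Option` values. The object name, the `describe` helper, and the port value are hypothetical and exist only to make the example runnable.

```scala
object OptionCheckSketch {
  // Method style: ask the Option directly instead of comparing it to None.
  def describe(port: Option[Int]): String =
    if (port.isDefined) s"port ${port.get}" else "no port"

  def main(args: Array[String]): Unit = {
    val configured: Option[Int] = Some(7077)
    val missing: Option[Int] = None

    // Comparison style that the commit proposes to replace.
    println(configured != None)   // true
    println(missing == None)      // true

    // Equivalent method style.
    println(configured.isDefined) // true
    println(missing.isEmpty)      // true

    println(describe(configured)) // port 7077
    println(describe(missing))    // no port
  }
}
```

Both forms give the same answers; the method form simply reads as a question put to the value.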
    • Merge pull request #460 from srowen/RandomInitialALSVectors · f67ce3e2
      Sean Owen authored
      Choose initial user/item vectors uniformly on the unit sphere
      
      ...rather than within the unit square to possibly avoid bias in the initial state and improve convergence.
      
      The current implementation picks the N vector elements uniformly at random from [0,1). This means they all point into one quadrant of the vector space. As N gets just a little large, the vectors tend strongly to point into the "corner", towards (1,1,1...,1). The vectors are not unit vectors either.
      
      I suggest choosing the elements as Gaussian ~ N(0,1) and normalizing. This gets you uniform random choices on the unit sphere which is more what's of interest here. It has worked a little better for me in the past.
      
      This is pretty minor but wanted to warm up suggesting a few tweaks to ALS.
      Please excuse my Scala, pretty new to it.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      == Merge branch commits ==
      
      commit 492b13a7469e5a4ed7591ee8e56d8bd7570dfab6
      Author: Sean Owen <sowen@cloudera.com>
      Date:   Mon Jan 27 08:05:25 2014 +0000
      
          Style: spaces around binary operators
      
      commit ce2b5b5a4fefa0356875701f668f01f02ba4d87e
      Author: Sean Owen <sowen@cloudera.com>
      Date:   Sun Jan 19 22:50:03 2014 +0000
      
          Generate factors with all positive components, per discussion in https://github.com/apache/incubator-spark/pull/460
      
      commit b6f7a8a61643a8209e8bc662e8e81f2d15c710c7
      Author: Sean Owen <sowen@cloudera.com>
      Date:   Sat Jan 18 15:54:42 2014 +0000
      
          Choose initial user/item vectors uniformly on the unit sphere rather than within the unit square to possibly avoid bias in the initial state and improve convergence
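The sampling idea in this merge can be sketched as follows: draw every component from N(0,1), then divide by the Euclidean norm so the vector lies on the unit sphere. The second helper takes the absolute value of each component, which is one way to read the "all positive components" follow-up commit. The object and method names are hypothetical, and this is a minimal sketch under those assumptions rather than the code merged into ALS.

```scala
import scala.util.Random

object UnitSphereInitSketch {
  // Draw each component from N(0, 1), then scale the vector to unit length.
  // Normalized Gaussian draws are uniformly distributed on the unit sphere.
  def randomUnitVector(rank: Int, rand: Random): Array[Double] = {
    val v = Array.fill(rank)(rand.nextGaussian())
    val norm = math.sqrt(v.map(x => x * x).sum)
    v.map(_ / norm)
  }

  // Assumed variant: flip every component to be non-negative.
  // Taking absolute values does not change the norm.
  def randomNonNegativeUnitVector(rank: Int, rand: Random): Array[Double] =
    randomUnitVector(rank, rand).map(x => math.abs(x))

  def main(args: Array[String]): Unit = {
    val rand = new Random(42L)
    val v = randomNonNegativeUnitVector(10, rand)
    println(v.mkString(", "))
    println(math.sqrt(v.map(x => x * x).sum)) // close to 1.0
  }
}
```

Passing the `Random` instance in explicitly keeps the helpers deterministic under a fixed seed, which is convenient for testing.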
  3. Jan 26, 2014
  4. Jan 25, 2014
    • Fix ClassCastException in JavaPairRDD.collectAsMap() (SPARK-1040) · 740e865f
      Josh Rosen authored
      This fixes an issue where collectAsMap() could
      fail when called on a JavaPairRDD that was derived
      by transforming a non-JavaPairRDD.
      
      The root problem was that we were creating the
      JavaPairRDD's ClassTag by casting a
      ClassTag[AnyRef] to a ClassTag[Tuple2[K2, V2]].
      To fix this, I cast a ClassTag[Tuple2[_, _]]
      instead, since this actually produces a ClassTag
      of the appropriate type because ClassTags don't
      capture type parameters:
      
      scala> implicitly[ClassTag[Tuple2[_, _]]] == implicitly[ClassTag[Tuple2[Int, Int]]]
      res8: Boolean = true
      
      scala> implicitly[ClassTag[AnyRef]].asInstanceOf[ClassTag[Tuple2[Int, Int]]] == implicitly[ClassTag[Tuple2[Int, Int]]]
      res9: Boolean = false
    • Increase JUnit test verbosity under SBT. · 531d9d75
      Josh Rosen authored
      Upgrade junit-interface plugin from 0.9 to 0.10.
      
      I noticed that the JavaAPISuite tests didn't
      appear to display any output locally or under
      Jenkins, making it difficult to know whether they
      were running.  This change increases the verbosity
      to more closely match the ScalaTest tests.
  5. Jan 23, 2014
  6. Jan 22, 2014