  1. Mar 06, 2014
    • SPARK-1187, Added missing Python APIs · 3d3acef0
      Prabin Banka authored
      The following Python APIs are added:
      RDD.id()
      SparkContext.setJobGroup()
      SparkContext.setLocalProperty()
      SparkContext.getLocalProperty()
      SparkContext.sparkUser()
      
      This was raised earlier as part of apache/incubator-spark#486.
      
      Author: Prabin Banka <prabin.banka@imaginea.com>
      
      Closes #75 from prabinb/python-api-backup and squashes the following commits:
      
      cc3c6cd [Prabin Banka] Added missing Python APIs
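
The APIs listed in this commit might be used together roughly as follows. This is a minimal sketch, not code from the commit: `tag_and_run`, the group ID, and the pool name are hypothetical, and `sc` is assumed to be an already running `SparkContext`.

```python
def tag_and_run(sc, data):
    """Run a small job under a named job group (hypothetical helper)."""
    # Group the jobs submitted from this thread under one ID and description.
    sc.setJobGroup("nightly-report", "Nightly report aggregation")
    # Local properties are per-thread settings passed along with submitted jobs.
    sc.setLocalProperty("spark.scheduler.pool", "reports")

    rdd = sc.parallelize(data)
    return {
        "rdd_id": rdd.id(),                                   # RDD.id()
        "pool": sc.getLocalProperty("spark.scheduler.pool"),  # read the property back
        "user": sc.sparkUser(),                               # user running the context
        "total": rdd.sum(),
    }
```

The helper takes the context as an argument so it can be exercised with any object that exposes the same methods.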
  2. Feb 20, 2014
    • SPARK-1114: Allow PySpark to use existing JVM and Gateway · 59b13795
      Ahir Reddy authored
      Patch to allow PySpark to use an existing JVM and Gateway. Changes the PySpark implementation of SparkConf to take an existing SparkConf JVM handle. Changes the PySpark SparkContext to allow subclass-specific context initialization.
      
      Author: Ahir Reddy <ahirreddy@gmail.com>
      
      Closes #622 from ahirreddy/pyspark-existing-jvm and squashes the following commits:
      
      a86f457 [Ahir Reddy] Patch to allow PySpark to use existing JVM and Gateway. Changes to PySpark implementation of SparkConf to take existing SparkConf JVM handle. Change to PySpark SparkContext to allow subclass specific context initialization.
  3. Jan 28, 2014
    • Switch from MUTF8 to UTF8 in PySpark serializers. · 1381fc72
      Josh Rosen authored
      This fixes SPARK-1043, a bug introduced in 0.9.0
      where PySpark couldn't serialize strings > 64kB.
      
      This fix was written by @tyro89 and @bouk in #512.
      This commit squashes and rebases their pull request
      in order to fix some merge conflicts.
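
The 64kB ceiling is consistent with a string framing that stores the byte length in a 2-byte prefix, as Java's `DataOutputStream.writeUTF` does. The sketch below illustrates only that length-prefix difference, not the encoding differences between modified UTF-8 and UTF-8. The function names are hypothetical, and the 4-byte framing is an assumption about the general shape of the fix rather than code taken from the commit.

```python
import io
import struct

def write_short_prefixed(s, stream):
    """Frame a string with a 2-byte unsigned length (caps out at 65535 bytes)."""
    data = s.encode("utf-8")
    if len(data) > 0xFFFF:
        raise ValueError("encoded string exceeds 65535 bytes")
    stream.write(struct.pack(">H", len(data)))
    stream.write(data)

def write_int_prefixed(s, stream):
    """Frame a string with a 4-byte signed big-endian length."""
    data = s.encode("utf-8")
    stream.write(struct.pack(">i", len(data)))
    stream.write(data)

def read_int_prefixed(stream):
    """Read back one string written by write_int_prefixed."""
    (length,) = struct.unpack(">i", stream.read(4))
    return stream.read(length).decode("utf-8")
```

With a 4-byte prefix the practical limit moves from tens of kilobytes to roughly 2 GB per string.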
  4. Jan 01, 2014
  5. Dec 30, 2013
  6. Dec 29, 2013
  7. Dec 28, 2013
  8. Dec 24, 2013
  9. Dec 18, 2013
  10. Nov 10, 2013
  11. Nov 03, 2013
  12. Oct 22, 2013
    • Pass self to SparkContext._ensure_initialized. · 317a9eb1
      Ewen Cheslack-Postava authored
      The constructor for SparkContext should pass in self so that we track
      the current context and produce errors if another one is created. Add
      a doctest to make sure creating multiple contexts triggers the
      exception.
    • Add classmethod to SparkContext to set system properties. · 56d230e6
      Ewen Cheslack-Postava authored
      Add a new classmethod to SparkContext to set system properties, as is
      possible in Scala/Java. Unlike the Java/Scala implementations, there's
      no access to System until the JVM bridge is created. Since
      SparkContext handles that, move the initialization of the JVM
      connection to a separate classmethod that can safely be called
      repeatedly as long as the same instance (or no instance) is provided.
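
The pattern described in this and the preceding commit can be sketched in plain Python. Everything here is illustrative: `Context`, `set_system_property`, and the dictionary standing in for the JVM bridge are hypothetical names, not the actual PySpark implementation.

```python
class Context:
    _gateway = None  # shared stand-in for the JVM bridge, created lazily
    _active = None   # the one live instance, if any

    def __init__(self):
        # Pass self so the class can track which instance is current.
        Context._ensure_initialized(self)

    @classmethod
    def _ensure_initialized(cls, instance=None):
        """Create the shared bridge once; register `instance` if one is given."""
        if cls._gateway is None:
            cls._gateway = {"properties": {}}
        if instance is not None:
            if cls._active is not None and cls._active is not instance:
                raise ValueError("Cannot run multiple contexts at once")
            cls._active = instance

    @classmethod
    def set_system_property(cls, key, value):
        """Set a property before any instance exists."""
        cls._ensure_initialized()  # no instance: only the bridge is set up
        cls._gateway["properties"][key] = value

    def stop(self):
        # Release the slot so a new context may be created.
        Context._active = None
```

Calling `_ensure_initialized` with no instance, or again with the same instance, is a no-op after the first call; only a second, different instance triggers the error.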
  13. Sep 08, 2013
  14. Sep 07, 2013
  15. Sep 06, 2013
  16. Sep 01, 2013
  17. Aug 16, 2013
  18. Jul 29, 2013
    • SPARK-815. Python parallelize() should split lists before batching · feba7ee5
      Matei Zaharia authored
      One unfortunate consequence of this fix is that we materialize any
      collections that are given to us as generators, but this seems necessary
      to get reasonable behavior on small collections. We could add a
      batchSize parameter later to bypass auto-computation of batch size if
      this becomes a problem (e.g. if users really want to parallelize big
      generators nicely).
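
To illustrate the ordering, here is a minimal sketch of splitting before batching. `split_then_batch`, `max_batch_size`, and the batch-size formula are assumptions made for illustration, not the code from this commit. Note the `list()` call, which is where a generator gets materialized.

```python
def split_then_batch(collection, num_slices, max_batch_size=1024):
    """Split items into num_slices parts first, then batch within each part."""
    items = list(collection)  # materializes generators so len() is known
    # Shrink the batch size for small inputs so one batch cannot
    # swallow the elements meant for several slices.
    batch_size = min(max(len(items) // num_slices, 1), max_batch_size)

    slices = []
    for i in range(num_slices):
        start = i * len(items) // num_slices
        end = (i + 1) * len(items) // num_slices
        part = items[start:end]
        batches = [part[j:j + batch_size]
                   for j in range(0, len(part), batch_size)]
        slices.append(batches)
    return slices
```

Batching first with a large fixed batch size would place a ten-element list into a single batch, leaving all but one slice empty; splitting first keeps the elements spread across slices.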
  19. Jul 16, 2013
  20. Feb 03, 2013
  21. Feb 01, 2013
  22. Jan 23, 2013
  23. Jan 22, 2013
  24. Jan 21, 2013
  25. Jan 20, 2013
  26. Jan 10, 2013