  1. Jun 21, 2014
  2. Jun 12, 2014
    • [Minor] Fix style, formatting and naming in BlockManager etc. · 44daec5a
      Andrew Or authored
      This is a precursor to a bigger change. I wanted to separate out the relatively insignificant changes so the ultimate PR is not inflated.
      
      (Warning: this PR is full of unimportant nitpicks)
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #1058 from andrewor14/bm-minor and squashes the following commits:
      
      8e12eaf [Andrew Or] SparkException -> BlockException
      c36fd53 [Andrew Or] Make parts of BlockManager more readable
      0a5f378 [Andrew Or] Entry -> MemoryEntry
      e9762a5 [Andrew Or] Tone down string interpolation (minor reverts)
      c4de9ac [Andrew Or] Merge branch 'master' of github.com:apache/spark into bm-minor
      b3470f1 [Andrew Or] More string interpolation (minor)
      7f9dcab [Andrew Or] Use string interpolation (minor)
      94a425b [Andrew Or] Refactor against duplicate code + minor changes
      8a6a7dc [Andrew Or] Exception -> SparkException
      97c410f [Andrew Or] Deal with MIMA excludes
      2480f1d [Andrew Or] Fixes in StorageLevel.scala
      abb0163 [Andrew Or] Style, formatting and naming fixes
      44daec5a
    • SPARK-554. Add aggregateByKey. · ce92a9c1
      Sandy Ryza authored
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #705 from sryza/sandy-spark-554 and squashes the following commits:
      
      2302b8f [Sandy Ryza] Add MIMA exclude
      f52e0ad [Sandy Ryza] Fix Python tests for real
      2f3afa3 [Sandy Ryza] Fix Python test
      0b735e9 [Sandy Ryza] Fix line lengths
      ae56746 [Sandy Ryza] Fix doc (replace T with V)
      c2be415 [Sandy Ryza] Java and Python aggregateByKey
      23bf400 [Sandy Ryza] SPARK-554.  Add aggregateByKey.
      ce92a9c1
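The semantics of `aggregateByKey` can be sketched in plain Python, without Spark. This is only an illustration: the function name `aggregate_by_key` and the list-of-lists representation of partitions are assumptions made for this sketch, not the code added by the commit. A zero value is folded with each value of a key inside a partition (`seq_op`), and the per-partition results are then merged across partitions (`comb_op`).

```python
import copy


def aggregate_by_key(partitions, zero_value, seq_op, comb_op):
    """Hypothetical stand-in for aggregateByKey.

    partitions: list of partitions, each a list of (key, value) pairs.
    seq_op(acc, value) folds one value into a per-partition accumulator.
    comb_op(acc1, acc2) merges accumulators from different partitions.
    """
    per_partition = []
    for part in partitions:
        acc = {}
        for key, value in part:
            if key not in acc:
                # Each key starts from its own copy of the zero value.
                acc[key] = copy.deepcopy(zero_value)
            acc[key] = seq_op(acc[key], value)
        per_partition.append(acc)

    merged = {}
    for acc in per_partition:
        for key, partial in acc.items():
            if key in merged:
                merged[key] = comb_op(merged[key], partial)
            else:
                merged[key] = partial
    return merged


# Example: compute (sum, count) per key; note the result type differs
# from the value type, which is the point of having a separate zero value.
data = [[("a", 1), ("b", 2)], [("a", 3)]]
sum_count = aggregate_by_key(
    data,
    (0, 0),
    lambda acc, v: (acc[0] + v, acc[1] + 1),
    lambda x, y: (x[0] + y[0], x[1] + y[1]),
)
```

Here `sum_count["a"]` combines the partial results from both partitions, while `"b"` only ever sees `seq_op`.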
  3. Jun 11, 2014
    • [SPARK-1672][MLLIB] Separate user and product partitioning in ALS · d9203350
      Tor Myklebust authored
      Some clean up work following #593.
      
      1. Allow setting different numbers of user blocks and product blocks in `ALS`.
      2. Update `MovieLensALS` to reflect the change.
      
      Author: Tor Myklebust <tmyklebu@gmail.com>
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #1014 from mengxr/SPARK-1672 and squashes the following commits:
      
      0e910dd [Xiangrui Meng] change private[this] to private[recommendation]
      36420c7 [Xiangrui Meng] set exclusion rules for ALS
      9128b77 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-1672
      294efe9 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-1672
      9bab77b [Xiangrui Meng] clean up add numUserBlocks and numProductBlocks to MovieLensALS
      84c8e8c [Xiangrui Meng] Merge branch 'master' into SPARK-1672
      d17a8bf [Xiangrui Meng] merge master
      a4925fd [Tor Myklebust] Style.
      bd8a75c [Tor Myklebust] Merge branch 'master' of github.com:apache/spark into alsseppar
      021f54b [Tor Myklebust] Separate user and product blocks.
      dcf583a [Tor Myklebust] Remove the partitioner member variable; instead, thread that needle everywhere it needs to go.
      23d6f91 [Tor Myklebust] Stop making the partitioner configurable.
      495784f [Tor Myklebust] Merge branch 'master' of https://github.com/apache/spark
      674933a [Tor Myklebust] Fix style.
      40edc23 [Tor Myklebust] Fix missing space.
      f841345 [Tor Myklebust] Fix daft bug creating 'pairs', also for -> foreach.
      5ec9e6c [Tor Myklebust] Clean a couple of things up using 'map'.
      36a0f43 [Tor Myklebust] Make the partitioner private.
      d872b09 [Tor Myklebust] Add negative id ALS test.
      df27697 [Tor Myklebust] Support custom partitioners.  Currently we use the same partitioner for users and products.
      c90b6d8 [Tor Myklebust] Scramble user and product ids before bucketing.
      c774d7d [Tor Myklebust] Make the partitioner a member variable and use it instead of modding directly.
      d9203350
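The idea of separate user and product partitioning, together with scrambling ids before bucketing, can be sketched as follows. This is a rough illustration under stated assumptions: the bit mixer in `scramble`, and the names `block_of` and `partition_ratings`, are hypothetical and are not taken from the ALS implementation.

```python
def scramble(x):
    """Hypothetical 32-bit mixer: spreads nearby ids across blocks."""
    x &= 0xFFFFFFFF  # also maps negative ids to a non-negative value
    x = (((x >> 16) ^ x) * 0x45D9F3B) & 0xFFFFFFFF
    x = (((x >> 16) ^ x) * 0x45D9F3B) & 0xFFFFFFFF
    return (x >> 16) ^ x


def block_of(entity_id, num_blocks):
    """Bucket an id into one of num_blocks blocks after scrambling."""
    return scramble(entity_id) % num_blocks


def partition_ratings(ratings, num_user_blocks, num_product_blocks):
    """Group (user, product, rating) triples twice: once by user block
    and once by product block, with independent block counts."""
    user_blocks = [[] for _ in range(num_user_blocks)]
    product_blocks = [[] for _ in range(num_product_blocks)]
    for user, product, rating in ratings:
        triple = (user, product, rating)
        user_blocks[block_of(user, num_user_blocks)].append(triple)
        product_blocks[block_of(product, num_product_blocks)].append(triple)
    return user_blocks, product_blocks


ratings = [(1, 10, 4.0), (1, 11, 3.5), (-2, 10, 5.0), (7, -3, 1.0)]
user_blocks, product_blocks = partition_ratings(ratings, 3, 5)
```

Using different block counts for the two sides is useful when, for example, there are far more users than products.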
  4. Jun 04, 2014
    • [SPARK-1817] RDD.zip() should verify partition sizes for each partition · c402a4a6
      Kan Zhang authored
      RDD.zip() will throw an exception if it finds partition sizes are not the same.
      
      Author: Kan Zhang <kzhang@apache.org>
      
      Closes #944 from kanzhang/SPARK-1817 and squashes the following commits:
      
      c073848 [Kan Zhang] [SPARK-1817] Cosmetic updates
      524c670 [Kan Zhang] [SPARK-1817] RDD.zip() should verify partition sizes for each partition
      c402a4a6
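The per-partition size check can be sketched in plain Python. This is a minimal illustration, assuming partitions are modeled as lists of iterables; `zip_partitions` is a hypothetical name, and the exception type used by Spark itself is not shown here.

```python
def zip_partitions(left, right):
    """Zip two partitioned datasets partition by partition.

    Raises ValueError if the partition counts differ, or if any pair of
    corresponding partitions holds a different number of elements.
    """
    if len(left) != len(right):
        raise ValueError("datasets have different numbers of partitions")

    exhausted = object()  # sentinel marking the end of an iterator
    result = []
    for index, (left_part, right_part) in enumerate(zip(left, right)):
        left_iter, right_iter = iter(left_part), iter(right_part)
        pairs = []
        while True:
            a = next(left_iter, exhausted)
            b = next(right_iter, exhausted)
            if a is exhausted and b is exhausted:
                break
            if a is exhausted or b is exhausted:
                # One side ran out first: the sizes do not match.
                raise ValueError(
                    "partition %d: sizes differ, cannot zip" % index
                )
            pairs.append((a, b))
        result.append(pairs)
    return result


zipped = zip_partitions([[1, 2], [3]], [["a", "b"], ["c"]])
```

A plain `zip` would silently truncate to the shorter side, which is the behavior the check is meant to rule out.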
  5. Jun 03, 2014
    • SPARK-1941: Update streamlib to 2.7.0 and use HyperLogLogPlus instead of HyperLogLog. · 1faef149
      Reynold Xin authored
      I also corrected some errors in the previous HLL approximate count API, including the fact that relativeSD wasn't really a measure of error (even though we used it to test error bounds in test results).
      
      Author: Reynold Xin <rxin@apache.org>
      
      Closes #897 from rxin/hll and squashes the following commits:
      
      4d83f41 [Reynold Xin] New error bound and non-randomness.
      f154ea0 [Reynold Xin] Added a comment on the value bound for testing.
      e367527 [Reynold Xin] One more round of code review.
      41e649a [Reynold Xin] Update final mima list.
      9e320c8 [Reynold Xin] Incorporate code review feedback.
      e110d70 [Reynold Xin] Merge branch 'master' into hll
      354deb8 [Reynold Xin] Added comment on the Mima exclude rules.
      acaa524 [Reynold Xin] Added the right exclude rules in MimaExcludes.
      6555bfe [Reynold Xin] Added a default method and re-arranged MimaExcludes.
      1db1522 [Reynold Xin] Excluded util.SerializableHyperLogLog from MIMA check.
      9221b27 [Reynold Xin] Merge branch 'master' into hll
      88cfe77 [Reynold Xin] Updated documentation and restored the old incorrect API to maintain API compatibility.
      1294be6 [Reynold Xin] Updated HLL+.
      e7786cb [Reynold Xin] Merge branch 'master' into hll
      c0ef0c2 [Reynold Xin] SPARK-1941: Update streamlib to 2.7.0 and use HyperLogLogPlus instead of HyperLogLog.
      1faef149
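For context on how a relative standard deviation relates to HyperLogLog's size, the textbook bound is a standard error of roughly 1.04 / sqrt(m), where m = 2^p is the number of registers. The sketch below picks the smallest precision p that meets a target; it is an illustration of that textbook relation only, not the formula used in this commit, and the names `hll_standard_error` and `precision_for` as well as the lower limit of 4 bits are assumptions.

```python
import math


def hll_standard_error(p):
    """Textbook HyperLogLog standard error for m = 2**p registers."""
    return 1.04 / math.sqrt(2 ** p)


def precision_for(relative_sd, min_precision=4):
    """Smallest p whose textbook standard error is at most relative_sd.

    min_precision is an assumed lower limit for this sketch.
    """
    if not 0 < relative_sd < 1:
        raise ValueError("relative_sd must be in (0, 1)")
    # Solve 1.04 / sqrt(2**p) <= relative_sd for p.
    p = math.ceil(2 * math.log2(1.04 / relative_sd))
    return max(p, min_precision)


p = precision_for(0.05)
```

Halving the target error costs two extra bits of precision, that is, four times as many registers.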
    • Synthetic GraphX Benchmark · 894ecde0
      Joseph E. Gonzalez authored
      This PR accomplishes two things:
      
      1. It introduces a Synthetic Benchmark application that generates an arbitrarily large log-normal graph and executes either PageRank or connected components on the graph. This can be used to profile the GraphX system on arbitrary clusters without access to large graph datasets.
      
      2. This PR improves the implementation of the log-normal graph generator.
      
      Author: Joseph E. Gonzalez <joseph.e.gonzalez@gmail.com>
      Author: Ankur Dave <ankurdave@gmail.com>
      
      Closes #720 from jegonzal/graphx_synth_benchmark and squashes the following commits:
      
      e40812a [Ankur Dave] Exclude all of GraphX from compatibility checks vs. 1.0.0
      bccccad [Ankur Dave] Fix long lines
      374678a [Ankur Dave] Bugfix and style changes
      1bdf39a [Joseph E. Gonzalez] updating options
      d943972 [Joseph E. Gonzalez] moving the benchmark application into the examples folder.
      f4f839a [Joseph E. Gonzalez] Creating a synthetic benchmark script.
      894ecde0
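A log-normal graph, in the sense of a graph whose out-degrees follow a log-normal distribution, can be sketched with the Python standard library. This is not the GraphX generator: the function name `log_normal_graph`, the default `mu` and `sigma`, the degree cap, and the uniform choice of targets are all assumptions made for illustration.

```python
import random


def log_normal_graph(num_vertices, mu=4.0, sigma=1.3, seed=None):
    """Return a list of (src, dst) edges with log-normal out-degrees.

    mu and sigma parameterize the underlying normal distribution; the
    defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    edges = []
    for src in range(num_vertices):
        # Sample the out-degree, capped so it never exceeds the vertex count.
        degree = min(int(rng.lognormvariate(mu, sigma)), num_vertices)
        for _ in range(degree):
            # Targets are drawn uniformly; self-loops and duplicates may occur.
            edges.append((src, rng.randrange(num_vertices)))
    return edges


edges = log_normal_graph(50, seed=42)
```

Because the graph is generated from a seed rather than loaded from disk, its size can be scaled to whatever the cluster under test can hold.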
  6. Jun 01, 2014
    • Better explanation for how to use MIMA excludes. · d17d2214
      Patrick Wendell authored
      This patch does a few things:
      1. We have a file MimaExcludes.scala exclusively for excludes.
      2. The test runner tells users about that file if a test fails.
      3. I've added back the excludes used from 0.9->1.0. We should keep
         these in the project as an official audit trail of times where
         we decided to make exceptions.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #937 from pwendell/mima and squashes the following commits:
      
      7ee0db2 [Patrick Wendell] Better explanation for how to use MIMA excludes.
      d17d2214