1. 21 Oct, 2016 1 commit
    • [SPARK-17960][PYSPARK][UPGRADE TO PY4J 0.10.4] · 595893d3
      Jagadeesan authored
      ## What changes were proposed in this pull request?
      
      1) Upgrade the Py4J version on the Java side
      2) Update the py4j src zip file we bundle with Spark
      
      ## How was this patch tested?
      
      Existing doctests & unit tests pass
      
      Author: Jagadeesan <as2@us.ibm.com>
      
      Closes #15514 from jagadeesanas2/SPARK-17960.
  2. 24 Aug, 2016 1 commit
    • [SPARK-16781][PYSPARK] java launched by PySpark as gateway may not be the same java used in the spark environment · 0b3a4be9
      Sean Owen authored
      
      ## What changes were proposed in this pull request?
      
      Update to py4j 0.10.3 to enable JAVA_HOME support
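
For context, JAVA_HOME support amounts to preferring the `java` binary under `$JAVA_HOME/bin` over whichever `java` comes first on the PATH. A minimal sketch of that lookup follows. The helper name `resolve_java_command` and the example path are assumptions made for illustration; this is not Py4J's actual code.

```python
import os


def resolve_java_command(env):
    """Return the java executable to launch, preferring JAVA_HOME.

    `env` is a mapping such as os.environ. Hypothetical helper for
    illustration only; it is not part of the Py4J or PySpark API.
    """
    java_home = env.get("JAVA_HOME")
    if java_home:
        # Use the JVM that the Spark environment is configured with.
        return os.path.join(java_home, "bin", "java")
    # Fall back to whichever java is found on the PATH.
    return "java"


print(resolve_java_command({"JAVA_HOME": "/opt/jdk8"}))
print(resolve_java_command({}))
```

Passing the environment in as a mapping keeps the lookup easy to test without touching the real process environment.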
      
      ## How was this patch tested?
      
      Pyspark tests
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #14748 from srowen/SPARK-16781.
  3. 04 Jun, 2016 1 commit
  4. 13 May, 2016 1 commit
  5. 08 Apr, 2016 1 commit
  6. 28 Mar, 2016 1 commit
    • [SPARK-13713][SQL] Migrate parser from ANTLR3 to ANTLR4 · 600c0b69
      Herman van Hovell authored
      ### What changes were proposed in this pull request?
      The current ANTLR3 parser is quite complex to maintain and suffers from code blow-ups. This PR introduces a new parser that is based on ANTLR4.
      
      This parser is based on [Presto's SQL parser](https://github.com/facebook/presto/blob/master/presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4). The current implementation can parse and create Catalyst and SQL plans. Large parts of the HiveQl DDL and some of the DML functionality are currently missing; the plan is to add them in follow-up PRs.
      
      This PR is a work in progress, and work needs to be done in the following areas:
      
      - [x] Error handling should be improved.
      - [x] Documentation should be improved.
      - [x] Multi-Insert needs to be tested.
      - [ ] Naming and package locations.
      
      ### How was this patch tested?
      
      Catalyst and SQL unit tests.
      
      Author: Herman van Hovell <hvanhovell@questtec.nl>
      
      Closes #11557 from hvanhovell/ngParser.
  7. 14 Mar, 2016 1 commit
  8. 08 Mar, 2016 1 commit
    • [SPARK-13715][MLLIB] Remove last usages of jblas in tests · 54040f8d
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      Remove last usage of jblas, in tests
      
      ## How was this patch tested?
      
      Jenkins tests -- the same ones that are being modified.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #11560 from srowen/SPARK-13715.
  9. 09 Feb, 2016 1 commit
  10. 29 Jan, 2016 1 commit
  11. 12 Jan, 2016 1 commit
  12. 05 Dec, 2015 1 commit
  13. 04 Nov, 2015 1 commit
  14. 20 Oct, 2015 1 commit
  15. 28 Sep, 2015 1 commit
    • [SPARK-10833] [BUILD] Inline, organize BSD/MIT licenses in LICENSE · bf4199e2
      Sean Owen authored
      In the course of https://issues.apache.org/jira/browse/LEGAL-226 it came to light that the guidance at http://www.apache.org/dev/licensing-howto.html#permissive-deps on permissively-licensed dependencies has a different interpretation than we (er, I) had been operating under. "pointer ... to the license within the source tree" specifically means a copy of the license within Spark's distribution, whereas at the moment, Spark's LICENSE has a pointer to the project's license in the other project's source tree.
      
      The remedy is simply to inline all such license references (i.e. BSD/MIT licenses) or include their text in a "licenses" subdirectory and point to that.
      
      Along the way, we can also treat other BSD/MIT licenses, whose text has been inlined into LICENSE, in the same way.
      
      The LICENSE file can continue to provide a helpful list of BSD/MIT licensed projects and a pointer to their sites. This would be over and above including license ...
  16. 29 Jun, 2015 1 commit
    • [SPARK-8709] Exclude hadoop-client's mockito-all dependency · 27ef8545
      Josh Rosen authored
      This patch excludes `hadoop-client`'s dependency on `mockito-all`.  As of #7061, Spark depends on `mockito-core` instead of `mockito-all`, so the dependency from Hadoop was leading to test compilation failures for some of the Hadoop 2 SBT builds.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #7090 from JoshRosen/SPARK-8709 and squashes the following commits:
      
      e190122 [Josh Rosen] [SPARK-8709] Exclude hadoop-client's mockito-all dependency.
  17. 28 Jun, 2015 1 commit
    • [SPARK-8683] [BUILD] Depend on mockito-core instead of mockito-all · f5100451
      Josh Rosen authored
      Spark's tests currently depend on `mockito-all`, which bundles Hamcrest and Objenesis classes. Instead, it should depend on `mockito-core`, which declares those libraries as Maven dependencies. This is necessary in order to fix a dependency conflict that leads to a NoSuchMethodError when using certain Hamcrest matchers.
      
      See https://github.com/mockito/mockito/wiki/Declaring-mockito-dependency for more details.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #7061 from JoshRosen/mockito-core-instead-of-all and squashes the following commits:
      
      70eccbe [Josh Rosen] Depend on mockito-core instead of mockito-all.
  18. 18 Jun, 2015 1 commit
  19. 31 May, 2015 1 commit
    • [MINOR] Add license for dagre-d3 and graphlib-dot · d1d2def2
      zsxwing authored
      Add license for dagre-d3 and graphlib-dot
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #6539 from zsxwing/LICENSE and squashes the following commits:
      
      82b0475 [zsxwing] Add license for dagre-d3 and graphlib-dot
  20. 16 May, 2015 1 commit
    • [BUILD] update jblas dependency version to 1.2.4 · 1b4e710e
      Matthew Brandyberry authored
      jblas 1.2.4 includes native library support for PPC64LE.
      
      Author: Matthew Brandyberry <mbrandy@us.ibm.com>
      
      Closes #6199 from mtbrandy/jblas-1.2.4 and squashes the following commits:
      
      9df9301 [Matthew Brandyberry] [BUILD] update jblas dependency version to 1.2.4
  21. 09 May, 2015 1 commit
    • [SPARK-7403] [WEBUI] Link URL in objects on Timeline View is wrong in case of running on YARN · 12b95abc
      Kousuke Saruta authored
      When we use Spark on YARN and access AllJobPage via the ResourceManager's proxy, the link URL in the objects that represent each job on the timeline view is wrong.
      
      In timeline-view.js, the link is generated as follows.
      ```
      window.location.href = "job/?id=" + getJobId(this);
      ```
      
      This assumes the URL displayed in the web browser ends with "jobs/", but when we access AllJobPage via the proxy, the displayed URL does not end with "jobs/".
      
      The proxy doesn't return status code 301 or 302, so the displayed URL still points to the base URL, not "/jobs", even though AllJobPage is displayed.
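
The effect of the displayed URL on a relative link can be reproduced outside the browser. The sketch below uses Python's `urllib.parse.urljoin`, which follows the same relative-reference resolution rules as a browser. The host names, ports, and application ID are made up for illustration.

```python
from urllib.parse import urljoin

# The relative link that timeline-view.js builds for job 3.
relative_link = "job/?id=3"

# Direct access: the browser is on the jobs page, so the link resolves under jobs/.
direct_page = "http://driver:4040/jobs/"
# Via the YARN proxy: the displayed URL is still the application's base URL.
proxied_page = "http://rm:8088/proxy/application_0001/"

print(urljoin(direct_page, relative_link))
print(urljoin(proxied_page, relative_link))
```

The first result contains the `jobs/` segment and the second does not, which is the wrong link described above.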
      
      ![2015-05-07 3 34 37](https://cloud.githubusercontent.com/assets/4736016/7501079/a8507ad6-f46c-11e4-9bed-62abea170f4c.png)
      
      Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
      
      Closes #5947 from sarutak/fix-link-in-timeline and squashes the following commits:
      
      aaf40e1 [Kousuke Saruta] Added Copyright for vis.js
      01bee7b [Kousuke Saruta] Fixed timeline-view.js in order to get correct href
  22. 05 May, 2015 1 commit
    • [SPARK-6939] [STREAMING] [WEBUI] Add timeline and histogram graphs for streaming statistics · 489700c8
      zsxwing authored
      This is the initial work of SPARK-6939. Not yet ready for code review. Here are the screenshots:
      
      ![graph1](https://cloud.githubusercontent.com/assets/1000778/7165766/465942e0-e3dc-11e4-9b05-c184b09d75dc.png)
      
      ![graph2](https://cloud.githubusercontent.com/assets/1000778/7165779/53f13f34-e3dc-11e4-8714-a4a75b7e09ff.png)
      
      TODOs:
      - [x] Display more information on mouse hover
      - [x] Align the timeline and distribution graphs
      - [x] Clean up the codes
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #5533 from zsxwing/SPARK-6939 and squashes the following commits:
      
      9f7cd19 [zsxwing] Merge branch 'master' into SPARK-6939
      deacc3f [zsxwing] Remove unused import
      cd03424 [zsxwing] Fix .rat-excludes
      70cc87d [zsxwing] Streaming Scheduling Delay => Scheduling Delay
      d457277 [zsxwing] Fix UIUtils in BatchPage
      b3f303e [zsxwing] Add comments for unclear classes and methods
      ff0bff8 [zsxwing] Make InputDStream.name private[streaming]
      cc392c5 [zsxwing] Merge branch 'master' into SPARK-6939
      e275e23 [zsxwing] Move time related methods to Streaming's UIUtils
      d5d86f6 [zsxwing] Fix incorrect lastErrorTime
      3be4b7a [zsxwing] Use InputInfo
      b50fa32 [zsxwing] Jump to the batch page when clicking a point in the timeline graphs
      203605d [zsxwing] Merge branch 'master' into SPARK-6939
      74307cf [zsxwing] Reuse the data for histogram graphs to reduce the page size
      2586916 [zsxwing] Merge branch 'master' into SPARK-6939
      70d8533 [zsxwing] Remove BatchInfo.numRecords and a few renames
      7bbdc0a [zsxwing] Hide the receiver sub table if no receiver
      a2972e9 [zsxwing] Add some ui tests for StreamingPage
      fd03ad0 [zsxwing] Add a test to verify no memory leak
      4a8f886 [zsxwing] Merge branch 'master' into SPARK-6939
      18607a1 [zsxwing] Merge branch 'master' into SPARK-6939
      d0b0aec [zsxwing] Clean up the codes
      a459f49 [zsxwing] Add a dash line to processing time graphs
      8e4363c [zsxwing] Prepare for the demo
      c81a1ee [zsxwing] Change time unit in the graphs automatically
      4c0b43f [zsxwing] Update Streaming UI
      04c7500 [zsxwing] Make the server and client use the same timezone
      fed8219 [zsxwing] Move the x axis at the top and show a better tooltip
      c23ce10 [zsxwing] Make two graphs close
      d78672a [zsxwing] Make the X axis use the same range
      881c907 [zsxwing] Use histogram for distribution
      5688702 [zsxwing] Fix the unit test
      ddf741a [zsxwing] Fix the unit test
      ad93295 [zsxwing] Remove unnecessary codes
      a0458f9 [zsxwing] Clean the codes
      b82ed1e [zsxwing] Update the graphs as per comments
      dd653a1 [zsxwing] Add timeline and histogram graphs for streaming statistics
  23. 30 Apr, 2015 1 commit
    • [SPARK-1406] Mllib pmml model export · 254e0509
      Vincenzo Selvaggio authored
      See PDF attached to the JIRA issue 1406.
      
      The contribution is my original work and I license the work to the project under the project's open source license.
      
      Author: Vincenzo Selvaggio <vselvaggio@hotmail.it>
      Author: Xiangrui Meng <meng@databricks.com>
      Author: selvinsource <vselvaggio@hotmail.it>
      
      Closes #3062 from selvinsource/mllib_pmml_model_export_SPARK-1406 and squashes the following commits:
      
      852aac6 [Vincenzo Selvaggio] [SPARK-1406] Update JPMML version to 1.1.15 in LICENSE file
      085cf42 [Vincenzo Selvaggio] [SPARK-1406] Added Double Min and Max Fixed scala style
      30165c4 [Vincenzo Selvaggio] [SPARK-1406] Fixed extreme cases for logit
      7a5e0ec [Vincenzo Selvaggio] [SPARK-1406] Binary classification for SVM and Logistic Regression
      cfcb596 [Vincenzo Selvaggio] [SPARK-1406] Throw IllegalArgumentException when exporting a multinomial logistic regression
      25dce33 [Vincenzo Selvaggio] [SPARK-1406] Update code to latest pmml model
      dea98ca [Vincenzo Selvaggio] [SPARK-1406] Exclude transitive dependency for pmml model
      66b7c12 [Vincenzo Selvaggio] [SPARK-1406] Updated pmml model lib to 1.1.15, latest Java 6 compatible
      a0a55f7 [Vincenzo Selvaggio] Merge pull request #2 from mengxr/SPARK-1406
      3c22f79 [Xiangrui Meng] more code style
      e2313df [Vincenzo Selvaggio] Merge pull request #1 from mengxr/SPARK-1406
      472d757 [Xiangrui Meng] fix code style
      1676e15 [Vincenzo Selvaggio] fixed scala issue
      e2ffae8 [Vincenzo Selvaggio] fixed scala style
      b8823b0 [Vincenzo Selvaggio] Merge remote-tracking branch 'upstream/master' into mllib_pmml_model_export_SPARK-1406
      b25bbf7 [Vincenzo Selvaggio] [SPARK-1406] Added export of pmml to distributed file system using the spark context
      7a949d0 [Vincenzo Selvaggio] [SPARK-1406] Fixed scala style
      f46c75c [Vincenzo Selvaggio] [SPARK-1406] Added PMMLExportable to supported models
      7b33b4e [Vincenzo Selvaggio] [SPARK-1406] Added a PMMLExportable interface Restructured code in a new package mllib.pmml Supported models implements the new PMMLExportable interface: LogisticRegression, SVM, KMeansModel, LinearRegression, RidgeRegression, Lasso
      d559ec5 [Vincenzo Selvaggio] Merge remote-tracking branch 'upstream/master' into mllib_pmml_model_export_SPARK-1406
      8fe12bb [Vincenzo Selvaggio] [SPARK-1406] Adjusted logistic regression export description and target categories
      03bc3a5 [Vincenzo Selvaggio] added logistic regression
      da2ec11 [Vincenzo Selvaggio] [SPARK-1406] added linear SVM PMML export
      82f2131 [Vincenzo Selvaggio] Merge remote-tracking branch 'upstream/master' into mllib_pmml_model_export_SPARK-1406
      19adf29 [Vincenzo Selvaggio] [SPARK-1406] Fixed scala style
      1faf985 [Vincenzo Selvaggio] [SPARK-1406] Added target field to the regression model for completeness Adjusted unit test to deal with this change
      3ae8ae5 [Vincenzo Selvaggio] [SPARK-1406] Adjusted imported order according to the guidelines
      c67ce81 [Vincenzo Selvaggio] Merge remote-tracking branch 'upstream/master' into mllib_pmml_model_export_SPARK-1406
      78515ec [Vincenzo Selvaggio] [SPARK-1406] added pmml export for LinearRegressionModel, RidgeRegressionModel and LassoModel
      e29dfb9 [Vincenzo Selvaggio] removed version, by default is set to 4.2 (latest from jpmml) removed copyright
      ae8b993 [Vincenzo Selvaggio] updated some commented tests to use the new ModelExporter object reordered the imports
      df8a89e [Vincenzo Selvaggio] added pmml version to pmml model changed the copyright to spark
      a1b4dc3 [Vincenzo Selvaggio] updated imports
      834ca44 [Vincenzo Selvaggio] reordered the import accordingly to the guidelines
      349a76b [Vincenzo Selvaggio] new helper object to serialize the models to pmml format
      c3ef9b8 [Vincenzo Selvaggio] set it to private
      6357b98 [Vincenzo Selvaggio] set it to private
      e1eb251 [Vincenzo Selvaggio] removed serialization part, this will be part of the ModelExporter helper object
      aba5ee1 [Vincenzo Selvaggio] fixed cluster export
      cd6c07c [Vincenzo Selvaggio] fixed scala style to run tests
      f75b988 [Vincenzo Selvaggio] Merge remote-tracking branch 'origin/master' into mllib_pmml_model_export_SPARK-1406
      07a29bf [selvinsource] Update LICENSE
      8841439 [Vincenzo Selvaggio] adjust scala style in order to compile
      1433b11 [Vincenzo Selvaggio] complete suite tests
      8e71b8d [Vincenzo Selvaggio] kmeans pmml export implementation
      9bc494f [Vincenzo Selvaggio] added scala suite tests added saveLocalFile to ModelExport trait
      226e184 [Vincenzo Selvaggio] added javadoc and export model type in case there is a need to support other types of export (not just PMML)
      a0e3679 [Vincenzo Selvaggio] export and pmml export traits kmeans test implementation
  24. 28 Feb, 2015 1 commit
  25. 08 Dec, 2014 1 commit
    • SPARK-3926 [CORE] Reopened: result of JavaRDD collectAsMap() is not serializable · e829bfa1
      Sean Owen authored
      My original 'fix' didn't fix at all. Now, there's a unit test to check whether it works. Of the two options to really fix it -- copy the `Map` to a `java.util.HashMap`, or copy and modify Scala's implementation in `Wrappers.MapWrapper`, I went with the latter.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #3587 from srowen/SPARK-3926 and squashes the following commits:
      
      8586bb9 [Sean Owen] Remove unneeded no-arg constructor, and add additional note about copied code in LICENSE
      7bb0e66 [Sean Owen] Make SerializableMapWrapper actually serialize, and add unit test
  26. 05 Nov, 2014 1 commit
    • [SPARK-4242] [Core] Add SASL to external shuffle service · 4c42986c
      Aaron Davidson authored
      Does three things: (1) Adds SASL to ExternalShuffleClient, (2) puts SecurityManager in BlockManager's constructor, and (3) adds unit test.
      
      Author: Aaron Davidson <aaron@databricks.com>
      
      Closes #3108 from aarondav/sasl-client and squashes the following commits:
      
      48b622d [Aaron Davidson] Screw it, let's just get LimitedInputStream
      3543b70 [Aaron Davidson] Back out of pom change due to unknown test issue?
      b58518a [Aaron Davidson] ByteStreams.limit() not available :(
      cbe451a [Aaron Davidson] Address comments
      2bf2908 [Aaron Davidson] [SPARK-4242] [Core] Add SASL to external shuffle service
  27. 27 Oct, 2014 1 commit
    • SPARK-4022 [CORE] [MLLIB] Replace colt dependency (LGPL) with commons-math · bfa614b1
      Sean Owen authored
      This change replaces usages of colt with commons-math3 equivalents, and makes some minor necessary adjustments to related code and tests to match.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #2928 from srowen/SPARK-4022 and squashes the following commits:
      
      61a232f [Sean Owen] Fix failure due to different sampling in JavaAPISuite.sample()
      16d66b8 [Sean Owen] Simplify seeding with call to reseedRandomGenerator
      a1a78e0 [Sean Owen] Use Well19937c
      31c7641 [Sean Owen] Fix Python Poisson test by choosing a different seed; about 88% of seeds should work but 1 didn't, it seems
      5c9c67f [Sean Owen] Additional test fixes from review
      d8f88e0 [Sean Owen] Replace colt with commons-math3. Some tests do not pass yet.
  28. 26 Aug, 2014 1 commit
    • [SPARK-3073] [PySpark] use external sort in sortBy() and sortByKey() · f1e71d4c
      Davies Liu authored
      Use external sort to support sorting large datasets in the reduce stage.
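
As a rough illustration of the general technique (not the code added by this patch), an external sort buffers a bounded number of items, spills each sorted chunk to disk, and then merges the sorted runs lazily. The function name `external_sort` and the `max_in_memory` parameter below are assumptions made for this sketch.

```python
import heapq
import pickle
import tempfile


def external_sort(items, key=lambda x: x, max_in_memory=1000):
    """Return an iterator over items in sorted order.

    At most max_in_memory items are buffered at once; full buffers are
    sorted and spilled to temporary files, then merged at the end.
    """
    runs = []   # temporary files, each holding one sorted run
    chunk = []  # in-memory buffer

    def spill(sorted_chunk):
        f = tempfile.TemporaryFile()
        for item in sorted_chunk:
            pickle.dump(item, f)
        f.seek(0)
        runs.append(f)

    for item in items:
        chunk.append(item)
        if len(chunk) >= max_in_memory:
            chunk.sort(key=key)
            spill(chunk)
            chunk = []
    chunk.sort(key=key)  # the last partial chunk stays in memory

    def read_run(f):
        # Stream pickled items back from one spilled run.
        try:
            while True:
                yield pickle.load(f)
        except EOFError:
            f.close()

    # heapq.merge consumes the sorted runs lazily, one item at a time.
    return heapq.merge(chunk, *(read_run(f) for f in runs), key=key)


print(list(external_sort([5, 3, 9, 1, 7, 2], max_in_memory=2)))
```

The tiny `max_in_memory=2` in the usage line forces several spills so the merge path is exercised.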
      
      Author: Davies Liu <davies.liu@gmail.com>
      
      Closes #1978 from davies/sort and squashes the following commits:
      
      bbcd9ba [Davies Liu] check spilled bytes in tests
      b125d2f [Davies Liu] add test for external sort in rdd
      eae0176 [Davies Liu] choose different disks from different processes and instances
      1f075ed [Davies Liu] Merge branch 'master' into sort
      eb53ca6 [Davies Liu] Merge branch 'master' into sort
      644abaf [Davies Liu] add license in LICENSE
      19f7873 [Davies Liu] improve tests
      55602ee [Davies Liu] use external sort in sortBy() and sortByKey()
  29. 02 Aug, 2014 1 commit
  30. 29 Jul, 2014 1 commit
  31. 22 Jul, 2014 1 commit
    • SPARK-2047: Introduce an in-mem Sorter, and use it to reduce mem usage · 85d3596e
      Aaron Davidson authored
      ### Why and what?
      Currently, the AppendOnlyMap performs an "in-place" sort by converting its array of [key, value, key, value] pairs into an array of [(key, value), (key, value)] pairs. However, this causes us to allocate many Tuple2 objects, which come at a nontrivial overhead.
      
      This patch adds a Sorter API, intended for in memory sorts, which simply ports the Android Timsort implementation (available under Apache v2) and abstracts the interface in a way which introduces no more than 1 virtual function invocation of overhead at each abstraction point.
      
      Please compare our port of the Android Timsort sort with the original implementation: http://www.diffchecker.com/wiwrykcl
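
The idea of sorting the flat [key, value, key, value] layout without building pair objects can be sketched as follows. This sketch deliberately swaps in a simple insertion sort for Timsort to stay short, and the function name `sort_kv_array` is hypothetical. It only illustrates moving key/value slots together by index; it does not reproduce the Sorter API.

```python
def sort_kv_array(data):
    """Sort a flat [k0, v0, k1, v1, ...] list by key, in place.

    Insertion sort stands in for Timsort here. Each pair is moved as two
    adjacent slots, so no per-pair objects are created.
    """
    n = len(data) // 2  # number of key/value pairs
    for i in range(1, n):
        key, value = data[2 * i], data[2 * i + 1]
        j = i - 1
        # Shift pairs with larger keys one pair-slot to the right.
        while j >= 0 and data[2 * j] > key:
            data[2 * j + 2] = data[2 * j]
            data[2 * j + 3] = data[2 * j + 1]
            j -= 1
        data[2 * j + 2] = key
        data[2 * j + 3] = value
    return data


print(sort_kv_array([3, "c", 1, "a", 2, "b"]))
```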
      
      ### Memory implications
      An AppendOnlyMap contains N kv pairs, which results in roughly 2N elements within its underlying array. Each of these elements is 4 bytes wide in a [compressed OOPS](https://wikis.oracle.com/display/HotSpotInternals/CompressedOops) system, which is the default.
      
      Today's approach immediately allocates N Tuple2 objects, which take up 24N bytes in total (exposed via YourKit), and undergoes a Java sort. The Java 6 version immediately copies the entire array (4N bytes here), while the Java 7 version has a worst-case allocation of half the array (2N bytes).
      This results in a worst-case sorting overhead of 24N + 2N = 26N bytes (for Java 7).
      
      The Sorter does not require allocating any tuples, but since it uses Timsort, it may copy up to half the entire array in the worst case.
      This results in a worst-case sorting overhead of 4N bytes.
      
      Thus, we have reduced the worst-case overhead of the sort by roughly 22 bytes times the number of elements.
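
The arithmetic above can be checked with a few lines. The constants are the figures quoted in this message (4-byte references, 24-byte Tuple2 objects); the function names are made up for illustration.

```python
REF_BYTES = 4      # one array slot with compressed OOPs
TUPLE2_BYTES = 24  # one Tuple2 object, as quoted in this message


def tuple_sort_overhead(n):
    # N Tuple2 allocations, plus the Java 7 sort's worst-case copy of
    # half of the N-element tuple array.
    return TUPLE2_BYTES * n + (n * REF_BYTES) // 2


def sorter_overhead(n):
    # Timsort may copy up to half of the 2N-element key/value array.
    return (2 * n * REF_BYTES) // 2


n = 1_000_000
print(tuple_sort_overhead(n), sorter_overhead(n))
print(tuple_sort_overhead(n) - sorter_overhead(n))
```

For one million pairs this gives 26,000,000 bytes against 4,000,000 bytes, a difference of 22 bytes per element.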
      
      ### Performance implications
      As the destructiveSortedIterator is used for spilling in an ExternalAppendOnlyMap, the purpose of this patch is to provide stability by reducing memory usage rather than improve performance. However, because it implements Timsort, it also brings a substantial performance boost over our prior implementation.
      
      Here are the results of a microbenchmark that sorted 25 million, randomly distributed (Float, Int) pairs. The Java Arrays.sort() tests were run **only on the keys**, and thus moved less data. Our current implementation is called "Tuple-sort using Arrays.sort()" while the new implementation is "KV-array using Sorter".
      
      <table>
      <tr><th>Test</th><th>First run (JDK6)</th><th>Average of 10 (JDK6)</th><th>First run (JDK7)</th><th>Average of 10 (JDK7)</th></tr>
      <tr><td>primitive Arrays.sort()</td><td>3216 ms</td><td>1190 ms</td><td>2724 ms</td><td>131 ms (!!)</td></tr>
      <tr><td>Arrays.sort()</td><td>18564 ms</td><td>2006 ms</td><td>13201 ms</td><td>878 ms</td></tr>
      <tr><td>Tuple-sort using Arrays.sort()</td><td>31813 ms</td><td>3550 ms</td><td>20990 ms</td><td>1919 ms</td></tr>
      <tr><td><b>KV-array using Sorter</b></td><td></td><td></td><td><b>15020 ms</b></td><td><b>834 ms</b></td></tr>
      </table>
      
      The results show that this Sorter performs exactly as expected (after the first run) -- it is as fast as the Java 7 Arrays.sort() (which shares the same algorithm), but is significantly faster than the Tuple-sort on Java 6 or 7.
      
      In short, this patch should significantly improve performance for users running either Java 6 or 7.
      
      Author: Aaron Davidson <aaron@databricks.com>
      
      Closes #1502 from aarondav/sort and squashes the following commits:
      
      652d936 [Aaron Davidson] Update license, move Sorter to java src
      a7b5b1c [Aaron Davidson] fix licenses
      5c0efaf [Aaron Davidson] Update tmpLength
      ec395c8 [Aaron Davidson] Ignore benchmark (again) and fix docs
      034bf10 [Aaron Davidson] Change to Apache v2 Timsort
      b97296c [Aaron Davidson] Don't try to run benchmark on Jenkins + private[spark]
      6307338 [Aaron Davidson] SPARK-2047: Introduce an in-mem Sorter, and use it to reduce mem usage
  32. 14 May, 2014 1 commit
    • SPARK-1827. LICENSE and NOTICE files need a refresh to contain transitive dependency info · 2e5a7cde
      Sean Owen authored
      LICENSE and NOTICE policy is explained here:
      
      http://www.apache.org/dev/licensing-howto.html
      http://www.apache.org/legal/3party.html
      
      This leads to the following changes.
      
      First, this change enables two extensions to maven-shade-plugin in assembly/ that will try to include and merge all NOTICE and LICENSE files. This can't hurt.
      
      This generates a consolidated NOTICE file that I manually added to NOTICE.
      
      Next, a list of all dependencies and their licenses was generated:
      `mvn ... license:aggregate-add-third-party`
      to create: `target/generated-sources/license/THIRD-PARTY.txt`
      
      Each dependency is listed with one or more licenses. Determine the most-compatible license for each if there is more than one.
      
      For "unknown" license dependencies, I manually evaluated their license. Many are actually Apache projects or components of projects covered already. The only non-trivial one was Colt, which has its own (compatible) license.
      
      I ignored Apache-licensed and public domain dependencies as these require no further action (beyond NOTICE above).
      
      BSD and MIT licenses (permissive Category A licenses) are evidently supposed to be mentioned in LICENSE, so I added a section with output from the THIRD-PARTY.txt file appropriately.
      
      Everything else, Category B licenses, are evidently mentioned in NOTICE (?) Same there.
      
      LICENSE contained some license statements for source code that is redistributed. I left this as I think that is the right place to put it.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #770 from srowen/SPARK-1827 and squashes the following commits:
      
      a764504 [Sean Owen] Add LICENSE and NOTICE info for all transitive dependencies as of 1.0
  33. 02 Mar, 2014 1 commit
    • Merge the old sbt-launch-lib.bash with the new sbt-launcher jar downloading logic. · 012bd5fb
      Michael Armbrust authored
      This allows developers to pass options (such as -D) to sbt.  I also modified the SparkBuild to ensure spark specific properties are propagated to forked test JVMs.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #14 from marmbrus/sbtScripts and squashes the following commits:
      
      c008b18 [Michael Armbrust] Merge the old sbt-launch-lib.bash with the new sbt-launcher jar downloading logic.
  34. 02 Sep, 2013 1 commit
  35. 16 Jul, 2013 1 commit
  36. 07 Dec, 2010 1 commit