  1. Sep 15, 2014
      [SPARK-3433][BUILD] Fix for Mima false-positives with @DeveloperAPI and @Experimental annotations. · ecf0c029
      Prashant Sharma authored
      The false positive reported was actually due to the MiMa generator not picking up the new jars in the presence of old jars (theoretically this should not have happened). So as a workaround, we run them both separately and just append the results together.
      
      Author: Prashant Sharma <prashant@apache.org>
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #2285 from ScrapCodes/mima-fix and squashes the following commits:
      
      093c76f [Prashant Sharma] Update mima
      59012a8 [Prashant Sharma] Update mima
      35b6c71 [Prashant Sharma] SPARK-3433 Fix for Mima false-positives with @DeveloperAPI and @Experimental annotations.
  2. Sep 07, 2014
      [HOTFIX] Fix broken Mima tests on the master branch · 4ba26735
      Josh Rosen authored
      By merging #2268, which bumped the Spark version to 1.2.0-SNAPSHOT, I inadvertently broke the Mima binary compatibility tests.  The issue is that we were comparing 1.2.0-SNAPSHOT against Spark 1.0.0 without using any Mima excludes.  The right long-term fix for this is probably to publish nightly snapshots on Maven central and change the master branch to test binary compatibility against the current release candidate branch's snapshots until that release is finalized.
      
      As a short-term fix until 1.1.0 is published on Maven central, I've configured the build to test the master branch for binary compatibility against the 1.1.0-RC4 jars.  I'll loop back and remove the Apache staging repo as soon as 1.1.0 final is available.
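      For reference, pinning the comparison version is essentially a one-line sbt setting via the MiMa plugin. A minimal sketch of the idea — the setting name follows sbt-mima-plugin's public keys, and the coordinates are illustrative rather than copied from Spark's build:

      ```scala
      // Sketch: point MiMa at a fixed previous release instead of the snapshot's
      // base version. Setting name follows sbt-mima-plugin; the artifact
      // coordinates here are illustrative, not Spark's actual build code.
      import com.typesafe.tools.mima.plugin.MimaKeys.previousArtifact

      previousArtifact := Some("org.apache.spark" % "spark-core_2.10" % "1.1.0-rc4")
      ```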
      
      Author: Josh Rosen <joshrosen@apache.org>
      
      Closes #2315 from JoshRosen/mima-fix and squashes the following commits:
      
      776bc2c [Josh Rosen] Add two excludes to workaround Mima annotation issues.
      ec90e21 [Josh Rosen] Add deploy and graphx to 1.2 MiMa excludes.
      57569be [Josh Rosen] Fix MiMa tests in master branch; test against 1.1.0 RC.
  3. Jul 10, 2014
      [SPARK-1776] Have Spark's SBT build read dependencies from Maven. · 628932b8
      Prashant Sharma authored
      This patch introduces the new way of working while retaining the existing ways of doing things.
      
      For example, the Maven build instruction for YARN is
      `mvn -Pyarn -Phadoop-2.2 clean package -DskipTests`
      In sbt it becomes
      `MAVEN_PROFILES="yarn, hadoop-2.2" sbt/sbt clean assembly`
      Also supports
      `sbt/sbt -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 clean assembly`
      
      Author: Prashant Sharma <prashant.s@imaginea.com>
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #772 from ScrapCodes/sbt-maven and squashes the following commits:
      
      a8ac951 [Prashant Sharma] Updated sbt version.
      62b09bb [Prashant Sharma] Improvements.
      fa6221d [Prashant Sharma] Excluding sql from mima
      4b8875e [Prashant Sharma] Sbt assembly no longer builds tools by default.
      72651ca [Prashant Sharma] Addresses code reivew comments.
      acab73d [Prashant Sharma] Revert "Small fix to run-examples script."
      ac4312c [Prashant Sharma] Revert "minor fix"
      6af91ac [Prashant Sharma] Ported oldDeps back. + fixes issues with prev commit.
      65cf06c [Prashant Sharma] Servelet API jars mess up with the other servlet jars on the class path.
      446768e [Prashant Sharma] minor fix
      89b9777 [Prashant Sharma] Merge conflicts
      d0a02f2 [Prashant Sharma] Bumped up pom versions, Since the build now depends on pom it is better updated there. + general cleanups.
      dccc8ac [Prashant Sharma] updated mima to check against 1.0
      a49c61b [Prashant Sharma] Fix for tools jar
      a2f5ae1 [Prashant Sharma] Fixes a bug in dependencies.
      cf88758 [Prashant Sharma] cleanup
      9439ea3 [Prashant Sharma] Small fix to run-examples script.
      96cea1f [Prashant Sharma] SPARK-1776 Have Spark's SBT build read dependencies from Maven.
      36efa62 [Patrick Wendell] Set project name in pom files and added eclipse/intellij plugins.
      4973dbd [Patrick Wendell] Example build using pom reader.
  4. Jun 11, 2014
      [SPARK-2069] MIMA false positives · 5b754b45
      Prashant Sharma authored
      Fixes SPARK-2070 and SPARK-2071
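      The fix revolves around MiMa's problem filters. As a rough sketch of what such an exclude looks like — the filter type and the fully qualified member name below are illustrative, not the actual generated entries:

      ```scala
      // Sketch: a MiMa problem filter that ignores a member of an annotated or
      // package-private class. Filter type and target name are illustrative.
      import com.typesafe.tools.mima.core._

      val ignoredMembers = Seq(
        ProblemFilters.exclude[MissingMethodProblem](
          "org.apache.spark.SomeDeveloperApiClass.someMethod")
      )
      ```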
      
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #1021 from ScrapCodes/SPARK-2070/package-private-methods and squashes the following commits:
      
      7979a57 [Prashant Sharma] addressed code review comments
      558546d [Prashant Sharma] A little fancy error message.
      59275ab [Prashant Sharma] SPARK-2071 Mima ignores classes and its members from previous versions too.
      0c4ff2b [Prashant Sharma] SPARK-2070 Ignore methods along with annotated classes.
  5. Jun 01, 2014
      Better explanation for how to use MIMA excludes. · d17d2214
      Patrick Wendell authored
      This patch does a few things:
      1. We have a file MimaExcludes.scala exclusively for excludes.
      2. The test runner tells users about that file if a test fails.
      3. I've added back the excludes used from 0.9->1.0. We should keep
         these in the project as an official audit trail of times where
         we decided to make exceptions.
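      A minimal sketch of the MimaExcludes.scala layout this describes — the version keys and exclude entries are illustrative stand-ins:

      ```scala
      // Sketch: excludes collected in one file, keyed by the version being
      // checked, so the file doubles as an audit trail of granted exceptions.
      // The version strings and class names here are illustrative.
      import com.typesafe.tools.mima.core._

      object MimaExcludes {
        def excludes(version: String): Seq[ProblemFilter] = version match {
          case v if v.startsWith("1.0") =>
            Seq(ProblemFilters.exclude[MissingClassProblem](
              "org.apache.spark.SomeMovedClass"))
          case _ => Seq.empty
        }
      }
      ```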
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #937 from pwendell/mima and squashes the following commits:
      
      7ee0db2 [Patrick Wendell] Better explanation for how to use MIMA excludes.
  6. May 30, 2014
      [SPARK-1971] Update MIMA to compare against Spark 1.0.0 · 79fa8fd4
      Prashant Sharma authored
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #910 from ScrapCodes/enable-mima/spark-core and squashes the following commits:
      
      79f3687 [Prashant Sharma] updated Mima to check against version 1.0
      1e8969c [Prashant Sharma] Spark core missed out on Mima settings. So in effect we never tested spark core for mima related errors.
  7. May 12, 2014
      SPARK-1786: Reopening PR 724 · 0e2bde20
      Ankur Dave authored
      Addresses an issue in MimaBuild.scala.
      
      Author: Ankur Dave <ankurdave@gmail.com>
      Author: Joseph E. Gonzalez <joseph.e.gonzalez@gmail.com>
      
      Closes #742 from jegonzal/edge_partition_serialization and squashes the following commits:
      
      8ba6e0d [Ankur Dave] Add concatenation operators to MimaBuild.scala
      cb2ed3a [Joseph E. Gonzalez] addressing missing exclusion in MimaBuild.scala
      5d27824 [Ankur Dave] Disable reference tracking to fix serialization test
      c0a9ae5 [Ankur Dave] Add failing test for EdgePartition Kryo serialization
      a4a3faa [Joseph E. Gonzalez] Making EdgePartition serializable.
  8. May 10, 2014
      Unify GraphImpl RDDs + other graph load optimizations · 905173df
      Ankur Dave authored
      This PR makes the following changes, primarily in e4fbd329aef85fe2c38b0167255d2a712893d683:
      
      1. *Unify RDDs to avoid zipPartitions.* A graph used to be four RDDs: vertices, edges, routing table, and triplet view. This commit merges them down to two: vertices (with routing table), and edges (with replicated vertices).
      
      2. *Avoid duplicate shuffle in graph building.* We used to do two shuffles when building a graph: one to extract routing information from the edges and move it to the vertices, and another to find nonexistent vertices referred to by edges. With this commit, the latter is done as a side effect of the former.
      
      3. *Avoid no-op shuffle when joins are fully eliminated.* This is a side effect of unifying the edges and the triplet view.
      
      4. *Join elimination for mapTriplets.*
      
      5. *Ship only the needed vertex attributes when upgrading the triplet view.* If the triplet view already contains source attributes, and we now need both attributes, only ship destination attributes rather than re-shipping both. This is done in `ReplicatedVertexView#upgrade`.
      
      Author: Ankur Dave <ankurdave@gmail.com>
      
      Closes #497 from ankurdave/unify-rdds and squashes the following commits:
      
      332ab43 [Ankur Dave] Merge remote-tracking branch 'apache-spark/master' into unify-rdds
      4933e2e [Ankur Dave] Exclude RoutingTable from binary compatibility check
      5ba8789 [Ankur Dave] Add GraphX upgrade guide from Spark 0.9.1
      13ac845 [Ankur Dave] Merge remote-tracking branch 'apache-spark/master' into unify-rdds
      a04765c [Ankur Dave] Remove unnecessary toOps call
      57202e8 [Ankur Dave] Replace case with pair parameter
      75af062 [Ankur Dave] Add explicit return types
      04d3ae5 [Ankur Dave] Convert implicit parameter to context bound
      c88b269 [Ankur Dave] Revert upgradeIterator to if-in-a-loop
      0d3584c [Ankur Dave] EdgePartition.size should be val
      2a928b2 [Ankur Dave] Set locality wait
      10b3596 [Ankur Dave] Clean up public API
      ae36110 [Ankur Dave] Fix style errors
      e4fbd32 [Ankur Dave] Unify GraphImpl RDDs + other graph load optimizations
      d6d60e2 [Ankur Dave] In GraphLoader, coalesce to minEdgePartitions
      62c7b78 [Ankur Dave] In Analytics, take PageRank numIter
      d64e8d4 [Ankur Dave] Log current Pregel iteration
  9. May 07, 2014
      [SPARK-1460] Returning SchemaRDD instead of normal RDD on Set operations... · 967635a2
      Kan Zhang authored
      ... that do not change schema
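      The pattern at work is narrowing return types in the subclass. A minimal, self-contained sketch — the class names are stand-ins for RDD/SchemaRDD, not the actual Spark SQL code:

      ```scala
      // Sketch of covariant return-type overriding: operations that cannot
      // change the schema return the richer type instead of the plain base
      // type. Names are stand-ins for RDD/SchemaRDD.
      class PlainRdd { def distinct(): PlainRdd = new PlainRdd }

      class SchemaAwareRdd extends PlainRdd {
        // Scala allows an override to narrow the declared return type.
        override def distinct(): SchemaAwareRdd = new SchemaAwareRdd
      }
      ```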
      
      Author: Kan Zhang <kzhang@apache.org>
      
      Closes #448 from kanzhang/SPARK-1460 and squashes the following commits:
      
      111e388 [Kan Zhang] silence MiMa errors in EdgeRDD and VertexRDD
      91dc787 [Kan Zhang] Taking into account newly added Ordering param
      79ed52a [Kan Zhang] [SPARK-1460] Returning SchemaRDD on Set operations that do not change schema
  10. Apr 21, 2014
      [SPARK-1332] Improve Spark Streaming's Network Receiver and InputDStream API [WIP] · 04c37b6f
      Tathagata Das authored
      The current Network Receiver API makes it slightly complicated to write a new receiver, as one needs to create an instance of BlockGenerator as shown in SocketReceiver
      https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala#L51
      
      Exposing the BlockGenerator interface has made it harder to improve the receiving process. The API of NetworkReceiver (which was not a very stable API anyway) needs to be changed if we are to ensure future stability.
      
      Additionally, the functions like streamingContext.socketStream that create input streams, return DStream objects. That makes it hard to expose functionality (say, rate limits) unique to input dstreams. They should return InputDStream or NetworkInputDStream. This is still not yet implemented.
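      For contrast, a custom receiver against the refactored API this PR introduces only implements lifecycle hooks. A hedged sketch — the Receiver/onStart/onStop shape matches the API added here, but the payload and thread handling are illustrative:

      ```scala
      // Sketch: with the refactored API, a receiver extends Receiver and
      // implements onStart/onStop; BlockGenerator is no longer part of the
      // public surface. The record and thread handling here are illustrative.
      import org.apache.spark.storage.StorageLevel
      import org.apache.spark.streaming.receiver.Receiver

      class DummyReceiver extends Receiver[String](StorageLevel.MEMORY_ONLY) {
        def onStart(): Unit = {
          // Launch a non-blocking thread that hands records to Spark via store().
          new Thread("dummy-receiver") {
            override def run(): Unit = { store("record") }
          }.start()
        }

        def onStop(): Unit = {
          // Nothing to clean up in this sketch; a real receiver would stop
          // its receiving thread here.
        }
      }
      ```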
      
      This PR is blocked on the graceful shutdown PR #247
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #300 from tdas/network-receiver-api and squashes the following commits:
      
      ea27b38 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into network-receiver-api
      3a4777c [Tathagata Das] Renamed NetworkInputDStream to ReceiverInputDStream, and ActorReceiver related stuff.
      838dd39 [Tathagata Das] Added more events to the StreamingListener to report errors and stopped receivers.
      a75c7a6 [Tathagata Das] Address some PR comments and fixed other issues.
      91bfa72 [Tathagata Das] Fixed bugs.
      8533094 [Tathagata Das] Scala style fixes.
      028bde6 [Tathagata Das] Further refactored receiver to allow restarting of a receiver.
      43f5290 [Tathagata Das] Made functions that create input streams return InputDStream and NetworkInputDStream, for both Scala and Java.
      2c94579 [Tathagata Das] Fixed graceful shutdown by removing interrupts on receiving thread.
      9e37a0b [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into network-receiver-api
      3223e95 [Tathagata Das] Refactored the code that runs the NetworkReceiver into further classes and traits to make them more testable.
      a36cc48 [Tathagata Das] Refactored the NetworkReceiver API for future stability.
  11. Apr 12, 2014
      [SPARK-1386] Web UI for Spark Streaming · 6aa08c39
      Tathagata Das authored
      When debugging Spark Streaming applications it is necessary to monitor certain metrics that are not shown in the Spark application UI. For example, what is the average processing time of batches? What is the scheduling delay? Is the system able to process as fast as it is receiving data? How many records am I receiving through my receivers?
      
      While the StreamingListener interface introduced in 0.9 provided some of this information, it could only be accessed programmatically. A UI that shows information specific to streaming applications is necessary for easier debugging. This PR introduces such a UI. It shows various statistics related to the streaming application. Here is a screenshot of the UI running on my local machine.
      
      http://i.imgur.com/1ooDGhm.png
      
      This UI is integrated into the Spark UI running at port 4040.
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #290 from tdas/streaming-web-ui and squashes the following commits:
      
      fc73ca5 [Tathagata Das] Merge pull request #9 from andrewor14/ui-refactor
      642dd88 [Andrew Or] Merge SparkUISuite.scala into UISuite.scala
      eb30517 [Andrew Or] Merge github.com:apache/spark into ui-refactor
      f4f4cbe [Tathagata Das] More minor fixes.
      34bb364 [Tathagata Das] Merge branch 'streaming-web-ui' of github.com:tdas/spark into streaming-web-ui
      252c566 [Tathagata Das] Merge pull request #8 from andrewor14/ui-refactor
      e038b4b [Tathagata Das] Addressed Patrick's comments.
      125a054 [Andrew Or] Disable serving static resources with gzip
      90feb8d [Andrew Or] Address Patrick's comments
      89dae36 [Tathagata Das] Merge branch 'streaming-web-ui' of github.com:tdas/spark into streaming-web-ui
      72fe256 [Tathagata Das] Merge pull request #6 from andrewor14/ui-refactor
      2fc09c8 [Tathagata Das] Added binary check exclusions
      aa396d4 [Andrew Or] Rename tabs and pages (No more IndexPage.scala)
      f8e1053 [Tathagata Das] Added Spark and Streaming UI unit tests.
      caa5e05 [Tathagata Das] Merge branch 'streaming-web-ui' of github.com:tdas/spark into streaming-web-ui
      585cd65 [Tathagata Das] Merge pull request #5 from andrewor14/ui-refactor
      914b8ff [Tathagata Das] Moved utils functions to UIUtils.
      548c98c [Andrew Or] Wide refactoring of WebUI, UITab, and UIPage (see commit message)
      6de06b0 [Tathagata Das] Merge remote-tracking branch 'apache/master' into streaming-web-ui
      ee6543f [Tathagata Das] Minor changes based on Andrew's comments.
      fa760fe [Tathagata Das] Fixed long line.
      1c0bcef [Tathagata Das] Refactored streaming UI into two files.
      1af239b [Tathagata Das] Changed streaming UI to attach itself as a tab with the Spark UI.
      827e81a [Tathagata Das] Merge branch 'streaming-web-ui' of github.com:tdas/spark into streaming-web-ui
      168fe86 [Tathagata Das] Merge pull request #2 from andrewor14/ui-refactor
      3e986f8 [Tathagata Das] Merge remote-tracking branch 'apache/master' into streaming-web-ui
      c78c92d [Andrew Or] Remove outdated comment
      8f7323b [Andrew Or] End of file new lines, indentation, and imports (minor)
      0d61ee8 [Andrew Or] Merge branch 'streaming-web-ui' of github.com:tdas/spark into ui-refactor
      9a48fa1 [Andrew Or] Allow adding tabs to SparkUI dynamically + add example
      61358e3 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into streaming-web-ui
      53be2c5 [Tathagata Das] Minor style updates.
      ed25dfc [Andrew Or] Generalize SparkUI header to display tabs dynamically
      a37ad4f [Andrew Or] Comments, imports and formatting (minor)
      cd000b0 [Andrew Or] Merge github.com:apache/spark into ui-refactor
      7d57444 [Andrew Or] Refactoring the UI interface to add flexibility
      aef4dd5 [Tathagata Das] Added Apache licenses.
      db27bad [Tathagata Das] Added last batch processing time to StreamingUI.
      4d86e98 [Tathagata Das] Added basic stats to the StreamingUI and refactored the UI to a Page to make it easier to transition to using SparkUI later.
      93f1c69 [Tathagata Das] Added network receiver information to the Streaming UI.
      56cc7fb [Tathagata Das] First cut implementation of Streaming UI.
  12. Apr 08, 2014
      [SPARK-1331] Added graceful shutdown to Spark Streaming · 83ac9a4b
      Tathagata Das authored
      Current version of StreamingContext.stop() directly kills all the data receivers (NetworkReceiver) without waiting for the data already received to be persisted and processed. This PR provides the fix. Now, when the StreamingContext.stop() is called, the following sequence of steps will happen.
      1. The driver will send a stop signal to all the active receivers.
      2. Each receiver, when it gets a stop signal from the driver, first stops receiving more data, then waits for the thread that persists data blocks to the BlockManager to finish persisting all received data, and finally quits.
      3. After all the receivers have stopped, the driver will wait for the Job Generator and Job Scheduler to finish processing all the received data.
      
      It also fixes the semantics of StreamingContext.start and stop. It will throw appropriate errors and warnings if stop() is called before start(), stop() is called twice, etc.
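      From the caller's side, the sequence above boils down to one call. A sketch of the call site — the parameter names follow the graceful-shutdown overload described here, but treat them as illustrative:

      ```scala
      // Sketch: stop a StreamingContext gracefully so already-received data is
      // drained before shutdown. Parameter names are illustrative of the
      // overload added in this PR.
      ssc.stop(stopSparkContext = true, stopGracefully = true)
      ```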
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #247 from tdas/graceful-shutdown and squashes the following commits:
      
      61c0016 [Tathagata Das] Updated MIMA binary check excludes.
      ae1d39b [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into graceful-shutdown
      6b59cfc [Tathagata Das] Minor changes based on Andrew's comment on PR.
      d0b8d65 [Tathagata Das] Reduced time taken by graceful shutdown unit test.
      f55bc67 [Tathagata Das] Fix scalastyle
      c69b3a7 [Tathagata Das] Updates based on Patrick's comments.
      c43b8ae [Tathagata Das] Added graceful shutdown to Spark Streaming.
  13. Mar 24, 2014
      SPARK-1094 Support MiMa for reporting binary compatibility across versions. · dc126f21
      Patrick Wendell authored
      This adds some changes on top of the initial work by @scrapcodes in #20:
      
      The goal here is to do automated checking of Spark commits to determine whether they break binary compatibility.
      
      1. Special case for inner classes of package-private objects.
      2. Made tools classes accessible when running `spark-class`.
      3. Made some declared types in MLLib more general.
      4. Various other improvements to exclude-generation script.
      5. In-code documentation.
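      As a usage note, the check this patch wires up runs through sbt. A hedged example — the task name comes from the MiMa sbt plugin; how Spark's build exposes it (directly or behind a wrapper script) is illustrative here:

      ```shell
      # Sketch: run the binary-compatibility report via the MiMa sbt task.
      # Task name is from the MiMa sbt plugin; the sbt/sbt launcher path is
      # how Spark's repo invoked sbt at the time.
      sbt/sbt mimaReportBinaryIssues
      ```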
      
      Author: Patrick Wendell <pwendell@gmail.com>
      Author: Prashant Sharma <prashant.s@imaginea.com>
      Author: Prashant Sharma <scrapcodes@gmail.com>
      
      Closes #207 from pwendell/mima and squashes the following commits:
      
      22ae267 [Patrick Wendell] New binary changes after upmerge
      6c2030d [Patrick Wendell] Merge remote-tracking branch 'apache/master' into mima
      3666cf1 [Patrick Wendell] Minor style change
      0e0f570 [Patrick Wendell] Small fix and removing directory listings
      647c547 [Patrick Wendell] Reveiw feedback.
      c39f3b5 [Patrick Wendell] Some enhancements to binary checking.
      4c771e0 [Prashant Sharma] Added a tool to generate mima excludes and also adapted build to pick automatically.
      b551519 [Prashant Sharma] adding a new exclude after rebasing with master
      651844c [Prashant Sharma] Support MiMa for reporting binary compatibility accross versions.