  1. Apr 12, 2014
      [SPARK-1386] Web UI for Spark Streaming · 6aa08c39
      Tathagata Das authored
      When debugging Spark Streaming applications, it is necessary to monitor certain metrics that are not shown in the Spark application UI. For example, what is the average processing time of batches? What is the scheduling delay? Is the system able to process data as fast as it receives it? How many records am I receiving through my receivers?
      
      While the StreamingListener interface introduced in 0.9 provided some of this information, it could only be accessed programmatically. A UI that shows information specific to streaming applications is necessary for easier debugging. This PR introduces such a UI. It shows various statistics related to the streaming application. Here is a screenshot of the UI running on my local machine.
      
      http://i.imgur.com/1ooDGhm.png
      
      This UI is integrated into the Spark UI running on port 4040.
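
The metrics named above can be derived from per-batch timestamps. The following is a minimal sketch, not code from this PR: `BatchTiming`, its field names, and `summarize` are hypothetical, and it assumes each batch records when it was submitted, when processing started, and when processing finished (all in milliseconds).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class BatchTiming:
    """Hypothetical record of one batch; all times are in milliseconds."""
    submitted_at: int
    started_at: int
    finished_at: int

    @property
    def scheduling_delay(self) -> int:
        # Time the batch waited before processing began.
        return self.started_at - self.submitted_at

    @property
    def processing_time(self) -> int:
        # Time spent actually processing the batch.
        return self.finished_at - self.started_at


def summarize(batches, batch_interval_ms):
    """Aggregate the per-batch numbers into the questions asked above."""
    avg_processing = mean(b.processing_time for b in batches)
    avg_delay = mean(b.scheduling_delay for b in batches)
    return {
        "avg_processing_ms": avg_processing,
        "avg_scheduling_delay_ms": avg_delay,
        # Assumption: the system keeps up if an average batch finishes
        # within one batch interval.
        "keeping_up": avg_processing <= batch_interval_ms,
    }


batches = [
    BatchTiming(submitted_at=0, started_at=10, finished_at=410),
    BatchTiming(submitted_at=1000, started_at=1020, finished_at=1520),
    BatchTiming(submitted_at=2000, started_at=2030, finished_at=2630),
]
stats = summarize(batches, batch_interval_ms=1000)
print(stats)
```

A growing scheduling delay across successive batches is usually the earlier warning sign, since it means batches are queuing up faster than they are processed.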
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #290 from tdas/streaming-web-ui and squashes the following commits:
      
      fc73ca5 [Tathagata Das] Merge pull request #9 from andrewor14/ui-refactor
      642dd88 [Andrew Or] Merge SparkUISuite.scala into UISuite.scala
      eb30517 [Andrew Or] Merge github.com:apache/spark into ui-refactor
      f4f4cbe [Tathagata Das] More minor fixes.
      34bb364 [Tathagata Das] Merge branch 'streaming-web-ui' of github.com:tdas/spark into streaming-web-ui
      252c566 [Tathagata Das] Merge pull request #8 from andrewor14/ui-refactor
      e038b4b [Tathagata Das] Addressed Patrick's comments.
      125a054 [Andrew Or] Disable serving static resources with gzip
      90feb8d [Andrew Or] Address Patrick's comments
      89dae36 [Tathagata Das] Merge branch 'streaming-web-ui' of github.com:tdas/spark into streaming-web-ui
      72fe256 [Tathagata Das] Merge pull request #6 from andrewor14/ui-refactor
      2fc09c8 [Tathagata Das] Added binary check exclusions
      aa396d4 [Andrew Or] Rename tabs and pages (No more IndexPage.scala)
      f8e1053 [Tathagata Das] Added Spark and Streaming UI unit tests.
      caa5e05 [Tathagata Das] Merge branch 'streaming-web-ui' of github.com:tdas/spark into streaming-web-ui
      585cd65 [Tathagata Das] Merge pull request #5 from andrewor14/ui-refactor
      914b8ff [Tathagata Das] Moved utils functions to UIUtils.
      548c98c [Andrew Or] Wide refactoring of WebUI, UITab, and UIPage (see commit message)
      6de06b0 [Tathagata Das] Merge remote-tracking branch 'apache/master' into streaming-web-ui
      ee6543f [Tathagata Das] Minor changes based on Andrew's comments.
      fa760fe [Tathagata Das] Fixed long line.
      1c0bcef [Tathagata Das] Refactored streaming UI into two files.
      1af239b [Tathagata Das] Changed streaming UI to attach itself as a tab with the Spark UI.
      827e81a [Tathagata Das] Merge branch 'streaming-web-ui' of github.com:tdas/spark into streaming-web-ui
      168fe86 [Tathagata Das] Merge pull request #2 from andrewor14/ui-refactor
      3e986f8 [Tathagata Das] Merge remote-tracking branch 'apache/master' into streaming-web-ui
      c78c92d [Andrew Or] Remove outdated comment
      8f7323b [Andrew Or] End of file new lines, indentation, and imports (minor)
      0d61ee8 [Andrew Or] Merge branch 'streaming-web-ui' of github.com:tdas/spark into ui-refactor
      9a48fa1 [Andrew Or] Allow adding tabs to SparkUI dynamically + add example
      61358e3 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into streaming-web-ui
      53be2c5 [Tathagata Das] Minor style updates.
      ed25dfc [Andrew Or] Generalize SparkUI header to display tabs dynamically
      a37ad4f [Andrew Or] Comments, imports and formatting (minor)
      cd000b0 [Andrew Or] Merge github.com:apache/spark into ui-refactor
      7d57444 [Andrew Or] Refactoring the UI interface to add flexibility
      aef4dd5 [Tathagata Das] Added Apache licenses.
      db27bad [Tathagata Das] Added last batch processing time to StreamingUI.
      4d86e98 [Tathagata Das] Added basic stats to the StreamingUI and refactored the UI to a Page to make it easier to transition to using SparkUI later.
      93f1c69 [Tathagata Das] Added network receiver information to the Streaming UI.
      56cc7fb [Tathagata Das] First cut implementation of Streaming UI.
  2. Apr 08, 2014
      [SPARK-1331] Added graceful shutdown to Spark Streaming · 83ac9a4b
      Tathagata Das authored
      The current version of StreamingContext.stop() directly kills all the data receivers (NetworkReceiver) without waiting for the data already received to be persisted and processed. This PR fixes that. Now, when StreamingContext.stop() is called, the following sequence of steps happens.
      1. The driver will send a stop signal to all the active receivers.
      2. Each receiver, when it gets a stop signal from the driver, first stops receiving more data, then waits for the thread that persists data blocks to BlockManager to finish persisting all received data, and finally quits.
      3. After all the receivers have stopped, the driver will wait for the Job Generator and Job Scheduler to finish processing all the received data.
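
The three steps above can be sketched with plain threads. This is a toy model under stated assumptions, not the Spark implementation: `ToyReceiver`, `stop_gracefully`, the in-memory `persisted` list (standing in for BlockManager), and the final count (standing in for the Job Generator and Job Scheduler finishing their work) are all hypothetical.

```python
import itertools
import queue
import threading
import time

_DONE = object()  # sentinel: the receiving thread will enqueue nothing more


class ToyReceiver:
    """Hypothetical receiver: one thread receives, another persists."""

    def __init__(self):
        self._pending = queue.Queue()
        self._stop_requested = threading.Event()
        self.received_count = 0
        self.persisted = []  # stands in for the block store
        self._receiver = threading.Thread(target=self._receive)
        self._persister = threading.Thread(target=self._persist)

    def start(self):
        self._receiver.start()
        self._persister.start()

    def _receive(self):
        for record in itertools.count():  # endless simulated stream
            if self._stop_requested.is_set():
                break  # step 2a: stop receiving more data
            self._pending.put(record)
            self.received_count += 1
            time.sleep(0.001)  # simulated arrival rate
        self._pending.put(_DONE)

    def _persist(self):
        while True:
            record = self._pending.get()
            if record is _DONE:
                return  # step 2b: everything received has been persisted
            self.persisted.append(record)

    def signal_stop(self):
        self._stop_requested.set()

    def wait_until_stopped(self):
        self._receiver.join()
        self._persister.join()  # step 2c: the receiver quits


def stop_gracefully(receivers):
    for r in receivers:  # step 1: send a stop signal to every receiver
        r.signal_stop()
    for r in receivers:  # step 2: wait for each receiver to drain and quit
        r.wait_until_stopped()
    # Step 3 (modeled): "process" everything that was received.
    return sum(len(r.persisted) for r in receivers)


receivers = [ToyReceiver() for _ in range(2)]
for r in receivers:
    r.start()
time.sleep(0.05)
processed = stop_gracefully(receivers)
print(processed)
```

The sentinel is what separates this from an abrupt kill: the persisting thread exits only after it has consumed every record enqueued before the stop signal took effect.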
      
      It also fixes the semantics of StreamingContext.start and stop. It will throw appropriate errors and warnings if stop() is called before start(), stop() is called twice, etc.
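
One way to picture these semantics is as a small state machine. The sketch below is illustrative only: `ToyContext` and `State` are hypothetical names, and the choice of which misuse raises an error and which only warns is an assumption of this sketch, since the description above does not spell it out.

```python
import warnings
from enum import Enum


class State(Enum):
    INITIALIZED = "initialized"
    STARTED = "started"
    STOPPED = "stopped"


class ToyContext:
    """Hypothetical context that tracks its lifecycle state."""

    def __init__(self):
        self.state = State.INITIALIZED

    def start(self):
        # Assumption: starting twice, or restarting after stop, is an error.
        if self.state is State.STARTED:
            raise RuntimeError("context has already been started")
        if self.state is State.STOPPED:
            raise RuntimeError("context has already been stopped")
        self.state = State.STARTED

    def stop(self):
        # Assumption: a premature or repeated stop is harmless, so it only warns.
        if self.state is State.INITIALIZED:
            warnings.warn("context has not been started yet")
            return
        if self.state is State.STOPPED:
            warnings.warn("context has already been stopped")
            return
        self.state = State.STOPPED


ctx = ToyContext()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ctx.stop()   # before start(): warns, state unchanged
    ctx.start()
    ctx.stop()   # normal stop
    ctx.stop()   # second stop: warns
print(len(caught), ctx.state)
```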
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #247 from tdas/graceful-shutdown and squashes the following commits:
      
      61c0016 [Tathagata Das] Updated MIMA binary check excludes.
      ae1d39b [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into graceful-shutdown
      6b59cfc [Tathagata Das] Minor changes based on Andrew's comment on PR.
      d0b8d65 [Tathagata Das] Reduced time taken by graceful shutdown unit test.
      f55bc67 [Tathagata Das] Fix scalastyle
      c69b3a7 [Tathagata Das] Updates based on Patrick's comments.
      c43b8ae [Tathagata Das] Added graceful shutdown to Spark Streaming.
  3. Mar 24, 2014
      SPARK-1094 Support MiMa for reporting binary compatibility across versions. · dc126f21
      Patrick Wendell authored
      The goal here is to automatically check Spark commits to determine whether they break binary compatibility.

      This adds some changes on top of the initial work by @scrapcodes in #20:
      
      1. Special case for inner classes of package-private objects.
      2. Made tools classes accessible when running `spark-class`.
      3. Made some declared types in MLLib more general.
      4. Various other improvements to exclude-generation script.
      5. In-code documentation.
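
The idea of a binary compatibility check with excludes can be illustrated as a set difference. This is a deliberately simplified model, not how MiMa works internally (MiMa analyzes compiled class files): `find_binary_breaks` and all signature strings below are hypothetical.

```python
def find_binary_breaks(old_api, new_api, excludes=frozenset()):
    """Return signatures in old_api that are missing from new_api,
    ignoring anything listed in excludes."""
    return sorted((set(old_api) - set(new_api)) - set(excludes))


# Hypothetical signatures from a previous release.
old_api = {
    "Foo.bar(Int): String",
    "Foo.baz(): Unit",
    "Internal.helper(): Unit",
}
# Hypothetical signatures after a commit: baz changed its parameter list
# and the internal helper was removed.
new_api = {
    "Foo.bar(Int): String",
    "Foo.baz(Long): Unit",
}
# Members that are not part of the public contract are excluded.
excludes = {"Internal.helper(): Unit"}

breaks = find_binary_breaks(old_api, new_api, excludes)
print(breaks)
```

In this model, an exclude list is what lets intentional or internal changes pass while genuine breaks in the public surface are still reported.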
      
      Author: Patrick Wendell <pwendell@gmail.com>
      Author: Prashant Sharma <prashant.s@imaginea.com>
      Author: Prashant Sharma <scrapcodes@gmail.com>
      
      Closes #207 from pwendell/mima and squashes the following commits:
      
      22ae267 [Patrick Wendell] New binary changes after upmerge
      6c2030d [Patrick Wendell] Merge remote-tracking branch 'apache/master' into mima
      3666cf1 [Patrick Wendell] Minor style change
      0e0f570 [Patrick Wendell] Small fix and removing directory listings
      647c547 [Patrick Wendell] Review feedback.
      c39f3b5 [Patrick Wendell] Some enhancements to binary checking.
      4c771e0 [Prashant Sharma] Added a tool to generate mima excludes and also adapted build to pick automatically.
      b551519 [Prashant Sharma] adding a new exclude after rebasing with master
      651844c [Prashant Sharma] Support MiMa for reporting binary compatibility across versions.