Skip to content
Snippets Groups Projects
  • Tathagata Das's avatar
    04c37b6f
    [SPARK-1332] Improve Spark Streaming's Network Receiver and InputDStream API [WIP] · 04c37b6f
    Tathagata Das authored
    The current Network Receiver API makes it slightly complicated to right a new receiver as one needs to create an instance of BlockGenerator as shown in SocketReceiver
    https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala#L51
    
    Exposing the BlockGenerator interface has made it harder to improve the receiving process. The API of NetworkReceiver (which was not a very stable API anyways) needs to be change if we are to ensure future stability.
    
    Additionally, the functions like streamingContext.socketStream that create input streams, return DStream objects. That makes it hard to expose functionality (say, rate limits) unique to input dstreams. They should return InputDStream or NetworkInputDStream. This is still not yet implemented.
    
    This PR is blocked on the graceful shutdown PR #247
    
    Author: Tathagata Das <tathagata.das1565@gmail.com>
    
    Closes #300 from tdas/network-receiver-api and squashes the following commits:
    
    ea27b38 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into network-receiver-api
    3a4777c [Tathagata Das] Renamed NetworkInputDStream to ReceiverInputDStream, and ActorReceiver related stuff.
    838dd39 [Tathagata Das] Added more events to the StreamingListener to report errors and stopped receivers.
    a75c7a6 [Tathagata Das] Address some PR comments and fixed other issues.
    91bfa72 [Tathagata Das] Fixed bugs.
    8533094 [Tathagata Das] Scala style fixes.
    028bde6 [Tathagata Das] Further refactored receiver to allow restarting of a receiver.
    43f5290 [Tathagata Das] Made functions that create input streams return InputDStream and NetworkInputDStream, for both Scala and Java.
    2c94579 [Tathagata Das] Fixed graceful shutdown by removing interrupts on receiving thread.
    9e37a0b [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into network-receiver-api
    3223e95 [Tathagata Das] Refactored the code that runs the NetworkReceiver into further classes and traits to make them more testable.
    a36cc48 [Tathagata Das] Refactored the NetworkReceiver API for future stability.
    04c37b6f
    History
    [SPARK-1332] Improve Spark Streaming's Network Receiver and InputDStream API [WIP]
    Tathagata Das authored
    The current Network Receiver API makes it slightly complicated to right a new receiver as one needs to create an instance of BlockGenerator as shown in SocketReceiver
    https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala#L51
    
    Exposing the BlockGenerator interface has made it harder to improve the receiving process. The API of NetworkReceiver (which was not a very stable API anyways) needs to be change if we are to ensure future stability.
    
    Additionally, the functions like streamingContext.socketStream that create input streams, return DStream objects. That makes it hard to expose functionality (say, rate limits) unique to input dstreams. They should return InputDStream or NetworkInputDStream. This is still not yet implemented.
    
    This PR is blocked on the graceful shutdown PR #247
    
    Author: Tathagata Das <tathagata.das1565@gmail.com>
    
    Closes #300 from tdas/network-receiver-api and squashes the following commits:
    
    ea27b38 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into network-receiver-api
    3a4777c [Tathagata Das] Renamed NetworkInputDStream to ReceiverInputDStream, and ActorReceiver related stuff.
    838dd39 [Tathagata Das] Added more events to the StreamingListener to report errors and stopped receivers.
    a75c7a6 [Tathagata Das] Address some PR comments and fixed other issues.
    91bfa72 [Tathagata Das] Fixed bugs.
    8533094 [Tathagata Das] Scala style fixes.
    028bde6 [Tathagata Das] Further refactored receiver to allow restarting of a receiver.
    43f5290 [Tathagata Das] Made functions that create input streams return InputDStream and NetworkInputDStream, for both Scala and Java.
    2c94579 [Tathagata Das] Fixed graceful shutdown by removing interrupts on receiving thread.
    9e37a0b [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into network-receiver-api
    3223e95 [Tathagata Das] Refactored the code that runs the NetworkReceiver into further classes and traits to make them more testable.
    a36cc48 [Tathagata Das] Refactored the NetworkReceiver API for future stability.