Although Spark Streaming ships with several commonly used receivers, you sometimes need to write your own. A custom receiver only has to extend Spark Streaming's abstract Receiver class, which requires implementing just two methods:
1. onStart(): start receiving data.
2. onStop(): stop receiving data.

In general, onStart() should not block; it should start a new thread that does the actual data reception. onStop() is responsible for making sure that the threads receiving data are stopped. It is called when the receiver is shut down, so it can also do any cleanup work. The thread that receives the data can call isStopped() to decide whether it should stop receiving.

Received data has to be handed to the Spark framework with the store() method. The Receiver abstract class provides four kinds of store() methods, which can be used to store:

1. a single small data item
2. a block of data in array form (ArrayBuffer)
3. a block of data in ByteBuffer form
4. a block of data in Iterator form
All four store() methods simply pass the data on to the ReceiverSupervisor for storage. So to write a custom receiver, create a data-receiving thread in onStart() and call store() to hand the data to the Spark Streaming framework as it arrives.
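The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the receiver from this article: the class name CounterReceiver and its intervalMs parameter are hypothetical, and the "data" is just a generated counter. Receiver, StorageLevel, onStart(), onStop(), isStopped() and the store() overloads are the Spark Streaming API. The ByteBuffer variant of store() is left out of the sketch.

```scala
import scala.collection.mutable.ArrayBuffer

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Hypothetical receiver for illustration only: it emits generated strings
// instead of reading from a real source.
class CounterReceiver(intervalMs: Long)
  extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  // onStart() must return quickly, so the receive loop runs on its own thread.
  override def onStart(): Unit = {
    new Thread("counter-receiver") {
      override def run(): Unit = receive()
    }.start()
  }

  // Nothing to release here: the loop below polls isStopped() and exits by itself.
  override def onStop(): Unit = {}

  private def receive(): Unit = {
    var counter = 0L
    while (!isStopped()) {
      // Single item: Spark groups individual items into blocks itself.
      store("item-" + counter)
      // ArrayBuffer: the whole buffer is stored as one block.
      store(ArrayBuffer("block-" + counter + "-a", "block-" + counter + "-b"))
      // Iterator: the iterator's elements are stored as one block.
      store(Iterator("iter-" + counter))
      counter += 1
      Thread.sleep(intervalMs)
    }
  }
}
```

A receiver like this would be attached with ssc.receiverStream(new CounterReceiver(100)), in the same way as the message queue receiver in this article.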
The following code is a Spark Streaming receiver for a message queue based on the xmemcached protocol:
    import fqueue.FqueueReceiver
    import org.apache.spark.Logging
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.receiver.Receiver

    class FqueueStreamingReceiver(val address: String, val connectionPoolSize: Int, val timeOut: Int)
      extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) with Logging {

      private var receiver: Option[FqueueReceiver] = None

      // onStart() must not block, so the receive loop runs on its own thread.
      def onStart(): Unit = {
        new Thread("Socket receiver") {
          override def run(): Unit = { receive() }
        }.start()
      }

      // Called when the receiver is shut down: close the queue client.
      def onStop(): Unit = {
        receiver foreach { _.stop() }
      }

      private def receive(): Unit = {
        val fqueueReceiver = new FqueueReceiver(address, connectionPoolSize, timeOut)
        receiver = Some(fqueueReceiver)
        receiver foreach { _.connect() }

        try {
          var stop = false
          while (!isStopped() && !stop) {
            val data = fqueueReceiver.dequeue("track_bodao2015*")
            data match {
              case Some(str) => store(str)
              case None =>
                // The sleep interval is garbled in the source; 1000 ms is an assumed placeholder.
                Thread.sleep(1000)
                // stop = true
            }
          }
          receiver foreach { _.stop() }
        } catch {
          case e: Exception =>
            println("Get data from fqueue err! Please make sure the server is live")
            println(e.getMessage)
            e.printStackTrace()
            receiver foreach { _.stop() }
        }
      }
    }
Once you have written a custom receiver for Spark Streaming, you can use it in your application:
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    def main(args: Array[String]): Unit = {
      // sendData() feeds test messages into the queue; it is not shown in this article.
      new Thread("Fqueue sender") {
        override def run(): Unit = { sendData() }
      }.start()

      val config = new SparkConf().setAppName("Testfqueue").setMaster("local[2]")
      val ssc = new StreamingContext(config, Seconds(5))

      val lines = ssc.receiverStream(new FqueueStreamingReceiver("localhost:18740", 4, 4000))
      lines.print()

      ssc.start()
      ssc.awaitTermination()
    }
Links:
http://spark.apache.org/docs/latest/streaming-custom-receivers.html
Message queue based on xmemcached
Source code of the receiver implementation in this article
If you like it, give the GitHub project a star (⊙o⊙) ... Your stars are my motivation!