Spark Release Version Customization, Day 13: Driver Fault Tolerance


Contents of this issue

1. ReceivedBlockTracker fault tolerance

2. DStreamGraph and JobGenerator fault tolerance

In the stream-processing era, data that cannot be processed in real time quickly loses its value. Spark Streaming is therefore highly attractive and has strong development prospects; combined with Spark's ecosystem, a streaming job can easily call on other powerful components such as SQL and MLlib, which gives it a real edge.

At runtime, Spark Streaming is not so much a streaming framework on top of Spark Core as one of the most complex applications built on Spark Core. If you can master this most complex application, other complex applications become much easier, which is why this customization series takes Spark Streaming as its starting point.

On the data plane, ReceivedBlockTracker records the metadata of the entire Spark Streaming application.

On the scheduling plane, DStreamGraph and JobGenerator are the heart of Spark Streaming's scheduling: they record how far the current schedule has progressed and what business logic it carries.

Let's start from the ReceiverTracker angle.

If the WAL (write-ahead log) is enabled, addBlock writes the block metadata to the WAL and then adds it to the received-block queue:

def addBlock(receivedBlockInfo: ReceivedBlockInfo): Boolean = {
  try {
    val writeResult = writeToLog(BlockAdditionEvent(receivedBlockInfo))
    if (writeResult) {
      synchronized {
        getReceivedBlockQueue(receivedBlockInfo.streamId) += receivedBlockInfo
      }
      logDebug(s"Stream ${receivedBlockInfo.streamId} received " +
        s"block ${receivedBlockInfo.blockStoreResult.blockId}")
    } else {
      logDebug(s"Failed to acknowledge stream ${receivedBlockInfo.streamId} receiving " +
        s"block ${receivedBlockInfo.blockStoreResult.blockId} in the Write Ahead Log.")
    }
    writeResult
  } catch {
    case NonFatal(e) =>
      logError(s"Error adding block $receivedBlockInfo", e)
      false
  }
}
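The essence of addBlock can be captured in a few lines. Below is a minimal, self-contained sketch of this write-ahead-then-enqueue pattern (all names here are illustrative, not Spark's): metadata is appended to a durable log before the in-memory queue is mutated, so a restarted driver can rebuild the queue by replaying the log.

```scala
import scala.collection.mutable

object WalFirstSketch {
  case class BlockInfo(streamId: Int, blockId: String)

  // Stand-ins for the real WAL file and the in-memory tracker state.
  val log   = mutable.ArrayBuffer.empty[BlockInfo]
  val queue = mutable.Map.empty[Int, mutable.Queue[BlockInfo]]

  // A toy "durable" write; the real one serializes to a file on HDFS.
  def writeToLog(info: BlockInfo): Boolean = { log += info; true }

  def addBlock(info: BlockInfo): Boolean = {
    val ok = writeToLog(info)            // 1. durable write first
    if (ok) synchronized {               // 2. only then mutate memory
      queue.getOrElseUpdate(info.streamId, mutable.Queue.empty[BlockInfo]) += info
    }
    ok
  }

  // Recovery: replay the log to rebuild per-stream block lists from scratch.
  def recover(): Map[Int, Seq[BlockInfo]] =
    log.groupBy(_.streamId).map { case (id, infos) => id -> infos.toSeq }
}
```

The ordering is the whole point: if the driver dies between steps 1 and 2, the log still contains the record, and recovery reconstructs the queue.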

Here is the writeToLog method:

private def writeToLog(record: ReceivedBlockTrackerLogEvent): Boolean = {
  if (isWriteAheadLogEnabled) {
    logTrace(s"Writing record: $record")
    try {
      writeAheadLogOption.get.write(ByteBuffer.wrap(Utils.serialize(record)),
        clock.getTimeMillis())
      true
    } catch {
      case NonFatal(e) =>
        logWarning(s"Exception thrown while writing record: $record to the WriteAheadLog.", e)
        false
    }
  } else {
    true
  }
}
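Note how the record is serialized into a ByteBuffer before being handed to the WAL. A sketch of that round trip, assuming plain Java serialization (Spark's Utils.serialize works essentially this way; DemoRecord is a hypothetical record type for illustration):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.nio.ByteBuffer

object WalSerialization {
  // Illustrative record type; the real events are BlockAdditionEvent etc.
  case class DemoRecord(streamId: Int, blockId: String)

  // Object -> bytes -> ByteBuffer, as handed to the WAL's write().
  def serialize(record: AnyRef): ByteBuffer = {
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(record)
    oos.close()
    ByteBuffer.wrap(bos.toByteArray)
  }

  // The inverse, as used when replaying the WAL during driver recovery.
  def deserialize[T](buf: ByteBuffer): T = {
    val ois = new ObjectInputStream(new ByteArrayInputStream(buf.array()))
    ois.readObject().asInstanceOf[T]
  }
}
```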

Next, look at how blocks are allocated to a batch when the JobScheduler generates a job:

def allocateBlocksToBatch(batchTime: Time): Unit = synchronized {
  if (lastAllocatedBatchTime == null || batchTime > lastAllocatedBatchTime) {
    val streamIdToBlocks = streamIds.map { streamId =>
      (streamId, getReceivedBlockQueue(streamId).dequeueAll(x => true))
    }.toMap
    val allocatedBlocks = AllocatedBlocks(streamIdToBlocks)
    if (writeToLog(BatchAllocationEvent(batchTime, allocatedBlocks))) {
      timeToAllocatedBlocks.put(batchTime, allocatedBlocks)
      lastAllocatedBatchTime = batchTime
    } else {
      logInfo(s"Possibly processed batch $batchTime needs to be processed again in WAL recovery")
    }
  } else {
    logInfo(s"Possibly processed batch $batchTime needs to be processed again in WAL recovery")
  }
}
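Two details matter here: every unallocated block in each stream's queue is drained into the batch, and lastAllocatedBatchTime guards against allocating the same batch time twice (for example when a recovered driver replays a batch). A minimal model of that logic, with illustrative names and without the WAL write:

```scala
import scala.collection.mutable

object AllocationSketch {
  type Time = Long

  // Per-stream queues of received (not yet allocated) block ids.
  val queues = mutable.Map(
    0 -> mutable.Queue("b0", "b1"),
    1 -> mutable.Queue("b2"))

  val timeToAllocatedBlocks = mutable.Map.empty[Time, Map[Int, Seq[String]]]
  var lastAllocatedBatchTime: Time = -1L

  def allocateBlocksToBatch(batchTime: Time): Unit = synchronized {
    if (batchTime > lastAllocatedBatchTime) {
      // Drain everything received so far into this batch.
      val allocated = queues.map { case (id, q) => id -> q.dequeueAll(_ => true).toSeq }.toMap
      timeToAllocatedBlocks.put(batchTime, allocated) // in Spark, only after the WAL write succeeds
      lastAllocatedBatchTime = batchTime
    }
    // else: a repeated allocation for an old batch time is ignored (idempotent).
  }
}
```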

Although writeToLog is called on every allocation, the method itself (shown above) first checks whether the WAL is enabled; when it is not, it skips the write entirely and simply returns true.

Once these writes are in place, the data-plane side of ReceiverTracker fault tolerance is complete: block metadata can be replayed from the WAL after a driver restart.

Next, look at fault tolerance from the job-generation perspective, i.e. the scheduling plane.

Each time a job is generated on the fixed batchInterval, the JobGenerator calls doCheckpoint against the DStreamGraph:

private def doCheckpoint(time: Time, clearCheckpointDataLater: Boolean) {
  if (shouldCheckpoint && (time - graph.zeroTime).isMultipleOf(ssc.checkpointDuration)) {
    logInfo("Checkpointing graph for time " + time)
    ssc.graph.updateCheckpointData(time)
    checkpointWriter.write(new Checkpoint(ssc, time), clearCheckpointDataLater)
  }
}
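The guard only fires on batch times that are a whole multiple of the checkpoint interval past the graph's zero time. A minimal model of that arithmetic (Spark's Time and Duration classes wrap milliseconds; the constants below are illustrative):

```scala
object CheckpointTiming {
  val zeroTime           = 0L     // graph.zeroTime, in ms (illustrative)
  val checkpointDuration = 10000L // ssc.checkpointDuration; defaults to the batch interval

  // Models (time - graph.zeroTime).isMultipleOf(ssc.checkpointDuration)
  def shouldWriteCheckpoint(time: Long): Boolean =
    (time - zeroTime) % checkpointDuration == 0
}
```

So with a 2 s batch interval and a 10 s checkpoint interval, only every fifth batch actually writes a checkpoint.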

Inside doCheckpoint, updateCheckpointData on the DStreamGraph asks every output stream to refresh its checkpoint data for the given time, and each DStream recurses into its dependencies:

// DStreamGraph
def updateCheckpointData(time: Time) {
  logInfo("Updating checkpoint data for time " + time)
  this.synchronized {
    outputStreams.foreach(_.updateCheckpointData(time))
  }
  logInfo("Updated checkpoint data for time " + time)
}

// DStream
private[streaming] def updateCheckpointData(currentTime: Time) {
  logDebug("Updating checkpoint data for time " + currentTime)
  checkpointData.update(currentTime)
  dependencies.foreach(_.updateCheckpointData(currentTime))
  logDebug("Updated checkpoint data for time " + currentTime + ": " + checkpointData)
}
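The recursion over dependencies is what lets one call on the output streams cover the whole DStream lineage: each stream snapshots its own checkpoint data, then delegates to its parents. A toy model of that traversal (illustrative, not Spark's classes):

```scala
// Each node stands in for a DStream; dependencies point to upstream streams.
class Node(val name: String, val dependencies: Seq[Node]) {
  var checkpointed = false

  // Snapshot self, then recurse upstream - mirroring DStream.updateCheckpointData.
  def updateCheckpointData(): Unit = {
    checkpointed = true
    dependencies.foreach(_.updateCheckpointData())
  }
}
```

Calling updateCheckpointData on an output node marks the entire chain back to the source.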

In summary: ReceivedBlockTracker handles the data plane and achieves fault tolerance through the WAL, while DStreamGraph and JobGenerator handle the scheduling plane and achieve it through checkpointing.

Note: material from DT Big Data Dream Factory (Spark release version customization series).


