11. Spark Streaming source code interpretation: a thorough study of the architecture design and concrete implementation of the driver's ReceiverTracker

Source: Internet
Author: User

In the previous article, we analyzed in detail how a receiver receives data, and saw that while receiving data the receiver sends the metadata of that data to ReceiverTracker.



This article analyzes the architecture design and concrete implementation of ReceiverTracker in detail.

First, the main functions of ReceiverTracker

The main functions of ReceiverTracker are:
1. Start the receivers on the executors.
2. Accept registrations from receivers.
3. Manage the metadata of the data received by receivers, using ReceivedBlockTracker.
4. Accept the various messages sent by receivers and handle them accordingly.
5. Update the rate at which a receiver receives data (i.e., rate limiting).
6. Keep waiting on the running receivers and restart a receiver whenever it stops running. This is the receiver's fault-tolerance function.
7. Stop the receivers.
8. Report the error messages sent by receivers.

Second, the functions of ReceiverTracker in detail

2.1 Starting receivers and managing the metadata of the data they receive
First, ReceiverTracker contains an endpoint variable that refers to the ReceiverTrackerEndpoint communication body. The endpoint is used for message communication both with the receivers and with ReceiverTracker itself. This ReceiverTrackerEndpoint is initialized when ReceiverTracker is started.

When ReceiverTracker starts the receivers, it sends a StartAllReceivers(receivers) message to ReceiverTrackerEndpoint through the endpoint variable.

When a receiver starts, it registers with ReceiverTracker to tell ReceiverTracker that it has started successfully. The trackerEndpoint used for this in the code is a reference to the ReceiverTrackerEndpoint communication body in ReceiverTracker.

The receiver continuously packages the received data into blocks and pushes the blocks to BlockManager for management. Once a block has been pushed to BlockManager, ReceiverSupervisor sends the block's metadata to ReceiverTracker's endpoint as an AddBlock(blockInfo) message. When ReceiverTracker receives the AddBlock(blockInfo) message, a thread is started to process it.
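The ordering of this hand-off (store the block first, report its metadata second) can be illustrated with a small Python model. This is only a sketch under stated assumptions, not Spark's code: `ToyBlockStore`, `ToyTrackerEndpoint`, `ToySupervisor`, the `BlockInfo` fields and the block ID format are all hypothetical stand-ins for BlockManager, ReceiverTracker's endpoint, ReceiverSupervisor and the block metadata.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class BlockInfo:
    """Hypothetical stand-in for the block metadata sent to the tracker."""
    stream_id: int
    block_id: str
    num_records: int


class ToyBlockStore:
    """Hypothetical in-memory stand-in for BlockManager."""

    def __init__(self) -> None:
        self.blocks: Dict[str, List[bytes]] = {}

    def put(self, block_id: str, records: List[bytes]) -> None:
        self.blocks[block_id] = list(records)


class ToyTrackerEndpoint:
    """Hypothetical stand-in for the tracker's endpoint; it only records messages."""

    def __init__(self) -> None:
        self.inbox: List[Tuple[str, BlockInfo]] = []

    def send(self, message: Tuple[str, BlockInfo]) -> None:
        self.inbox.append(message)


class ToySupervisor:
    """Hypothetical stand-in for ReceiverSupervisor."""

    def __init__(self, stream_id: int, store: ToyBlockStore,
                 tracker: ToyTrackerEndpoint) -> None:
        self.stream_id = stream_id
        self.store = store
        self.tracker = tracker
        self._next_index = 0

    def push_block(self, records: List[bytes]) -> BlockInfo:
        # The ID format is an assumption made for this sketch.
        block_id = f"input-{self.stream_id}-{self._next_index}"
        self._next_index += 1
        # Step 1: hand the data itself to the block store.
        self.store.put(block_id, records)
        # Step 2: only after the data is stored, report its metadata.
        info = BlockInfo(self.stream_id, block_id, len(records))
        self.tracker.send(("AddBlock", info))
        return info
```

In this model the tracker never learns about a block that has not yet been stored, which is the property the ordering in the text provides.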
 
When ReceiverTracker receives the AddBlock(blockInfo) message, it calls the addBlock(receivedBlockInfo) method to process it. This method in turn calls the addBlock method of receivedBlockTracker, a ReceivedBlockTracker object that is created when ReceiverTracker is instantiated.

ReceivedBlockTracker's addBlock method adds the block's metadata to a queue. The queue is stored in streamIdToUnallocatedBlockQueues, a HashMap whose key is the stream ID and whose value is the queue of blocks for that stream ID.
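The data structure described here, a map from stream ID to a queue of not-yet-allocated block metadata, can be modeled in a few lines of Python. This is a minimal sketch, not the Spark implementation: `ToyBlockTracker`, `BlockInfo` and their fields are hypothetical names, and the sketch assumes that adding a block always succeeds.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict


@dataclass(frozen=True)
class BlockInfo:
    """Hypothetical stand-in for the metadata of one received block."""
    stream_id: int
    block_id: str


class ToyBlockTracker:
    """Hypothetical model of the bookkeeping done when a block is added."""

    def __init__(self) -> None:
        # Plays the role of streamIdToUnallocatedBlockQueues:
        # key = stream ID, value = queue of blocks received for that stream.
        self.unallocated: Dict[int, Deque[BlockInfo]] = defaultdict(deque)

    def add_block(self, info: BlockInfo) -> bool:
        # Append to the queue of the block's own stream, creating the
        # queue on first use. This sketch always reports success.
        self.unallocated[info.stream_id].append(info)
        return True
```

Keeping one queue per stream preserves the arrival order of blocks within each stream while letting blocks from different streams accumulate independently.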
2.2 Assigning blocks to batches

When a Spark Streaming application dynamically generates jobs, JobGenerator calls the generateJobs method, which assigns the received blocks to the batch. It does so by calling the allocateBlocksToBatch method of the receiverTracker held by JobScheduler, where receiverTracker is a ReceiverTracker object. That method ultimately calls ReceivedBlockTracker's allocateBlocksToBatch method.

For each stream ID, this method first takes the queue of received blocks out of streamIdToUnallocatedBlockQueues and wraps the stream IDs and their block queues in an AllocatedBlocks object. Finally, it adds the AllocatedBlocks object to timeToAllocatedBlocks under the batch time. timeToAllocatedBlocks is a HashMap.
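The allocation step can be sketched as follows. The Python model below is an illustration under stated assumptions rather than Spark's code: `ToyBlockTracker` and its method names are hypothetical, blocks are represented by plain strings, batch times by integers, and a plain dict stands in for the AllocatedBlocks object. It covers only the two steps described above (drain the per-stream queues, then file the result under the batch time).

```python
from collections import deque
from typing import Deque, Dict, Iterable, List


class ToyBlockTracker:
    """Hypothetical model of assigning received blocks to a batch."""

    def __init__(self, stream_ids: Iterable[int]) -> None:
        self.stream_ids: List[int] = list(stream_ids)
        # Plays the role of streamIdToUnallocatedBlockQueues.
        self.unallocated: Dict[int, Deque[str]] = {
            sid: deque() for sid in self.stream_ids
        }
        # Plays the role of timeToAllocatedBlocks:
        # key = batch time, value = {stream ID: blocks of that batch}.
        self.time_to_allocated: Dict[int, Dict[int, List[str]]] = {}

    def add_block(self, stream_id: int, block_id: str) -> None:
        self.unallocated[stream_id].append(block_id)

    def allocate_blocks_to_batch(self, batch_time: int) -> Dict[int, List[str]]:
        allocated: Dict[int, List[str]] = {}
        for sid in self.stream_ids:
            queue = self.unallocated[sid]
            # Take everything received so far for this stream ...
            allocated[sid] = list(queue)
            # ... and empty the queue so the next batch starts fresh.
            queue.clear()
        self.time_to_allocated[batch_time] = allocated
        return allocated
```

Because each queue is emptied when it is drained, a block belongs to exactly one batch in this model: whichever batch is allocated first after the block arrives.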
This completes the allocation of blocks to the batch.

2.3 Other messages processed by ReceiverTracker

The receive method of ReceiverTrackerEndpoint in ReceiverTracker defines the processing logic for the various messages:

(1) After receiving a StartAllReceivers(receivers) message, ReceiverTracker assigns executors to the receivers and starts each receiver on its executor.

(2) When ReceiverTracker detects that a receiver has exited, it sends a RestartReceiver(receiver) message to ReceiverTrackerEndpoint. When the message is received, an executor is reassigned to the receiver and the receiver is started again (if the original executor is still working properly, the receiver restarts on the original executor; otherwise another executor is scheduled).

(3) When a Spark Streaming job finishes, JobScheduler calls the handleJobCompletion method, which eventually calls the cleanupOldBlocksAndBatches method to send a CleanupOldBlocks message to the endpoint. When the message is received, it is forwarded to the receivers for block cleanup.

(4) After an UpdateReceiverRateLimit message is received, it is forwarded to the receiver. When the receiver receives the message, it calls BlockGenerator's updateRate method to update the rate at which blocks are generated.
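The four cases above amount to a dispatch on message type. The Python model below sketches that dispatch under clearly stated assumptions; it is not Spark's code. `ToyTrackerEndpoint` is hypothetical, messages are modeled as `(kind, payload)` tuples, receivers are identified by integer IDs, the initial placement is round-robin (an assumption; the text does not say how executors are chosen), and "forwarding" to a receiver is reduced to recording the value. The sketch also assumes at least one executor is alive.

```python
from typing import Any, Dict, List, Tuple


class ToyTrackerEndpoint:
    """Hypothetical model of the message dispatch on the tracker's endpoint."""

    def __init__(self, executors: List[str]) -> None:
        self.executors: List[str] = list(executors)   # executors assumed alive
        self.placement: Dict[int, str] = {}           # receiver ID -> executor
        self.rate_limits: Dict[int, int] = {}         # receiver ID -> new rate
        self.cleanup_requests: List[int] = []         # forwarded cleanup times

    def receive(self, message: Tuple[str, Any]) -> None:
        kind, payload = message
        if kind == "StartAllReceivers":
            # Round-robin placement is an assumption of this sketch.
            for i, receiver_id in enumerate(payload):
                self.placement[receiver_id] = self.executors[i % len(self.executors)]
        elif kind == "RestartReceiver":
            receiver_id = payload
            old = self.placement.get(receiver_id)
            # Stay on the original executor if it is still alive;
            # otherwise pick another one.
            if old not in self.executors:
                self.placement[receiver_id] = self.executors[0]
        elif kind == "CleanupOldBlocks":
            # Stand-in for forwarding the cleanup request to the receivers.
            self.cleanup_requests.append(payload)
        elif kind == "UpdateReceiverRateLimit":
            receiver_id, new_rate = payload
            # Stand-in for forwarding the new rate to the receiver.
            self.rate_limits[receiver_id] = new_rate
        else:
            raise ValueError(f"unknown message kind: {kind}")
```

Routing every request through a single receive method means the placement and rate state in this model is only ever changed in one place, which keeps the bookkeeping easy to follow.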


