Spark Streaming data reception process

Source: Internet
Author: User

The Spark Streaming source analysis section described, from the source-code angle, the code call flow of streaming execution. Below is a brief further analysis of the data reception and conversion phase, in preparation for the analysis of backpressure.

The whole Spark Streaming process is divided into two phases: the data reception and conversion phase, and the job generation and execution phase. The two phases are linked by the blocks that the data reception and conversion phase generates. What follows is a code analysis of the receiver-based data reception and conversion part.

The data reception conversion process can be divided into the following key steps:

    1. Receiver receives an external data stream and sends it to BlockGenerator, which stores it in an ArrayBuffer. A permit must be acquired before each store. The permit rate is specified by "spark.streaming.receiver.maxRate"; from Spark 1.5 onward it can instead be calculated automatically by backpressure. The value represents the maximum rate at which data can be accepted: storing one record requires one permit, and the call blocks if no permit can be acquired.

    2. A timer is defined in BlockGenerator. At the configured interval it takes the data out of the ArrayBuffer, packs it into a block, stores the block in blocksForPushing (a blocking queue, ArrayBlockingQueue), and empties the ArrayBuffer.

    3. The blockPushingThread thread in BlockGenerator takes blocks from the blocking queue and passes each one to ReceiverSupervisor through the listener's onPushBlock callback.

    4. When ReceiverSupervisor receives the message, it processes the data carried in it, stores the data by calling BlockManager, and reports the result of the store to ReceiverTracker.

    5. After ReceiverTracker receives the message, it stores the information in the unallocated block queue (streamIdToUnallocatedBlockQueues), where it waits for JobGenerator to assign it to an RDD when the job is generated.
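
The five steps above can be sketched as a small single-threaded simulation. This is only an illustration of the data flow, not Spark's actual code: the Python class and method names are hypothetical stand-ins for the Scala components named in the list, the timer tick and the pushing thread are replaced by explicit method calls, permits are refilled on every tick as a simplification, and a record that finds no permit is rejected rather than blocked so the example stays deterministic.

```python
from collections import deque


class RateLimiter:
    """Hypothetical stand-in for the permit check in step 1."""

    def __init__(self, max_rate):
        self.max_rate = max_rate   # plays the role of spark.streaming.receiver.maxRate
        self.permits = max_rate

    def acquire(self):
        if self.permits == 0:
            return False           # the real receiver would block here instead
        self.permits -= 1
        return True

    def refill(self):
        self.permits = self.max_rate


class BlockGenerator:
    def __init__(self, limiter, listener, queue_capacity=10):
        self.limiter = limiter
        self.listener = listener            # plays the role of the onPushBlock listener
        self.current_buffer = []            # the ArrayBuffer of step 1
        self.blocks_for_pushing = deque()   # the bounded block queue of step 2
        self.queue_capacity = queue_capacity
        self.next_block_id = 0

    def add_data(self, record):
        """Step 1: store one record, consuming one permit."""
        if not self.limiter.acquire():
            return False
        self.current_buffer.append(record)
        return True

    def on_timer_tick(self):
        """Step 2: pack the buffered records into a block and empty the buffer."""
        self.limiter.refill()               # simplification: one refill per tick
        if not self.current_buffer:
            return
        if len(self.blocks_for_pushing) >= self.queue_capacity:
            raise RuntimeError("block queue is full")  # the real queue would block
        self.blocks_for_pushing.append((self.next_block_id, self.current_buffer))
        self.next_block_id += 1
        self.current_buffer = []

    def push_pending_blocks(self):
        """Step 3: drain the queue and hand each block to the listener."""
        while self.blocks_for_pushing:
            block_id, records = self.blocks_for_pushing.popleft()
            self.listener(block_id, records)


class ReceiverTracker:
    def __init__(self):
        self.unallocated_blocks = {}        # stream id -> block infos awaiting a job

    def add_block(self, stream_id, block_info):
        self.unallocated_blocks.setdefault(stream_id, []).append(block_info)

    def allocate_blocks(self, stream_id):
        """Step 5: hand over all unallocated blocks when a job is generated."""
        return self.unallocated_blocks.pop(stream_id, [])


class ReceiverSupervisor:
    def __init__(self, stream_id, tracker):
        self.stream_id = stream_id
        self.tracker = tracker
        self.block_store = {}               # stand-in for BlockManager

    def on_push_block(self, block_id, records):
        """Step 4: store the block, then report what was stored."""
        self.block_store[block_id] = records
        info = {"block_id": block_id, "num_records": len(records)}
        self.tracker.add_block(self.stream_id, info)


tracker = ReceiverTracker()
supervisor = ReceiverSupervisor(stream_id=0, tracker=tracker)
generator = BlockGenerator(RateLimiter(max_rate=3), supervisor.on_push_block)

# Four records arrive, but only three permits are available before the tick.
accepted = [generator.add_data(r) for r in ["a", "b", "c", "d"]]
generator.on_timer_tick()
generator.push_pending_blocks()
allocated = tracker.allocate_blocks(0)
```

With a limit of three permits, the fourth record is not accepted before the tick, which is the point at which the real receiver would block. The three accepted records travel as a single block through the supervisor's store and into the tracker, where they stay until a job claims them.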
