Spark Streaming Source Code Interpretation: Thorough Research and Thinking on the Full Life Cycle of Receivers

Source: Internet
Author: User

Contents of this issue:

    • How receiver startup is designed
    • A thorough analysis of the receiver startup source code

  

When an application has multiple input sources, multiple receivers are started, and any of them may fail to start. As long as the cluster is alive, we want every receiver to start successfully. If receivers are launched as ordinary tasks, however, each task launch may itself fail.

One way to start an application's receivers is to represent each receiver as a separate RDD partition. When the job runs, each partition is executed by a different task, and each task, once started, actually starts one receiver.
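The partition-per-receiver idea can be illustrated with a toy model in plain Python. This is only a sketch and not Spark code: `ToyReceiver`, `make_partitions`, `run_task` and `run_job` are hypothetical names invented for the illustration, and the "job" runs its tasks sequentially in one process.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyReceiver:
    """Hypothetical stand-in for a streaming receiver."""
    receiver_id: int
    started: bool = False

    def start(self) -> None:
        self.started = True


def make_partitions(receivers: List[ToyReceiver]) -> List[List[ToyReceiver]]:
    # One partition per receiver, so the partition count equals the receiver count.
    return [[r] for r in receivers]


def run_task(partition: List[ToyReceiver]) -> int:
    # A task processes exactly one partition; "processing" here means
    # starting the single receiver stored in it.
    receiver = partition[0]
    receiver.start()
    return receiver.receiver_id


def run_job(receivers: List[ToyReceiver]) -> List[int]:
    # The job launches one task per partition.
    return [run_task(p) for p in make_partitions(receivers)]


receivers = [ToyReceiver(i) for i in range(3)]
started_ids = run_job(receivers)
```

After `run_job` returns, every receiver in the list has been started by its own task, which is the mapping the paragraph above describes.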

Pros: This is a simple and ingenious approach, because it just reuses an ordinary job on Spark Core.

Cons: It may fail. If a receiver fails while running, its task fails, which can cause the job to fail and, in turn, the application to fail.
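The drawback can be sketched in the same toy style. The example assumes all receivers share one job and that a single task failure aborts that job. It deliberately ignores task retries, which real Spark attempts a limited number of times before failing a job. All names are hypothetical.

```python
from typing import Callable, List


class TaskFailed(Exception):
    """Raised by a toy task to model a receiver that stops with an error."""


def run_shared_job(tasks: List[Callable[[], str]]) -> List[str]:
    # All receivers share one job: the first task failure aborts the job,
    # and the exception propagates to the caller (the "application").
    return [task() for task in tasks]


def healthy_receiver() -> str:
    return "running"


def broken_receiver() -> str:
    raise TaskFailed("receiver stopped unexpectedly")


try:
    run_shared_job([healthy_receiver, broken_receiver, healthy_receiver])
    job_succeeded = True
except TaskFailed:
    job_succeeded = False
```

Here `job_succeeded` ends up `False` even though two of the three receivers were healthy, which is why one failing receiver is a risk for the whole application under this design.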

  Source code of the data input process:

  

  

  

  Source code of the receiver startup process:

  

  

  

  

Receiver instances are obtained from the ReceiverInputDStreams. ReceiverInputDStream lives on the driver side and is a top-level abstraction in Spark Streaming. A Spark Streaming job ultimately runs as RDDs, and these objects represent all of the input streams, that is, the data sources.

At this stage the receivers exist only at the logical level. They are then distributed to the worker nodes, where they run physically on the set of workers.
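The two steps above (collecting receivers from the input streams on the driver, then placing them on workers) can be sketched as a toy model. This is not the Spark implementation: `ToyInputStream`, `collect_receivers` and `assign_round_robin` are hypothetical, and the round-robin placement is an assumption made only to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToyReceiver:
    """Hypothetical receiver; -1 means no id has been assigned yet."""
    receiver_id: int = -1


@dataclass
class ToyInputStream:
    """Hypothetical driver-side input stream that can create its receiver."""
    stream_id: int

    def get_receiver(self) -> ToyReceiver:
        return ToyReceiver()


def collect_receivers(streams: List[ToyInputStream]) -> List[ToyReceiver]:
    # Driver side: ask each input stream for its receiver and tag the
    # receiver with the id of the stream it belongs to.
    receivers = []
    for stream in streams:
        receiver = stream.get_receiver()
        receiver.receiver_id = stream.stream_id
        receivers.append(receiver)
    return receivers


def assign_round_robin(receivers: List[ToyReceiver], workers: List[str]) -> Dict[int, str]:
    # Logical to physical: map each receiver id to one worker, cycling
    # through the worker list. Round-robin is an assumption of this sketch.
    return {r.receiver_id: workers[i % len(workers)] for i, r in enumerate(receivers)}


streams = [ToyInputStream(i) for i in range(3)]
placement = assign_round_robin(collect_receivers(streams), ["worker-a", "worker-b"])
```

With three streams and two workers, `placement` maps receivers 0 and 2 to `worker-a` and receiver 1 to `worker-b`.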

  

  

  

  

  Loop to receive all data:

  

  Endpoint operation of the data source:

  

  

  

  Calling startReceiver:

  

  

  

  

  

Note:

      • Data from: Liaoliang (Spark release version customization)
      • Sina Weibo:http://www.weibo.com/ilovepains
