Spark Streaming dynamic resource allocation and dynamic consumption-rate control: an analysis


Contents of this issue:

    • Spark Streaming dynamic resource allocation
    • Spark Streaming dynamic consumption-rate control

  Why dynamic adjustment is needed:

Spark uses coarse-grained resource allocation: by default, resources are allocated up front, before computation begins. The benefit of coarse granularity is that, since the resources are already assigned to you, tasks can use them immediately once computation starts.

The drawback of coarse granularity, from a Spark Streaming perspective, is that workloads have peaks and troughs, and the resources needed at peak and off-peak times differ. If resources are allocated for the peak, they sit wasted during the troughs.

In addition, because a Spark Streaming program runs continuously, its own ongoing resource consumption and management must also be taken into account.

One, Spark Streaming dynamic resource allocation:

Dynamic resource allocation in the source code:

The relevant behavior is enabled through settings in SparkConf.
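The original post showed the settings as screenshots, which were lost. As an illustration, the standard dynamic-allocation properties look like the following (the values are examples, not recommendations):

```properties
# Core dynamic-allocation switches
spark.dynamicAllocation.enabled              true
spark.shuffle.service.enabled                true

# Bounds on the executor pool
spark.dynamicAllocation.minExecutors         2
spark.dynamicAllocation.maxExecutors         20

# Remove an executor after it has been idle this long
spark.dynamicAllocation.executorIdleTimeout  60s
```

The same keys can equally be set programmatically via `SparkConf.set(...)`.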

    

  

  A timer periodically scans the executors. Because running tasks are scheduled across different executors, the scheduler may need to dynamically add or remove executors. For example, suppose a 60-second interval is chosen: if an executor has run no task during that interval, it will be removed.

How is an executor removed? Each executor running in the current application is tracked by a data structure in the driver that keeps a reference to it. Every time tasks are scheduled, the driver iterates over this executor list and queries the list of available resources. On each tick, the timer loop checks whether the conditions for adding or removing executors are met, and if they are, it triggers the addition or removal.
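The scan-and-remove logic described above can be sketched as a toy simulation (the function name and parameters here are mine, not Spark's API):

```python
# Toy sketch of the driver-side scan loop: on every tick, executors that have
# been idle longer than the timeout are flagged for removal, without shrinking
# the pool below a minimum size.

IDLE_TIMEOUT_S = 60  # matches the 60-second interval used in the text

def executors_to_remove(last_task_time, now, idle_timeout=IDLE_TIMEOUT_S, min_executors=1):
    """Return ids of executors that have run no task for `idle_timeout` seconds.

    `last_task_time` maps executor id -> timestamp (seconds) of its last task.
    """
    idle = [e for e, t in sorted(last_task_time.items()) if now - t >= idle_timeout]
    # Never remove so many executors that fewer than `min_executors` remain.
    removable = len(last_task_time) - min_executors
    return idle[:max(removable, 0)]
```

In real Spark the same bookkeeping lives in the driver, keyed by the executor references mentioned above; the floor corresponds to `spark.dynamicAllocation.minExecutors`.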

    

    From the Spark Streaming point of view, the dynamic resource adjustment in question is the dynamic adjustment of executor resources. What is the biggest challenge?

Spark Streaming runs batch by batch, according to the batch duration. One batch may need a lot of resources while the next does not, and by the time the resource adjustment for the current batch has completed, that batch's window may already have passed.

  

Two, dynamically controlling the consumption rate:

Spark Streaming's elasticity mechanism watches how incoming stream data is handled, that is, the relationship between the arrival rate and the rate at which data can be processed in time. If the system cannot keep up, it dynamically throttles the rate at which data flows in.

Spark Streaming has a built-in rate controller. The rate can also be adjusted manually, but manual control requires a feel for how fast Spark Streaming is actually processing. Based on the batch duration and the rate of incoming stream data, you can adjust the batch duration so that each batch takes in more or less data.
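In Spark itself, this automatic throttling (backpressure) is switched on with `spark.streaming.backpressure.enabled`, and internally a PID-based rate estimator recomputes the ingestion rate after each batch completes. As a minimal sketch, here is just the proportional term of such a controller (the function name and parameters are illustrative, not Spark's API):

```python
# Minimal sketch of the proportional term of a PID-style backpressure
# controller: compare the rate we ingested at with the rate the last batch
# actually processed at, and move toward the latter.

def next_rate(current_rate, batch_elements, processing_delay_s, k_p=1.0, min_rate=100.0):
    """Compute the ingestion rate (records/sec) for the next batch.

    A positive `error` means we ingested faster than we processed,
    so the controller slows ingestion down.
    """
    processing_rate = batch_elements / processing_delay_s
    error = current_rate - processing_rate
    new_rate = current_rate - k_p * error
    return max(new_rate, min_rate)  # never throttle below a floor rate
```

With `k_p = 1.0` this collapses to "next rate = last observed processing rate"; Spark's real estimator adds integral and derivative terms so the rate does not oscillate batch to batch.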

Remark:
      • Source: Liaoliang (Spark release version customization course)
      • Sina Weibo: http://www.weibo.com/ilovepains

