Spark Streaming Release note 17: Dynamic allocation of resources and dynamic control of consumption rates


This article covers two advanced features:

1. Dynamic resource allocation in Spark Streaming

2. Dynamic control of the consumption rate in Spark Streaming

Both features rest on a body of theory: there is a principled mechanism behind dynamic rate control, and likewise one behind dynamic resource allocation.

Let's start with the theory and then look at the implementation.

Why dynamic resource allocation and dynamic rate control?

By default, Spark allocates resources first and then computes: allocation is coarse-grained, and resources are reserved up front, before any computational task runs.

The drawback: from Spark Streaming's point of view, workloads have peaks and troughs. If resources are allocated for the peak, a great deal of capacity sits idle during the troughs.

Spark Streaming drew on Storm's design ideas and built on top of Spark Core, and the Spark Streaming 2.0.x kernel has changed considerably; the greatest benefit of this architecture is how naturally it cooperates with its sibling frameworks in the Spark stack. The key observation is that when Spark Streaming's resources are pre-allocated for the peak load, resources are wasted whenever the load drops, and the waste is especially severe at the troughs.

Spark Streaming itself is built on Spark Core, whose core is the SparkContext object. Starting around line 556 of the SparkContext class, there is support for dynamic resource allocation; the source code is as follows:

// Optionally scale number of executors dynamically based on workload. Exposed for testing.
val dynamicAllocationEnabled = Utils.isDynamicAllocationEnabled(_conf)
if (!dynamicAllocationEnabled && _conf.getBoolean("spark.dynamicAllocation.enabled", false)) {
  logWarning("Dynamic Allocation and num executors both set, thus dynamic allocation disabled.")
}

_executorAllocationManager =
  if (dynamicAllocationEnabled) {
    Some(new ExecutorAllocationManager(this, listenerBus, _conf))
  } else {
    None
  }
_executorAllocationManager.foreach(_.start())

_cleaner =
  if (_conf.getBoolean("spark.cleaner.referenceTracking", true)) {
    Some(new ContextCleaner(this))
  } else {
    None
  }
_cleaner.foreach(_.start())

The configuration parameter spark.dynamicAllocation.enabled determines whether dynamic allocation of executors is turned on:
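As a sketch of how these settings fit together in application code (the values below are hypothetical; note that dynamic allocation also requires the external shuffle service, and spark.executor.instances must not be set or dynamic allocation is disabled):

```scala
import org.apache.spark.SparkConf

// Sketch: enabling dynamic executor allocation via configuration.
val conf = new SparkConf()
  .setAppName("DynamicAllocationDemo")                       // hypothetical app name
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "2")
  .set("spark.dynamicAllocation.maxExecutors", "20")
  .set("spark.dynamicAllocation.schedulerBacklogTimeout", "1s")
  .set("spark.dynamicAllocation.executorIdleTimeout", "60s")
  .set("spark.shuffle.service.enabled", "true")              // required for dynamic allocation
```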

/**
 * Return whether dynamic allocation is enabled in the given conf.
 * Dynamic allocation and explicitly setting the number of executors are inherently
 * incompatible. In environments where dynamic allocation is turned on by default,
 * the latter should override the former (SPARK-9092).
 */
def isDynamicAllocationEnabled(conf: SparkConf): Boolean = {
  conf.getBoolean("spark.dynamicAllocation.enabled", false) &&
    conf.getInt("spark.executor.instances", 0) == 0
}
From this code we can see that dynamic allocation is decided by the spark.dynamicAllocation.enabled parameter, combined with spark.executor.instances not being set explicitly. When dynamic allocation is enabled, the ExecutorAllocationManager class does the work:
/**
 * An agent that dynamically allocates and removes executors based on the workload.
 *
 * The ExecutorAllocationManager maintains a moving target number of executors which is periodically
 * synced to the cluster manager. The target starts at a configured initial value and changes with
 * the number of pending and running tasks.
 *
 * Decreasing the target number of executors happens when the current target is more than needed to
 * handle the current load. The target number of executors is always truncated to the number of
 * executors that could run all current running and pending tasks at once.
 *
 * Increasing the target number of executors happens in response to backlogged tasks waiting to be
 * scheduled. If the scheduler queue is not drained in N seconds, then new executors are added. If
 * the queue persists for another M seconds, then more executors are added and so on. The number
 * added in each round increases exponentially from the previous round until an upper bound has been
 * reached. The upper bound is based both on a configured property and on the current number of
 * running and pending tasks, as described above.
 *
 * The rationale for the exponential increase is twofold: (1) Executors should be added slowly
 * in the beginning in case the number of extra executors needed turns out to be small. Otherwise,
 * we may add more executors than we need just to remove them later. (2) Executors should be added
 * quickly over time in case the maximum number of executors is very high. Otherwise, it will take
 * a long time to ramp up under heavy workloads.
 *
 * The remove policy is simpler: if an executor has been idle for K seconds, meaning it has not
 * been scheduled to run any tasks, then it is removed.
 *
 * There is no retry logic in either case because we make the assumption that the cluster manager
 * will eventually fulfill all requests it receives asynchronously.
 *
 * The relevant Spark properties include the following:
 *
 *   spark.dynamicAllocation.enabled - Whether this feature is enabled
 *   spark.dynamicAllocation.minExecutors - Lower bound on the number of executors
 *   spark.dynamicAllocation.maxExecutors - Upper bound on the number of executors
 *   spark.dynamicAllocation.initialExecutors - Number of executors to start with
 *
 *   spark.dynamicAllocation.schedulerBacklogTimeout (M) -
 *     If there are backlogged tasks for this duration, add new executors
 *
 *   spark.dynamicAllocation.sustainedSchedulerBacklogTimeout (N) -
 *     If the backlog is sustained for this duration, add more executors
 *     This is used only after the initial backlog timeout is exceeded
 *
 *   spark.dynamicAllocation.executorIdleTimeout (K) -
 *     If an executor has been idle for this duration, remove it
 */
private[spark] class ExecutorAllocationManager(
    client: ExecutorAllocationClient,
    listenerBus: LiveListenerBus,
    conf: SparkConf)
  extends Logging {

  allocationManager =>

  import ExecutorAllocationManager._

  // Lower and upper bounds on the number of executors.
  private val minNumExecutors = conf.getInt("spark.dynamicAllocation.minExecutors", 0)
  private val maxNumExecutors = conf.getInt("spark.dynamicAllocation.maxExecutors",
    Integer.MAX_VALUE)
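The exponential ramp-up described in the class comment can be sketched in plain Scala. This is a toy model, not Spark's actual implementation: it only captures the idea that the number of executors requested per round doubles under a sustained backlog and resets once the upper bound is reached.

```scala
// Toy model of the exponential add policy: each round with a sustained
// backlog doubles the number of executors requested, capped at maxExecutors.
// Returns the target executor count after each round.
def rampUp(rounds: Int, maxExecutors: Int): Seq[Int] = {
  var target = 0 // current target number of executors
  var toAdd = 1  // executors to add this round; doubles every round
  (1 to rounds).map { _ =>
    target = math.min(target + toAdd, maxExecutors)
    toAdd = if (target == maxExecutors) 1 else toAdd * 2
    target
  }
}
```

For example, with a high ceiling the target grows as 1, 3, 7, 15, ..., which is why the scaladoc says executors are added slowly at first but quickly under a persistent backlog.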

ExecutorAllocationManager dynamically controls the number of executors. It scans the executor pool and the running stages, and increases or decreases the executor count accordingly. An example of the reduction case: an executor that has run no task for 60 seconds is removed. The driver maintains references to all executors started for the current application.

Because there is a clock, this scan runs in a continuous cycle, and each cycle may add or delete executors.

"Dynamic" here means clock-driven: at every fixed interval the manager re-examines the situation. If an executor needs to be removed, a kill message is sent; if one needs to be added, a launch-executor message is sent to a worker.
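The idle-timeout bookkeeping can be sketched with a plain map from executor ID to expiry time, loosely mirroring the removeTimes field of ExecutorAllocationManager (the function name and shape here are hypothetical, for illustration only):

```scala
import scala.collection.mutable

// Sketch of idle-timeout removal: executors whose expiry time has passed
// are dropped from the map and returned as the ones to kill.
def expireIdle(removeTimes: mutable.Map[String, Long], now: Long): Seq[String] = {
  val expired = removeTimes.collect {
    case (id, expireTime) if now >= expireTime => id
  }.toSeq
  expired.foreach(removeTimes -= _) // these executors get a kill message
  expired
}
```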

Let's take a look at the Master's schedule method:

/**
 * Schedule the currently available resources among waiting apps. This method will be called
 * every time a new app joins or resource availability changes.
 */
private def schedule(): Unit = {
  if (state != RecoveryState.ALIVE) { return }
  // Drivers take strict precedence over executors
  val shuffledWorkers = Random.shuffle(workers) // Randomization helps balance drivers
  for (worker <- shuffledWorkers if worker.state == WorkerState.ALIVE) {
    for (driver <- waitingDrivers) {
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        launchDriver(worker, driver)
        waitingDrivers -= driver
      }
    }
  }
  startExecutorsOnWorkers()
}
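A stripped-down model of the driver-placement loop above, with hypothetical case classes standing in for the Master's worker and driver state, shows the first-fit logic: each waiting driver lands on the first worker with enough free memory and cores.

```scala
// Simplified stand-ins for the Master's bookkeeping (not Spark's classes).
case class Worker(id: String, var memoryFree: Int, var coresFree: Int)
case class DriverDesc(mem: Int, cores: Int)

// Greedy placement mirroring Master.schedule's nested loops: place each
// waiting driver on the first worker that can satisfy its resource demand.
def placeDrivers(workers: Seq[Worker], waitingDrivers: Seq[DriverDesc]): Map[DriverDesc, String] = {
  var placements = Map.empty[DriverDesc, String]
  for (driver <- waitingDrivers) {
    workers.find(w => w.memoryFree >= driver.mem && w.coresFree >= driver.cores).foreach { w =>
      w.memoryFree -= driver.mem   // reserve the resources on that worker
      w.coresFree -= driver.cores
      placements += (driver -> w.id)
    }
  }
  placements
}
```

(The real method also shuffles the worker list to spread drivers evenly, then calls startExecutorsOnWorkers.)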

Implementing dynamic resource scheduling requires a clock to assist; by default, resource allocation happens in the Master's schedule method above.

When dynamic allocation is enabled by configuration, the schedule method of the ExecutorAllocationManager class is also invoked, periodically:

/**
 * This is called at a fixed interval to regulate the number of pending executor requests
 * and number of executors running.
 *
 * First, adjust our requested executors based on the add time and our current needs.
 * Then, if the remove time for an existing executor has expired, kill the executor.
 *
 * This is factored out into its own method for testing.
 */
private def schedule(): Unit = synchronized {
  val now = clock.getTimeMillis

  updateAndSyncNumExecutorsTarget(now)

  removeTimes.retain { case (executorId, expireTime) =>
    val expired = now >= expireTime
    if (expired) {
      initializing = false
      removeExecutor(executorId)
    }
    !expired
  }
}

This schedule method is triggered periodically by an internal timer and executes at a fixed interval, tracking executor IDs as executors are registered and removed.

The start method registers the scheduler callbacks and kicks off that periodic task:

/**
 * Register for scheduler callbacks to decide when to add and remove executors, and start
 * the scheduling task.
 */
def start(): Unit = {
  listenerBus.addListener(listener)

  val scheduleTask = new Runnable() {
    override def run(): Unit = {
      try {
        schedule()
      } catch {
        case ct: ControlThrowable =>
          throw ct
        case t: Throwable =>
          logWarning(s"Uncaught exception in thread ${Thread.currentThread().getName}", t)
      }
    }
  }
  executor.scheduleAtFixedRate(scheduleTask, 0, intervalMillis, TimeUnit.MILLISECONDS)
}
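The fixed-interval trigger can be reproduced with a plain ScheduledExecutorService from the JDK. This is a minimal sketch of the same pattern, not Spark's wrapper; the function name is made up for illustration.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}

// Run `body` every intervalMillis on a single-threaded timer, the way
// ExecutorAllocationManager.start schedules its schedule() task.
// Exceptions are swallowed so one failed tick does not kill the timer.
def startPeriodic(intervalMillis: Long)(body: => Unit): ScheduledExecutorService = {
  val executor = Executors.newSingleThreadScheduledExecutor()
  val task = new Runnable {
    override def run(): Unit =
      try body catch { case _: Throwable => () } // real code would log here
  }
  executor.scheduleAtFixedRate(task, 0L, intervalMillis, TimeUnit.MILLISECONDS)
  executor
}
```

The caller keeps the returned executor so it can shut the timer down on stop, just as Spark stops its allocation manager when the context shuts down.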

From the adjustment-cycle angle, tuning happens per batchDuration (say, 10 seconds). Deciding whether to increase or decrease executors requires assessing the scale of the incoming data and the available resources, including how idle the existing resources are. Within each batchDuration, the stream's data is split into shards; processing each shard needs a sufficient number of cores, and if there are not enough, more executors must be requested.

Spark Streaming also provides an elastic mechanism on the ingestion side: it compares the speed at which data flows in with the speed at which it is processed. If processing cannot keep up, it dynamically throttles the inflow rate. The control parameter here is spark.streaming.backpressure.enabled.

Spark Streaming controls the inflow speed at run time through the RateController. If batches are delayed, the rate is lowered so data flows in more slowly, keeping the proportion between data inflow and processing time in balance.
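Spark's actual backpressure uses a PID-based rate estimator; the naive proportional sketch below (a hypothetical function, not Spark's API) conveys the core idea: when a batch takes longer than the batch interval, fall back to the rate we actually sustained.

```scala
// Naive rate estimator: if the last batch overran the batch interval,
// throttle down to the observed processing rate; otherwise allow the
// rate to grow toward it. (Spark's PIDRateEstimator adds
// integral/derivative terms on top of this proportional idea.)
def nextRate(currentRate: Double, batchIntervalMs: Long,
             processingDelayMs: Long, elementsProcessed: Long): Double = {
  val processingRate = elementsProcessed.toDouble / processingDelayMs * 1000 // elems/sec
  if (processingDelayMs > batchIntervalMs) processingRate // overloaded: throttle down
  else math.max(currentRate, processingRate)              // keeping up: allow growth
}
```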

Liaoliang's contact card:

The first person of Spark in China

Thanks to Liaoliang for sharing his knowledge.

Sina Weibo: http://weibo.com/ilovepains

WeChat public account: DT_Spark

Blog: http://blog.sina.com.cn/ilovepains

Mobile: 18610086859

QQ: 1740415547

Email: [Email protected]

YY classroom: live teaching daily at 20:00, channel 68917580

