Driver startup in cluster mode and the two different resource scheduling methods: a thorough source-code analysis and summary of resource scheduling internals (DT Big Data DreamWorks)


Content:

1. Allocation of the Driver (cluster mode);

2. Allocation of resources for the application;

3. The two different ways of allocating resources, completely decrypted;

4. Thoughts on Spark resource allocation.

This is among the most important Spark material that every IMF member must master; the performance optimizations behind Spark are all tied to it.

========== The difference between task scheduling and resource scheduling ==========

1. Task scheduling is carried out through DAGScheduler, TaskScheduler, SchedulerBackend, and the rest of the job-scheduling machinery;

2. Resource scheduling refers to how an application obtains resources;

3. Task scheduling happens on the basis of resource scheduling; without resource scheduling there is no task scheduling to speak of: it becomes water without a source, a tree without roots;

4. Spark's resource scheduling method is schedule().

========== Resource scheduling internals decrypted ==========

1. Because the Master is responsible for resource management and scheduling, the resource scheduling method schedule() lives in Master.scala. Registering a program or any change in resources triggers a call to schedule(), for example when a program registers:

case RegisterApplication(description, driver) => {
  // TODO Prevent repeated registrations from some driver
  if (state == RecoveryState.STANDBY) {
    // ignore, don't send response
  } else {
    logInfo("Registering app " + description.name)
    val app = createApplication(description, driver)
    registerApplication(app)
    logInfo("Registered app " + description.name + " with ID " + app.id)
    persistenceEngine.addApplication(app)
    driver.send(RegisteredApplication(app.id, self))
    schedule()
  }
}

2. When schedule() is called: every time a new application registers or the cluster's resource state changes (executors added or removed, workers added or removed, and so on);

3. The current Master must be in the ALIVE state in order to schedule resources; if it is not ALIVE, schedule() returns immediately, which means a STANDBY Master does not schedule resources for applications;

4. Random.shuffle is used to randomly shuffle the information about all the workers of the cluster that the Master keeps; internally the algorithm walks the Master's cached data structure of workers and randomly swaps their positions;

5. Next, it is determined which of all the workers are at the ALIVE level; only ALIVE workers take part in resource allocation (see the sketch below);
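
To make points 4 and 5 concrete, here is a minimal, self-contained sketch; it is not the real Master code, and the WorkerInfo below is a hypothetical simplified stand-in for Spark's WorkerInfo:

// Minimal sketch: shuffle the worker list, then keep only ALIVE workers as candidates.
// WorkerInfo is a hypothetical, simplified stand-in for Spark's real WorkerInfo class.
import scala.util.Random

object ShuffleAliveWorkersSketch {
  case class WorkerInfo(id: String, state: String, memoryFree: Int, coresFree: Int)

  def main(args: Array[String]): Unit = {
    val workers = Seq(
      WorkerInfo("worker-1", "ALIVE", memoryFree = 4096, coresFree = 4),
      WorkerInfo("worker-2", "DEAD",  memoryFree = 8192, coresFree = 8),
      WorkerInfo("worker-3", "ALIVE", memoryFree = 2048, coresFree = 2))

    // Shuffle first (this helps spread drivers across the cluster), then keep ALIVE workers.
    val candidates = Random.shuffle(workers).filter(_.state == "ALIVE")
    candidates.foreach(w => println(s"candidate for driver placement: ${w.id}"))
  }
}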

6. When spark-submit specifies cluster mode for the driver, the driver is first added to the waitingDrivers list. The DriverDescription inside each driver's DriverInfo records the memory and cores the worker must have to start that driver (and if supervise is set, the driver is restarted automatically after it crashes);

private[deploy] case class DriverDescription(
    jarUrl: String,
    mem: Int,
    cores: Int,
    supervise: Boolean,
    command: Command) {

  override def toString: String = s"DriverDescription (${command.mainClass})"
}

7. Then, on the basis of the randomly shuffled workers, a worker that satisfies the resource requirements is chosen to start the driver; the Master sends a command to that remote worker, the remote worker launches the driver, and the driver's state becomes RUNNING;

private def launchDriver(worker: WorkerInfo, driver: DriverInfo) {
  logInfo("Launching driver " + driver.id + " on worker " + worker.id)
  worker.addDriver(driver)
  driver.worker = Some(worker)
  // The Master sends an instruction to the worker to start the corresponding driver
  worker.endpoint.send(LaunchDriver(driver.id, driver.desc))
  driver.state = DriverState.RUNNING
}

8. Only after the drivers have been launched first does the scheduling of all the remaining resources take place;

/**
 * Schedule the currently available resources among waiting apps. This method will be called
 * every time a new app joins or resource availability changes.
 */
private def schedule(): Unit = {
  if (state != RecoveryState.ALIVE) { return }
  // Drivers take strict precedence over executors
  val shuffledWorkers = Random.shuffle(workers) // Randomization helps balance drivers
  for (worker <- shuffledWorkers if worker.state == WorkerState.ALIVE) {
    for (driver <- waitingDrivers) {
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        launchDriver(worker, driver)
        waitingDrivers -= driver
      }
    }
  }
  startExecutorsOnWorkers()
}

9. Spark's default way of starting executors for applications is FIFO (first in, first out, queued): all submitted applications are placed in the scheduler's waiting queue, and the resource requirements of the next application are only satisfied after the previous application's requirements have been met (a rough sketch follows);
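
As a rough illustration of this FIFO behavior, using hypothetical simplified app records and a single pool of free cores rather than Spark's real data structures:

// Rough FIFO sketch: apps are served strictly in submission order; a later app
// only receives the cores left over after the earlier ones have been satisfied.
object FifoSketch {
  case class App(name: String, coresWanted: Int)

  def main(args: Array[String]): Unit = {
    val waitingApps = Seq(App("app-1", 6), App("app-2", 4), App("app-3", 4))
    var freeCores = 8

    for (app <- waitingApps) {
      val granted = math.min(app.coresWanted, freeCores)
      freeCores -= granted
      println(s"${app.name}: granted $granted of ${app.coresWanted} cores (free cores left: $freeCores)")
    }
    // app-1 gets 6, app-2 gets the remaining 2, app-3 gets nothing until resources free up.
  }
}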

10. Before executors are actually allocated for an application, it is first checked whether the application still needs cores; if it does not, no executors are assigned to it;

11. Before executors are actually allocated, the candidate workers must be ALIVE and must satisfy the application's memory and cores requirements for each executor; on that basis the workers are sorted so that those with the most free cores come first. In the FIFO case, spreadOutApps is true by default, which lets an application run on as many nodes as possible;

12. There are two ways to assign executors to an application. The first is to spread the executors over as many workers in the cluster as possible, which often yields better data locality; the second is to concentrate them on as few workers as possible;

/**
 * Schedule and launch executors on workers
 */
private def startExecutorsOnWorkers(): Unit = {
  // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
  // in the queue, then the second app, etc.
  for (app <- waitingApps if app.coresLeft > 0) {
    val coresPerExecutor: Option[Int] = app.desc.coresPerExecutor
    // Filter out workers that don't have enough resources to launch an executor
    val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
      .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
        worker.coresFree >= coresPerExecutor.getOrElse(1))
      .sortBy(_.coresFree).reverse
    val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)

    // Now that we've decided how many cores to allocate on each worker, let's allocate them
    for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
      allocateWorkerResourceToExecutors(
        app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
    }
  }
}

13. When cores are actually allocated in the cluster, the scheduler tries to satisfy our requirements as far as the resources allow;

14. If each worker may be assigned only one executor for the current application, then only one core is assigned to it at a time!

var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)

// If we are launching one executor per worker, then every iteration assigns 1 core
// to the executor. Otherwise, every iteration assigns cores to a new executor.
if (oneExecutorPerWorker) {
  assignedExecutors(pos) = 1
} else {
  assignedExecutors(pos) += 1
}

Assuming 4 workers, in the spreadOut case cores are distributed to the executors round by round, one core per worker per round, looping until the resources are exhausted (as sketched below).
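
The following self-contained sketch is a simplification of the core-distribution loop, not the real scheduleExecutorsOnWorkers (it ignores memory and executor-count limits); it shows how, with 4 workers, spreadOut mode hands out cores round by round while non-spreadOut mode fills one worker before moving on:

// Simplified sketch of distributing cores across workers in spreadOut vs. non-spreadOut mode.
object SpreadOutSketch {
  def distribute(coresToAssign: Int, workerFreeCores: Array[Int], spreadOut: Boolean): Array[Int] = {
    val assigned = Array.fill(workerFreeCores.length)(0)
    var left = coresToAssign
    var pos = 0
    while (left > 0 && workerFreeCores.exists(_ > 0)) {
      if (workerFreeCores(pos) > 0) {
        assigned(pos) += 1
        workerFreeCores(pos) -= 1
        left -= 1
        // In spreadOut mode, move to the next worker after each core; otherwise keep
        // giving cores to the same worker until it has no free cores left.
        if (spreadOut) pos = (pos + 1) % workerFreeCores.length
      } else {
        pos = (pos + 1) % workerFreeCores.length
      }
    }
    assigned
  }

  def main(args: Array[String]): Unit = {
    println(distribute(6, Array(4, 4, 4, 4), spreadOut = true).mkString(","))   // 2,2,1,1
    println(distribute(6, Array(4, 4, 4, 4), spreadOut = false).mkString(","))  // 4,2,0,0
  }
}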

15. Once the allocation has been decided, the executor information is prepared for the current application; concretely, the Master sends instructions over remote communication to the workers, and the workers actually start the ExecutorBackend processes;

// Now that we've decided how many cores to allocate on each worker, let's allocate them
for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
  allocateWorkerResourceToExecutors(
    app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
}

/**
 * Allocate a worker's resources to one or more executors.
 * @param app the info of the application which the executors belong to
 * @param assignedCores number of cores on this worker for this application
 * @param coresPerExecutor number of cores per executor
 * @param worker the worker info
 */
private def allocateWorkerResourceToExecutors(
    app: ApplicationInfo,
    assignedCores: Int,
    coresPerExecutor: Option[Int],
    worker: WorkerInfo): Unit = {
  // If the number of cores per executor is specified, we divide the cores assigned
  // to this worker evenly among the executors with no remainder.
  // Otherwise, we launch a single executor that grabs all the assignedCores on this worker.
  val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1)
  val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
  for (i <- 1 to numExecutors) {
    val exec = app.addExecutor(worker, coresToAssign)
    launchExecutor(worker, exec)
    app.state = ApplicationState.RUNNING
  }
}
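
For example, the numExecutors / coresToAssign arithmetic above works out as in the small sketch below, which isolates just that calculation ("--executor-cores 2" is the spark-submit option that sets coresPerExecutor):

// Sketch of only the executor-count arithmetic from allocateWorkerResourceToExecutors.
object ExecutorCountSketch {
  def plan(assignedCores: Int, coresPerExecutor: Option[Int]): (Int, Int) = {
    val numExecutors = coresPerExecutor.map(assignedCores / _).getOrElse(1)
    val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
    (numExecutors, coresToAssign)
  }

  def main(args: Array[String]): Unit = {
    // 8 cores assigned on this worker, --executor-cores 2 => 4 executors with 2 cores each
    println(plan(8, Some(2)))  // (4,2)
    // 8 cores assigned on this worker, no coresPerExecutor => 1 executor grabbing all 8 cores
    println(plan(8, None))     // (1,8)
  }
}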

16. The Master sends a LaunchExecutor message to the worker and immediately sends an ExecutorAdded message to the driver of our application:

private def launchExecutor(worker: WorkerInfo, exec: ExecutorDesc): Unit = {
  logInfo("Launching executor " + exec.fullId + " on worker " + worker.id)
  worker.addExecutor(exec)
  worker.endpoint.send(LaunchExecutor(masterUrl,
    exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory))
  exec.application.driver.send(
    ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory))
}

Teacher Liaoliang's card:

China's foremost Spark expert

Sina Weibo: http://weibo.com/ilovepains

WeChat public account: Dt_spark

Blog: http://blog.sina.com.cn/ilovepains

Mobile: 18610086859

qq:1740415547

Email: [Email protected]


This article is from the "A Flower Proud of the Cold" blog; reprinting is declined.

