Spark Scheduler Module (Part 2)


The two most important classes in the Scheduler module are DAGScheduler and TaskScheduler. The previous article covered DAGScheduler; this one looks at TaskScheduler.

TaskScheduler

As mentioned earlier, during SparkContext initialization a different TaskScheduler implementation is created depending on the master type. For the local, Spark standalone and Mesos masters a TaskSchedulerImpl is created; when the master is YARN, other implementations are created, which readers can study on their own.

master match {
  case "local" =>
    val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
    val backend = new LocalBackend(sc.getConf, scheduler, 1)
    scheduler.initialize(backend)
    (backend, scheduler)

  case SPARK_REGEX(sparkUrl) =>
    val scheduler = new TaskSchedulerImpl(sc)
    val masterUrls = sparkUrl.split(",").map("spark://" + _)
    val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
    scheduler.initialize(backend)
    (backend, scheduler)

  case mesosUrl @ MESOS_REGEX(_) =>
    MesosNativeLibrary.load()
    val scheduler = new TaskSchedulerImpl(sc)
    val coarseGrained = sc.conf.getBoolean("spark.mesos.coarse", false)
    val url = mesosUrl.stripPrefix("mesos://") // strip scheme from raw Mesos URLs
    val backend = if (coarseGrained) {
      new CoarseMesosSchedulerBackend(scheduler, sc, url, sc.env.securityManager)
    } else {
      new MesosSchedulerBackend(scheduler, sc, url)
    }
    scheduler.initialize(backend)
    (backend, scheduler)

  ...
}

At this point the attentive reader may wonder: TaskScheduler has to dispatch tasks on different resource management platforms (local, Spark standalone, Mesos), so how can the same TaskSchedulerImpl serve all of them? Note the very important member backend. Each master type corresponds to a different backend, and it is the backend that is responsible for communicating with the resource management platform.
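
To make that division of labor concrete, here is a minimal sketch of the pattern (simplified names, not the actual Spark source): the scheduler owns a single backend reference, and everything platform-specific goes through it.

// Minimal sketch of the scheduler/backend split described above (simplified, not the real source).
// SchedulerBackend in Spark has roughly this shape; SimplifiedTaskScheduler stands in for TaskSchedulerImpl.
trait SchedulerBackend {
  def start(): Unit
  def stop(): Unit
  def reviveOffers(): Unit   // ask the backend to offer free resources back to the scheduler
}

class SimplifiedTaskScheduler {
  private var backend: SchedulerBackend = _

  // SparkContext.createTaskScheduler calls initialize(backend) right after constructing both objects
  def initialize(backend: SchedulerBackend): Unit = {
    this.backend = backend
  }

  // the scheduling logic itself is platform-independent; only the backend differs per master
  def start(): Unit = backend.start()
  def submitTasks(): Unit = backend.reviveOffers()
}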

Because scheduling at this level has to communicate with the resource manager, it also overlaps with the deploy module and the executor module. The local mode is too simplistic (it just starts multiple threads locally to run tasks), and YARN and Mesos require background knowledge of their programming interfaces, so here we pick SparkDeploySchedulerBackend for detailed analysis. It corresponds to the standalone resource management system implemented by Spark itself, which some readers may have built and used.

TaskSchedulerImpl Startup

In SparkContext (via the createTaskScheduler code above), the TaskSchedulerImpl and SparkDeploySchedulerBackend are created first, and the backend is passed into the TaskSchedulerImpl. The TaskSchedulerImpl is then started.

// Create and start the scheduler
val (sched, ts) = SparkContext.createTaskScheduler(this, master)
_schedulerBackend = sched
_taskScheduler = ts
_dagScheduler = new DAGScheduler(this)
_heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)

// start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's
// constructor
_taskScheduler.start()

The TaskSchedulerImpl.start method mainly calls backend.start(). SparkDeploySchedulerBackend.start first calls the start method of its parent class, CoarseGrainedSchedulerBackend, which creates a DriverEndpoint. This is the RPC endpoint of the local driver, through which the driver communicates with the executors.

// CoarseGrainedSchedulerBackend.scala
override def start() {
  ...
  driverEndpoint = rpcEnv.setupEndpoint(
    CoarseGrainedSchedulerBackend.ENDPOINT_NAME, new DriverEndpoint(rpcEnv, properties))
}

// SparkDeploySchedulerBackend.scala
override def start() {
  super.start()

  // The endpoint for executors to talk to us
  val driverUrl = rpcEnv.uriOf(SparkEnv.driverActorSystemName,
    RpcAddress(sc.conf.get("spark.driver.host"), sc.conf.get("spark.driver.port").toInt),
    CoarseGrainedSchedulerBackend.ENDPOINT_NAME)
  val args = Seq(
    "--driver-url", driverUrl,
    "--executor-id", "{{EXECUTOR_ID}}",
    "--hostname", "{{HOSTNAME}}",
    "--cores", "{{CORES}}",
    "--app-id", "{{APP_ID}}",
    "--worker-url", "{{WORKER_URL}}")
  ...
  val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
  val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory, command,
    appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor)
  client = new AppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
  client.start()
  waitForRegistration()
}

This creates a client, an AppClient, which connects to the masters (e.g. spark://master:7077) and carries the command used to start the executors. The driver URL is passed to that command as a parameter, so an executor automatically connects back to the driver as soon as it starts.

At this point, the startup process of TaskSchedulerImpl and SparkDeploySchedulerBackend is complete. It mainly does two things: start the local driver endpoint, and notify the masters to start executors. The driver and the executors communicate with each other via RPC.

Note: How the executors are actually started will be analyzed in detail when we look at the deploy and executor modules.

TaskSchedulerImpl Submitting Tasks

In the previous article we saw that DAGScheduler ultimately calls TaskScheduler.submitTasks to submit tasks. This article continues the analysis from there:

override def submitTasks(taskSet: TaskSet) {
  val tasks = taskSet.tasks
  this.synchronized {
    val manager = createTaskSetManager(taskSet, maxTaskFailures)
    val stage = taskSet.stageId
    val stageTaskSets =
      taskSetsByStageIdAndAttempt.getOrElseUpdate(stage, new HashMap[Int, TaskSetManager])
    stageTaskSets(taskSet.stageAttemptId) = manager
    schedulableBuilder.addTaskSetManager(manager, manager.taskSet.properties)   // add the TaskSetManager to the rootPool
    ...
  }
  backend.reviveOffers()
}

The TaskSet is wrapped in a TaskSetManager and added via the schedulableBuilder. Incidentally, the SchedulableBuilder is Spark's scheduling-policy implementation; there are two, FIFO and FAIR, with FIFO as the default. Both eventually place the TaskSetManager into the rootPool.
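
Switching the policy is just a configuration change. A minimal example of enabling FAIR scheduling in an application is sketched below (the pool name "production" is made up for illustration; spark.scheduler.mode and spark.scheduler.pool are the standard configuration keys):

// Minimal sketch: selecting the FAIR scheduling policy instead of the default FIFO.
import org.apache.spark.{SparkConf, SparkContext}

object FairSchedulingExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("fair-scheduling-example")
      .setMaster("spark://master:7077")      // standalone master, as in this article
      .set("spark.scheduler.mode", "FAIR")   // default is FIFO
    val sc = new SparkContext(conf)

    // With FAIR mode, jobs submitted from this thread can be routed to a named pool
    sc.setLocalProperty("spark.scheduler.pool", "production")

    sc.parallelize(1 to 100).count()
    sc.stop()
  }
}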

backend.reviveOffers() is then called, and the call relationship here is a little abrupt: SparkDeploySchedulerBackend has no reviveOffers method of its own, so the method inherited from the parent class CoarseGrainedSchedulerBackend is invoked. The implementation of CoarseGrainedSchedulerBackend.reviveOffers is a single line:

override def reviveOffers() {
  driverEndpoint.send(ReviveOffers)
}

When the DriverEndpoint receives the ReviveOffers message, it calls the makeOffers method:

private def makeOffers() {
  // Filter out executors under killing
  val activeExecutors = executorDataMap.filterKeys(!executorsPendingToRemove.contains(_))
  val workOffers = activeExecutors.map { case (id, executorData) =>
    new WorkerOffer(id, executorData.executorHost, executorData.freeCores)
  }.toSeq
  launchTasks(scheduler.resourceOffers(workOffers))
}

The whole call chain is therefore: TaskScheduler.submitTasks() -> CoarseGrainedSchedulerBackend.reviveOffers() -> RpcEndpointRef.send(ReviveOffers) -> DriverEndpoint.receive -> DriverEndpoint.makeOffers().
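
One link in that chain is not shown above: DriverEndpoint's receive handler for ReviveOffers. Simplified from the Spark 1.x source (other messages such as StatusUpdate and KillTask are omitted here, and details vary between versions), it is essentially:

// CoarseGrainedSchedulerBackend.DriverEndpoint, simplified: only the ReviveOffers case is kept
override def receive: PartialFunction[Any, Unit] = {
  case ReviveOffers =>
    makeOffers()   // turn free executor cores into resource offers and launch tasks
}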

makeOffers calls TaskSchedulerImpl's resourceOffers method, which allocates compute resources to tasks. CoarseGrainedSchedulerBackend.launchTasks is then called, and the tasks are actually sent to the executors.
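
Before looking at launchTasks, here is a self-contained toy sketch (not the real implementation) of what resourceOffers does: it receives one WorkerOffer per live executor and returns, per offer, the tasks chosen to run there. The real TaskSchedulerImpl.resourceOffers additionally handles locality preferences, delay scheduling and blacklisting, and the TaskDescription fields are reduced here for illustration.

object ResourceOffersSketch {
  // one offer per live executor: who it is, where it runs, how many cores are free
  case class WorkerOffer(executorId: String, host: String, cores: Int)
  // reduced stand-in for Spark's TaskDescription
  case class TaskDescription(taskId: Long, executorId: String, name: String)

  val CPUS_PER_TASK = 1   // corresponds to the spark.task.cpus setting

  def resourceOffers(offers: Seq[WorkerOffer], pendingTasks: Seq[String]): Seq[Seq[TaskDescription]] = {
    var nextTask = 0
    offers.map { offer =>
      val slots = offer.cores / CPUS_PER_TASK             // how many tasks fit on this executor
      val chosen = pendingTasks.slice(nextTask, nextTask + slots).zipWithIndex.map {
        case (name, i) => TaskDescription(nextTask + i, offer.executorId, name)
      }
      nextTask += chosen.size
      chosen
    }
  }

  def main(args: Array[String]): Unit = {
    val offers = Seq(WorkerOffer("0", "host-a", 2), WorkerOffer("1", "host-b", 1))
    val assignments = resourceOffers(offers, Seq("task-0", "task-1", "task-2"))
    assignments.foreach(println)   // two tasks land on executor 0, one on executor 1
  }
}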

// Launch tasks returned by a set of resource offers
private def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
  for (task <- tasks.flatten) {
    val serializedTask = ser.serialize(task)   // serialize the task
    if (serializedTask.limit >= akkaFrameSize - AkkaUtils.reservedSizeBytes) {
      // the serialized task exceeds the frame-size limit: abort the task set with a warning
      ...
    } else {
      val executorData = executorDataMap(task.executorId)    // the executor chosen for this task
      executorData.freeCores -= scheduler.CPUS_PER_TASK      // mark part of the executor's CPU resources as occupied
      // send the task to the executor's RPC endpoint
      executorData.executorEndpoint.send(LaunchTask(new SerializableBuffer(serializedTask)))
    }
  }
}

As mentioned above, while TaskSchedulerImpl was starting up, the masters also started the executors; the executor main class is org.apache.spark.executor.CoarseGrainedExecutorBackend. So the method with which an executor receives messages is also in this class:

// org.apache.spark.executor.CoarseGrainedExecutorBackend
override def receive: PartialFunction[Any, Unit] = {
  case LaunchTask(data) =>
    if (executor == null) {
      logError("Received LaunchTask command but executor was null")
      System.exit(1)
    } else {
      val taskDesc = ser.deserialize[TaskDescription](data.value)   // deserialize the task description
      logInfo("Got assigned task " + taskDesc.taskId)
      executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
        taskDesc.name, taskDesc.serializedTask)
    }
  ...
}

The Executor then executes the task:

// org.apache.spark.executor.Executor
def launchTask(
    context: ExecutorBackend,
    taskId: Long,
    attemptNumber: Int,
    taskName: String,
    serializedTask: ByteBuffer): Unit = {
  val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
    serializedTask)                  // create a TaskRunner for the task
  runningTasks.put(taskId, tr)
  threadPool.execute(tr)             // run it on the executor's thread pool
}

TaskRunner uses a ClassLoader to load the Task from the serialized bytes, executes it, serializes the result, and returns it to the driver via RPC.

// org.apache.spark.executor.Executor.TaskRunner
override def run(): Unit = {
  ...
  execBackend.statusUpdate(taskId, TaskState.RUNNING, EMPTY_BYTE_BUFFER)   // notify the driver that the task is running
  try {
    // deserialize the dependent files, jar packages and the task itself
    val (taskFiles, taskJars, taskBytes) = Task.deserializeWithDependencies(serializedTask)
    updateDependencies(taskFiles, taskJars)
    task = ser.deserialize[Task[Any]](taskBytes, Thread.currentThread.getContextClassLoader)
    task.setTaskMemoryManager(taskMemoryManager)

    val (value, accumUpdates) = try {
      val res = task.run(
        taskAttemptId = taskId,
        attemptNumber = attemptNumber,
        metricsSystem = env.metricsSystem)   // execute the task
      res
    } finally {
      ...
    }

    val resultSer = env.serializer.newInstance()
    val valueBytes = resultSer.serialize(value)   // serialize the result
    ...
    execBackend.statusUpdate(taskId, TaskState.FINISHED, serializedResult)   // return the result to the driver via RPC
  } catch {
    ...
  }
}

This is a general description of how a task runs inside TaskSchedulerImpl. Quite a few branches have been skipped over, but that does not affect the reader's understanding of the overall flow.

Summary

The two parts of this Spark Scheduler series have given a general description of Spark's scheduling logic in the order in which it executes.

The code architecture of the Scheduler module fully embodies the design philosophy of layering and isolation. DAGScheduler contains logic unique to Spark, while TaskScheduler differs by resource scheduler, so scheduling is split into these two parts: the former needs only one implementation, while the latter can be implemented for different platforms. Even within TaskScheduler there is much in common across platforms, so TaskSchedulerImpl is itself a fairly general implementation; only the part that communicates with the resource scheduler uses a different backend.
