Introduction
In the previous section, "Stage generation and stage source analysis," I walked through how a job is divided into stages during job submission. The analysis ultimately came down to submitStage recursively submitting stages, with a collection of tasks created and distributed through the submitMissingTasks function.
In the next few articles I will describe the task creation and distribution process in detail. To keep the logic clear, I have split the material into several articles, each of which aims to be concise, coherent, and self-contained.
TaskScheduler Introduction
The main job of the TaskScheduler is to submit TaskSets to the cluster for execution and to report the results.
Specifically (an abridged sketch of the TaskScheduler trait follows this list):
* If a stage's shuffle output is lost, it reports a fetch failed error
* If it encounters a straggler task, it retries the task on another node
* It maintains a TaskSetManager for each TaskSet, tracking locality and error information
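These responsibilities are reflected in the TaskScheduler trait itself. Below is an abridged sketch of the trait as it appears in the Spark 1.x source; the real trait has several more lifecycle and bookkeeping members:

```scala
// Abridged from org.apache.spark.scheduler.TaskScheduler (Spark 1.x);
// several members are omitted here for brevity.
private[spark] trait TaskScheduler {
  def start(): Unit
  def stop(): Unit

  // Submit a set of tasks to run; called by the DAGScheduler
  // once per stage via submitMissingTasks.
  def submitTasks(taskSet: TaskSet): Unit

  // Cancel all the tasks of a stage.
  def cancelTasks(stageId: Int, interruptThread: Boolean): Unit

  // Set the DAGScheduler to call back into, e.g. to report fetch failures.
  def setDAGScheduler(dagScheduler: DAGScheduler): Unit

  // Get the default level of parallelism to use in the cluster.
  def defaultParallelism(): Int
}
```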
TaskScheduler Creation
In the article "SparkContext source code interpretation," I introduced how the TaskScheduler and DAGScheduler are created when the SparkContext is initialized. Here I describe the TaskScheduler's creation process in detail.
The createTaskScheduler function is called during SparkContext creation to create the TaskScheduler:
```scala
// Create and start the scheduler
private var (schedulerBackend, taskScheduler) =
  SparkContext.createTaskScheduler(this, master)
```
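For context, shortly afterwards in the same SparkContext initialization (order abridged from the Spark 1.x source), the DAGScheduler is wired up and the scheduler is started:

```scala
// Abridged from SparkContext initialization in Spark 1.x: the DAGScheduler
// registers itself with the TaskScheduler in its constructor, and only then
// is the TaskScheduler (and, through it, the backend) started.
dagScheduler = new DAGScheduler(this)
taskScheduler.start()
```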
Inside createTaskScheduler, the TaskScheduler is paired with a different SchedulerBackend depending on how Spark is deployed.
Different TaskScheduler and SchedulerBackend implementations are combined for the different deployment modes (a usage illustration follows the list):
- Local mode: TaskSchedulerImpl + LocalBackend
- Spark standalone cluster mode: TaskSchedulerImpl + SparkDeploySchedulerBackend
- Yarn-cluster mode: YarnClusterScheduler + CoarseGrainedSchedulerBackend
- Yarn-client mode: YarnClientClusterScheduler + YarnClientSchedulerBackend
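Which combination is chosen is driven entirely by the master URL the user supplies. As an illustration (the host, port, and app name below are made-up values):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The master URL decides which scheduler/backend pair createTaskScheduler
// builds; "master-host:7077" is an illustrative standalone master address.
val conf = new SparkConf()
  .setAppName("scheduler-demo")
  .setMaster("spark://master-host:7077") // => TaskSchedulerImpl + SparkDeploySchedulerBackend
val sc = new SparkContext(conf)
```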
```scala
/**
 * Create a task scheduler based on a given master URL.
 * Return a 2-tuple of the scheduler backend and the task scheduler.
 */
private def createTaskScheduler(
    sc: SparkContext,
    master: String): (SchedulerBackend, TaskScheduler) = {
  // Regular expression used for local[N] and local[*] master formats
  val LOCAL_N_REGEX = """local\[([0-9]+|\*)\]""".r
  // Regular expression for local[N, maxRetries], used in tests with failing tasks
  val LOCAL_N_FAILURES_REGEX = """local\[([0-9]+|\*)\s*,\s*([0-9]+)\]""".r
  // Regular expression for simulating a Spark cluster of [N, cores, memory] locally
  val LOCAL_CLUSTER_REGEX = """local-cluster\[\s*([0-9]+)\s*,\s*([0-9]+)\s*,\s*([0-9]+)\s*\]""".r
  // Regular expression for connecting to Spark deploy clusters
  val SPARK_REGEX = """spark://(.*)""".r
  // Regular expression for connection to Mesos cluster by mesos:// or zk:// url
  val MESOS_REGEX = """(mesos|zk)://.*""".r
  // Regular expression for connection to Simr cluster
  val SIMR_REGEX = """simr://(.*)""".r

  // When running locally, don't try to re-execute tasks on failure.
  val MAX_LOCAL_TASK_FAILURES = 1

  master match {
    case "local" =>
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalBackend(scheduler, 1)
      scheduler.initialize(backend)
      (backend, scheduler)

    case LOCAL_N_REGEX(threads) => ...

    case LOCAL_N_FAILURES_REGEX(threads, maxFailures) => ...

    case SPARK_REGEX(sparkUrl) =>
      val scheduler = new TaskSchedulerImpl(sc)
      val masterUrls = sparkUrl.split(",").map("spark://" + _)
      val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
      scheduler.initialize(backend)
      (backend, scheduler)

    case LOCAL_CLUSTER_REGEX(numSlaves, coresPerSlave, memoryPerSlave) =>
      // Check to make sure memory requested <= memoryPerSlave. Otherwise Spark will just hang.
      val memoryPerSlaveInt = memoryPerSlave.toInt
      if (sc.executorMemory > memoryPerSlaveInt) {
        throw new SparkException(
          "Asked to launch cluster with %d MB RAM / worker but requested %d MB/worker".format(
            memoryPerSlaveInt, sc.executorMemory))
      }

      val scheduler = new TaskSchedulerImpl(sc)
      val localCluster = new LocalSparkCluster(
        numSlaves.toInt, coresPerSlave.toInt, memoryPerSlaveInt, sc.conf)
      val masterUrls = localCluster.start()
      val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
      scheduler.initialize(backend)
      backend.shutdownCallback = (backend: SparkDeploySchedulerBackend) => {
        localCluster.stop()
      }
      (backend, scheduler)

    // ... remaining cases omitted
```
Taking the standalone mode as an example: the backend is instantiated according to the deployment mode and then handed to the scheduler as a member variable by calling the scheduler's initialize function:
```scala
case SPARK_REGEX(sparkUrl) =>
  val scheduler = new TaskSchedulerImpl(sc)
  val masterUrls = sparkUrl.split(",").map("spark://" + _)
  val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
  scheduler.initialize(backend)
  (backend, scheduler)
```
The relationship between TaskScheduler, TaskSchedulerImpl, and SchedulerBackend
The TaskScheduler class is responsible for scheduling tasks and allocating resources to them, while the SchedulerBackend is responsible for communicating with the Master and Workers to collect the resources that the Workers assign to the application.
In UML terms, the relationship between TaskScheduler, TaskSchedulerImpl, and SchedulerBackend is as follows: TaskSchedulerImpl is the concrete implementation of TaskScheduler and mixes in the TaskScheduler trait, while concrete resource-collection classes such as SparkDeploySchedulerBackend inherit from the parent class CoarseGrainedSchedulerBackend, which in turn mixes in the SchedulerBackend trait:
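Expressed as heavily simplified Scala declarations (constructor parameters and members abridged; this is a structural sketch, not the real signatures), the relationship looks roughly like this:

```scala
// Structural sketch only; the real classes carry many more members.
private[spark] trait TaskScheduler { /* task scheduling interface */ }
private[spark] trait SchedulerBackend { /* resource collection interface */ }

// TaskSchedulerImpl is the concrete scheduler and mixes in TaskScheduler.
private[spark] class TaskSchedulerImpl(val sc: SparkContext)
  extends TaskScheduler { /* ... */ }

// CoarseGrainedSchedulerBackend mixes in SchedulerBackend, and concrete
// resource-collection classes such as SparkDeploySchedulerBackend extend it.
private[spark] class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl)
  extends SchedulerBackend { /* ... */ }

private[spark] class SparkDeploySchedulerBackend(
    scheduler: TaskSchedulerImpl,
    sc: SparkContext,
    masters: Array[String])
  extends CoarseGrainedSchedulerBackend(scheduler) { /* ... */ }
```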
The following again takes Spark standalone cluster mode as the example and analyzes the specific operations in the TaskSchedulerImpl and SparkDeploySchedulerBackend classes.
Resource Information Collection
The SparkDeploySchedulerBackend class is specifically responsible for collecting the Workers' resource information. The DriverActor in its parent class CoarseGrainedSchedulerBackend is the actor that communicates with the Workers.
When an executor starts on a Worker, it sends a RegisterExecutor message to the driver. The message carries the compute resources that the executor makes available to the application, and the actor that receives it is the DriverActor.
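In Spark 1.x this exchange goes over Akka. An abridged sketch of the receiving side inside CoarseGrainedSchedulerBackend's DriverActor (error handling, message fields, and bookkeeping omitted):

```scala
// Abridged from CoarseGrainedSchedulerBackend.DriverActor (Spark 1.x).
def receive = {
  case RegisterExecutor(executorId, hostPort, cores) =>
    // Record the executor and the cores it brings ...
    totalCoreCount.addAndGet(cores)
    sender ! RegisteredExecutor
    // Immediately offer the new resources to the TaskSchedulerImpl.
    makeOffers()

  // ... other messages (StatusUpdate, ReviveOffers, ...) omitted
}
```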
Resource Allocation
The TaskSchedulerImpl class is responsible for assigning resources to tasks. After CoarseGrainedSchedulerBackend obtains the available resources, it notifies TaskSchedulerImpl of them through the makeOffers method. TaskSchedulerImpl's resourceOffers method is responsible for assigning compute resources to tasks; once a task has been assigned resources, the launchTasks method sends a LaunchTask message to the executor on the Worker to start the task.
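A condensed view of this handoff, abridged from CoarseGrainedSchedulerBackend in the Spark 1.x source (task serialization and size checks omitted):

```scala
// Make resource offers on all registered executors and hand them to the
// scheduler; whatever tasks come back are launched on their executors.
def makeOffers() {
  launchTasks(scheduler.resourceOffers(
    executorHost.toArray.map { case (id, host) =>
      new WorkerOffer(id, host, freeCores(id))
    }))
}

// Launch the tasks returned by a set of resource offers.
def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
  for (task <- tasks.flatten) {
    freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
    // Tell the executor's actor to start running the task.
    executorActor(task.executorId) ! LaunchTask(task)
  }
}
```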
The function call chain for TaskScheduler creation
SparkContext calls createTaskScheduler to create the SchedulerBackend and TaskScheduler
–> the concrete scheduler and backend constructors are chosen according to the deployment mode
–> TaskSchedulerImpl's initialize method is called, which assigns the backend to the scheduler's member variable
–> createTaskScheduler returns the created (schedulerBackend, taskScheduler) pair
–> TaskScheduler.start() is called
–> inside TaskSchedulerImpl's start method, backend.start() is called in turn to start the SchedulerBackend.
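The last link of the chain can be seen in TaskSchedulerImpl.start, abridged here from the Spark 1.x source:

```scala
// Abridged from TaskSchedulerImpl in Spark 1.x.
override def start() {
  // Starting the scheduler delegates to the backend, which is what actually
  // brings up executors / connects to the cluster manager.
  backend.start()

  if (!isLocal && conf.getBoolean("spark.speculation", false)) {
    // Also schedule the periodic check for speculatable (straggler) tasks.
    // (timer setup omitted)
  }
}
```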
The TaskScheduler schedules tasks for an application over the whole course of its execution, and it lives on the driver side. Each application has its own TaskScheduler; TaskScheduler and application correspond one to one. The TaskScheduler's hold on resources is also fairly coarse: once an application has requested compute resources from the Workers, those resources stay occupied until the application ends.
Summary
In this article, I introduced the creation process of the TaskScheduler, the relationship between TaskScheduler, TaskSchedulerImpl, and SchedulerBackend, and the function call chain of the creation process, to give an initial overall impression. In the next article, I will pick up the task creation and distribution process that follows the division of stages and describe it in detail.
"Spark Core" TaskScheduler source code and task submission principle Analysis 1