Task Scheduler in Spark: Start from SparkContext


SparkContext is the entry point of a Spark application. It is responsible for interacting with the entire cluster, and it is through SparkContext that RDDs, accumulators, and broadcast variables are created. To understand the Spark architecture, we need to start from this entry point. Below is the architecture diagram from the official website.


The driver program is the user-submitted program in which an instance of SparkContext is defined.

SparkContext is defined in core/src/main/scala/org/apache/spark/SparkContext.scala.

The primary constructor of SparkContext accepts an org.apache.spark.SparkConf, which lets us define parameters for this submission that override the system's default configuration.
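As an illustration, here is a minimal sketch of such a driver program against the Spark 1.x API: it builds a SparkConf that overrides a few defaults, constructs the SparkContext, and then uses it to create an RDD, an accumulator, and a broadcast variable. The application name, master URL, and data are placeholders for this sketch.

import org.apache.spark.{SparkConf, SparkContext}

object SimpleDriver {
  def main(args: Array[String]): Unit = {
    // Settings passed here override the system defaults and spark-defaults.conf
    val conf = new SparkConf()
      .setAppName("simple-driver")   // placeholder application name
      .setMaster("local[2]")         // run locally with 2 threads for this sketch

    val sc = new SparkContext(conf)

    val lookup  = sc.broadcast(Map("a" -> 1, "b" -> 2))   // broadcast variable
    val matched = sc.accumulator(0)                       // accumulator (Spark 1.x API)

    val rdd = sc.parallelize(Seq("a", "b", "c"))          // RDD from a local collection
    rdd.foreach { key =>
      if (lookup.value.contains(key)) matched += 1        // executors may only add to the accumulator
    }

    println(s"matched = ${matched.value}")                // read the accumulator back on the driver
    sc.stop()
  }
}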

First, here is a class diagram related to SparkContext:


The following are the definitions of the SparkContext data members that matter for scheduling:

  // Create and start the scheduler
  private[spark] var taskScheduler = SparkContext.createTaskScheduler(this, master)
  private val heartbeatReceiver = env.actorSystem.actorOf(
    Props(new HeartbeatReceiver(taskScheduler)), "HeartbeatReceiver")
  @volatile private[spark] var dagScheduler: DAGScheduler = _
  try {
    dagScheduler = new DAGScheduler(this)
  } catch {
    case e: Exception => throw
      new SparkException("DAGScheduler cannot be initialized due to %s".format(e.getMessage))
  }

  // start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's
  // constructor
  taskScheduler.start()

Through createTaskScheduler, we obtain schedulers for the different resource managers and deployment modes.

Take a look at the deployment modes currently supported:

  /** Creates a task scheduler based on a given master URL. Extracted for testing. */
  private def createTaskScheduler(sc: SparkContext, master: String): TaskScheduler = {
    // Regular expression used for local[N] and local[*] master formats
    val LOCAL_N_REGEX = """local\[([0-9]+|\*)\]""".r
    // Regular expression for local[N, maxRetries], used in tests with failing tasks
    val LOCAL_N_FAILURES_REGEX = """local\[([0-9]+|\*)\s*,\s*([0-9]+)\]""".r
    // Regular expression for simulating a Spark cluster of [N, cores, memory] locally
    val LOCAL_CLUSTER_REGEX = """local-cluster\[\s*([0-9]+)\s*,\s*([0-9]+)\s*,\s*([0-9]+)\s*]""".r
    // Regular expression for connecting to Spark deploy clusters
    val SPARK_REGEX = """spark://(.*)""".r
    // Regular expression for connection to Mesos cluster by mesos:// or zk:// url
    val MESOS_REGEX = """(mesos|zk)://.*""".r
    // Regular expression for connection to Simr cluster
    val SIMR_REGEX = """simr://(.*)""".r

    // When running locally, don't try to re-execute tasks on failure.
    val MAX_LOCAL_TASK_FAILURES = 1

    master match {
      case "local" =>
        val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
        val backend = new LocalBackend(scheduler, 1)
        scheduler.initialize(backend)
        scheduler

      case LOCAL_N_REGEX(threads) =>
        def localCpuCount = Runtime.getRuntime.availableProcessors()
        // local[*] estimates the number of cores on the machine; local[N] uses exactly N threads.
        val threadCount = if (threads == "*") localCpuCount else threads.toInt
        val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
        val backend = new LocalBackend(scheduler, threadCount)
        scheduler.initialize(backend)
        scheduler

      case LOCAL_N_FAILURES_REGEX(threads, maxFailures) =>
        def localCpuCount = Runtime.getRuntime.availableProcessors()
        // local[*, M] means the number of cores on the computer with M failures
        // local[N, M] means exactly N threads with M failures
        val threadCount = if (threads == "*") localCpuCount else threads.toInt
        val scheduler = new TaskSchedulerImpl(sc, maxFailures.toInt, isLocal = true)
        val backend = new LocalBackend(scheduler, threadCount)
        scheduler.initialize(backend)
        scheduler

      case SPARK_REGEX(sparkUrl) =>
        val scheduler = new TaskSchedulerImpl(sc)
        val masterUrls = sparkUrl.split(",").map("spark://" + _)
        val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
        scheduler.initialize(backend)
        scheduler

      case LOCAL_CLUSTER_REGEX(numSlaves, coresPerSlave, memoryPerSlave) =>
        // Check to make sure memory requested <= memoryPerSlave. Otherwise Spark will just hang.
        val memoryPerSlaveInt = memoryPerSlave.toInt
        if (sc.executorMemory > memoryPerSlaveInt) {
          throw new SparkException(
            "Asked to launch cluster with %d MB RAM / worker but requested %d MB/worker".format(
              memoryPerSlaveInt, sc.executorMemory))
        }

        val scheduler = new TaskSchedulerImpl(sc)
        val localCluster = new LocalSparkCluster(
          numSlaves.toInt, coresPerSlave.toInt, memoryPerSlaveInt)
        val masterUrls = localCluster.start()
        val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
        scheduler.initialize(backend)
        backend.shutdownCallback = (backend: SparkDeploySchedulerBackend) => {
          localCluster.stop()
        }
        scheduler

      case "yarn-standalone" | "yarn-cluster" =>
        if (master == "yarn-standalone") {
          logWarning(
            "\"yarn-standalone\" is deprecated as of Spark 1.0. Use \"yarn-cluster\" instead.")
        }
        val scheduler = try {
          val clazz = Class.forName("org.apache.spark.scheduler.cluster.YarnClusterScheduler")
          val cons = clazz.getConstructor(classOf[SparkContext])
          cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl]
        } catch {
          // TODO: Enumerate the exact reasons why it can fail
          // But irrespective of it, it means we cannot proceed !
          case e: Exception => {
            throw new SparkException("YARN mode not available ?", e)
          }
        }
        val backend = try {
          val clazz =
            Class.forName("org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend")
          val cons = clazz.getConstructor(classOf[TaskSchedulerImpl], classOf[SparkContext])
          cons.newInstance(scheduler, sc).asInstanceOf[CoarseGrainedSchedulerBackend]
        } catch {
          case e: Exception => {
            throw new SparkException("YARN mode not available ?", e)
          }
        }
        scheduler.initialize(backend)
        scheduler

      case "yarn-client" =>
        val scheduler = try {
          val clazz =
            Class.forName("org.apache.spark.scheduler.cluster.YarnClientClusterScheduler")
          val cons = clazz.getConstructor(classOf[SparkContext])
          cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl]
        } catch {
          case e: Exception => {
            throw new SparkException("YARN mode not available ?", e)
          }
        }

        val backend = try {
          val clazz =
            Class.forName("org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend")
          val cons = clazz.getConstructor(classOf[TaskSchedulerImpl], classOf[SparkContext])
          cons.newInstance(scheduler, sc).asInstanceOf[CoarseGrainedSchedulerBackend]
        } catch {
          case e: Exception => {
            throw new SparkException("YARN mode not available ?", e)
          }
        }

        scheduler.initialize(backend)
        scheduler

      case mesosUrl @ MESOS_REGEX(_) =>
        MesosNativeLibrary.load()
        val scheduler = new TaskSchedulerImpl(sc)
        val coarseGrained = sc.conf.getBoolean("spark.mesos.coarse", false)
        val url = mesosUrl.stripPrefix("mesos://") // strip scheme from raw Mesos URLs
        val backend = if (coarseGrained) {
          new CoarseMesosSchedulerBackend(scheduler, sc, url)
        } else {
          new MesosSchedulerBackend(scheduler, sc, url)
        }
        scheduler.initialize(backend)
        scheduler

      case SIMR_REGEX(simrUrl) =>
        val scheduler = new TaskSchedulerImpl(sc)
        val backend = new SimrSchedulerBackend(scheduler, sc, simrUrl)
        scheduler.initialize(backend)
        scheduler

      case _ =>
        throw new SparkException("Could not parse Master URL: '" + master + "'")
    }
  }


The basic logic starts at the master match block: the scheduler and the scheduler backend are chosen according to the master URL that was passed in. For the common standalone deployment mode, let's look at the generated scheduler and scheduler backend:

      case SPARK_REGEX(sparkUrl) =>
        val scheduler = new TaskSchedulerImpl(sc)
        val masterUrls = sparkUrl.split(",").map("spark://" + _)
        val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
        scheduler.initialize(backend)
        scheduler

org.apache.spark.scheduler.TaskSchedulerImpl manages the scheduling of the whole cluster through a single SchedulerBackend, and it implements the scheduling logic that is common to all deployment modes. To understand how the system starts, you need to look at two methods: initialize and start.

initialize is called while the SparkContext is being initialized (by createTaskScheduler, as shown above):

  def initialize(backend: SchedulerBackend) {
    this.backend = backend
    // temporarily set rootPool name to empty
    rootPool = new Pool("", schedulingMode, 0, 0)
    schedulableBuilder = {
      schedulingMode match {
        case SchedulingMode.FIFO =>
          new FIFOSchedulableBuilder(rootPool)
        case SchedulingMode.FAIR =>
          new FairSchedulableBuilder(rootPool, conf)
      }
    }
    schedulableBuilder.buildPools()
  }

This shows that initialization is mainly about the SchedulerBackend: the backend is stored and the scheduling pool is built. The scheduling mode is obtained from the cluster configuration; the modes supported today are FIFO and FAIR, and the default is FIFO:
  // default scheduler is FIFO
  private val schedulingModeConf = conf.get("spark.scheduler.mode", "FIFO")
  val schedulingMode: SchedulingMode = try {
    SchedulingMode.withName(schedulingModeConf.toUpperCase)
  } catch {
    case e: java.util.NoSuchElementException =>
      throw new SparkException(s"Unrecognized spark.scheduler.mode: $schedulingModeConf")
  }
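To switch to fair scheduling, the mode is set on the application's SparkConf before the SparkContext is created. A minimal sketch, assuming the same imports as the driver sketch earlier; the pool configuration file path is only illustrative:

val conf = new SparkConf()
  .setAppName("fair-scheduling-example")                                 // placeholder application name
  .set("spark.scheduler.mode", "FAIR")                                   // switch from the default FIFO to FAIR
  .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")  // optional pool definitions (illustrative path)

val sc = new SparkContext(conf)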

The implementation of start is as follows:

  override def start() {
    backend.start()

    if (!isLocal && conf.getBoolean("spark.speculation", false)) {
      logInfo("Starting speculative execution thread")
      import sc.env.actorSystem.dispatcher
      sc.env.actorSystem.scheduler.schedule(SPECULATION_INTERVAL milliseconds,
            SPECULATION_INTERVAL milliseconds) {
        Utils.tryOrExit { checkSpeculatableTasks() }
      }
    }
  }

The main work here is starting the backend. In non-local mode, if spark.speculation is set to true, a task that has not finished within a specified interval will have a speculative copy launched elsewhere. For typical applications this can reduce the time a straggling task takes to complete, but it also wastes the cluster's computing resources.

Therefore, this setting is not recommended for offline applications.
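For completeness, here is a minimal sketch of enabling speculation, assuming the Spark 1.x configuration keys discussed in this post; the threshold values are examples rather than recommendations:

val conf = new SparkConf()
  .setAppName("speculation-example")            // placeholder application name
  .set("spark.speculation", "true")             // launch speculative copies of slow tasks
  .set("spark.speculation.interval", "100")     // how often, in ms, to check for tasks to speculate
  .set("spark.speculation.multiplier", "1.5")   // a task counts as slow if it runs 1.5x the median task time

val sc = new SparkContext(conf)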


org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend is the SchedulerBackend for standalone mode. Its definition is as follows:

private[spark] class SparkDeploySchedulerBackend(
    scheduler: TaskSchedulerImpl,
    sc: SparkContext,
    masters: Array[String])
  extends CoarseGrainedSchedulerBackend(scheduler, sc.env.actorSystem)
  with AppClientListener
  with Logging {

Take a look at its start:

  override def start() {
    super.start()

    // The endpoint for executors to talk to us
    val driverUrl = "akka.tcp://%s@%s:%s/user/%s".format(
      SparkEnv.driverActorSystemName,
      conf.get("spark.driver.host"),
      conf.get("spark.driver.port"),
      CoarseGrainedSchedulerBackend.ACTOR_NAME)
    val args = Seq(driverUrl, "{{EXECUTOR_ID}}", "{{HOSTNAME}}", "{{CORES}}", "{{WORKER_URL}}")
    val extraJavaOpts = sc.conf.getOption("spark.executor.extraJavaOptions")
      .map(Utils.splitCommandString).getOrElse(Seq.empty)
    val classPathEntries = sc.conf.getOption("spark.executor.extraClassPath").toSeq.flatMap { cp =>
      cp.split(java.io.File.pathSeparator)
    }
    val libraryPathEntries =
      sc.conf.getOption("spark.executor.extraLibraryPath").toSeq.flatMap { cp =>
        cp.split(java.io.File.pathSeparator)
      }

    // Start executors with a few necessary configs for registering with the scheduler
    val sparkJavaOpts = Utils.sparkJavaOpts(conf, SparkConf.isExecutorStartupConf)
    val javaOpts = sparkJavaOpts ++ extraJavaOpts
    val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
      args, sc.executorEnvs, classPathEntries, libraryPathEntries, javaOpts)
    val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory, command,
      sc.ui.appUIAddress, sc.eventLogger.map(_.logDir))
    client = new AppClient(sc.env.actorSystem, masters, appDesc, this, conf)
    client.start()
    waitForRegistration()
  }
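Note that start() reads the executor launch options from the application's SparkConf. A minimal sketch of providing them, with placeholder paths, JVM options, and master URL:

val conf = new SparkConf()
  .setAppName("standalone-example")                               // placeholder application name
  .setMaster("spark://master1:7077")                              // placeholder standalone master URL
  .set("spark.executor.extraJavaOptions", "-XX:+UseG1GC")         // forwarded into the executor launch command
  .set("spark.executor.extraClassPath", "/opt/libs/extra.jar")    // placeholder extra classpath entry
  .set("spark.executor.extraLibraryPath", "/opt/native")          // placeholder native library path

val sc = new SparkContext(conf)  // SparkDeploySchedulerBackend.start() picks these up when launching executors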


Next, the TaskScheduler, SchedulerBackend, and DAGScheduler will be explained in detail, gradually lifting the veil on how they work.


Copyright notice: This is an original article by the blogger and may not be reproduced without permission.
