Apache Spark Source Code Analysis, Part 3: Function Call Relationships at Task Run Time


Overview

This article focuses on how the business logic of a task is invoked when it runs inside TaskRunner. It also tries to clarify where a running task gets its input data, where the processed results go, and how they are returned.

Prerequisites
    1. Spark is already installed
    2. Spark runs in local mode or local-cluster mode
Local-cluster mode

Local-cluster mode, also known as pseudo-distributed mode, can be started with the following command:

MASTER=local[1,2,1024] bin/spark-shell

[1,2,1024] specifies the number of executors, the number of cores, and the memory size (in MB); the memory size should not be less than the default of 512 MB.

Driver program initialization process

The primary source files involved in the initialization process:
    1. SparkContext.scala: the entry point of the entire initialization process
    2. SparkEnv.scala: creates BlockManager, MapOutputTrackerMaster, ConnectionManager, and CacheManager
    3. DAGScheduler.scala: the entry point for job submission; the key place where a job is divided into stages
    4. TaskSchedulerImpl.scala: decides how many tasks each stage can run and which executor each task runs on
    5. SchedulerBackend
      1. For the simplest single-machine mode, see LocalBackend.scala
      2. For cluster mode, see the source file SparkDeploySchedulerBackend
Initialization process steps in detail

Step 1: Generate the SparkConf from the initialization parameters, then create the SparkEnv from the SparkConf. SparkEnv mainly contains the following key components: BlockManager, MapOutputTracker, ShuffleFetcher, and ConnectionManager.

private[spark] val env = SparkEnv.create(
  conf,
  "",
  conf.get("spark.driver.host"),
  conf.get("spark.driver.port").toInt,
  isDriver = true,
  isLocal = isLocal)
SparkEnv.set(env)
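As a quick way to see a few of these components (a minimal sketch of my own, not from the article, assuming a running spark-shell of the same era), they can be reached through SparkEnv.get:

import org.apache.spark.SparkEnv

val env = SparkEnv.get            // the SparkEnv created during driver initialization
println(env.blockManager)         // BlockManager for this node
println(env.mapOutputTracker)     // MapOutputTracker, records shuffle map output locations
println(env.cacheManager)         // CacheManager, used when RDDs are cached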

Step 2: Create the TaskScheduler, select the appropriate SchedulerBackend according to Spark's running mode, and then start the TaskScheduler. This step is critical.

private[spark] var taskScheduler = SparkContext.createTaskScheduler(this, master, appName)
taskScheduler.start()
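createTaskScheduler essentially pattern-matches on the master URL to decide which SchedulerBackend to wire up. The following toy sketch (hypothetical, self-contained classes of my own, not Spark's) only illustrates that dispatch style:

// Toy illustration of choosing a backend from the master URL string.
trait BackendSketch { def start(): Unit }
class LocalBackendSketch(threads: Int) extends BackendSketch {
  def start(): Unit = println(s"local backend with $threads thread(s)")
}
class DeployBackendSketch(masterUrl: String) extends BackendSketch {
  def start(): Unit = println(s"standalone backend connecting to $masterUrl")
}

def pickBackend(master: String): BackendSketch = master match {
  case "local"                       => new LocalBackendSketch(1)
  case m if m.startsWith("spark://") => new DeployBackendSketch(m)
  case other                         => sys.error(s"Unrecognized master URL: $other")
}

pickBackend("local").start()   // prints: local backend with 1 thread(s)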

The purpose of TaskScheduler.start is to start the corresponding SchedulerBackend and, if speculative execution is enabled, to start a timer that periodically checks for speculatable tasks.

override def start() {
  backend.start()

  if (!isLocal && conf.getBoolean("spark.speculation", false)) {
    logInfo("Starting speculative execution thread")
    import sc.env.actorSystem.dispatcher
    sc.env.actorSystem.scheduler.schedule(SPECULATION_INTERVAL milliseconds,
        SPECULATION_INTERVAL milliseconds) {
      checkSpeculatableTasks()
    }
  }
}
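Speculative execution is off by default. A minimal sketch (my own example) of turning it on through SparkConf, which is what the conf.getBoolean check above reads:

import org.apache.spark.SparkConf

// spark.speculation is read in TaskScheduler.start; per the !isLocal check above,
// the speculation timer is only started for non-local masters.
val conf = new SparkConf()
  .setAppName("speculation-demo")
  .set("spark.speculation", "true")
// val sc = new SparkContext(conf)   // the context created from this conf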

Step 3: Create a DAGScheduler, passing in the TaskScheduler instance created in the previous step as its constructor argument, and then start it.

@volatile private[spark] var dagScheduler = new DAGScheduler(taskScheduler)
dagScheduler.start()

Step 4: Start the Web UI

ui.start()
The conversion process of the RDD

Let us again use the simplest WordCount example to illustrate the RDD conversion process.

sc.textFile("README.md").flatMap(line=>line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)

This short line of code actually triggers a fairly complex chain of RDD conversions. The following explains each conversion step and its result in detail.

Step 1: val rawFile = sc.textFile("README.md")

textFile first generates a HadoopRDD and then, through a map operation, generates a MappedRDD. If the above statement is executed in spark-shell, the output confirms this analysis:

scala> sc.textFile("README.md")
14/04/23 13:11 WARN SizeEstimator: Failed to check whether UseCompressedOops is set; assuming yes
14/04/23 13:11 INFO MemoryStore: ensureFreeSpace(119741) called with curMem=0, maxMem=311387750
14/04/23 13:11 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 116.9 KB, free 296.8 MB)
14/04/23 13:11 DEBUG BlockManager: Put block broadcast_0 locally took 277 ms
14/04/23 13:11 DEBUG BlockManager: Put for block broadcast_0 without replication took 281 ms
res0: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at <console>:
Step 2: val splittedText = rawFile.flatMap(line => line.split(" "))

flatMap converts the original MappedRDD into a FlatMappedRDD.

def flatMap[U: ClassTag](f: T => TraversableOnce[U]): RDD[U] =
  new FlatMappedRDD(this, sc.clean(f))
Step 3: val wordCount = splittedText.map(word => (word, 1))

Each word is used to generate the corresponding key-value pair, converting the FlatMappedRDD of the previous step into a MappedRDD.

Step 4: val reduceJob = wordCount.reduceByKey(_ + _), the most complex step

The operations used in steps 2 and 3 are all defined in RDD.scala, but the reduceByKey used here cannot be found in RDD.scala. Its definition appears in the source file PairRDDFunctions.scala.

A careful reader will ask: reduceByKey is not an attribute or method of MappedRDD, so how can it be called on a MappedRDD? In fact, there is an implicit conversion that transforms the MappedRDD into PairRDDFunctions:

def rddToPairRDDFunctions[K: ClassTag, V: ClassTag](rdd: RDD[(K, V)]) =
  new PairRDDFunctions(rdd)

This implicit conversion is a syntactic feature of Scala. If you want to learn more, search for the keyword "Scala implicit conversion"; there are many articles that cover it in detail. A minimal illustration follows.
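As a toy example of my own (not Spark code), here is how an implicit conversion lets a value appear to gain a method it does not define itself, which is exactly the trick that gives an RDD[(K, V)] the reduceByKey method:

import scala.language.implicitConversions

// A wrapper class providing the extra method.
class RichPair(p: (String, Int)) {
  def describe: String = s"${p._1} -> ${p._2}"
}
// The conversion the compiler inserts automatically when needed.
implicit def pairToRichPair(p: (String, Int)): RichPair = new RichPair(p)

val kv = ("spark", 1)
println(kv.describe)   // compiles as pairToRichPair(kv).describe, prints "spark -> 1"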

Now let us look at the definition of reduceByKey.

def reduceByKey(func: (V, V) => V): RDD[(K, V)] = {
  reduceByKey(defaultPartitioner(self), func)
}

def reduceByKey(partitioner: Partitioner, func: (V, V) => V): RDD[(K, V)] = {
  combineByKey[V]((v: V) => v, func, func, partitioner)
}

def combineByKey[C](createCombiner: V => C,
    mergeValue: (C, V) => C,
    mergeCombiners: (C, C) => C,
    partitioner: Partitioner,
    mapSideCombine: Boolean = true,
    serializerClass: String = null): RDD[(K, C)] = {
  if (getKeyClass().isArray) {
    if (mapSideCombine) {
      throw new SparkException("Cannot use map-side combining with array keys.")
    }
    if (partitioner.isInstanceOf[HashPartitioner]) {
      throw new SparkException("Default partitioner cannot partition array keys.")
    }
  }
  val aggregator = new Aggregator[K, V, C](createCombiner, mergeValue, mergeCombiners)
  if (self.partitioner == Some(partitioner)) {
    self.mapPartitionsWithContext((context, iter) => {
      new InterruptibleIterator(context, aggregator.combineValuesByKey(iter, context))
    }, preservesPartitioning = true)
  } else if (mapSideCombine) {
    val combined = self.mapPartitionsWithContext((context, iter) => {
      aggregator.combineValuesByKey(iter, context)
    }, preservesPartitioning = true)
    val partitioned = new ShuffledRDD[K, C, (K, C)](combined, partitioner)
      .setSerializer(serializerClass)
    partitioned.mapPartitionsWithContext((context, iter) => {
      new InterruptibleIterator(context, aggregator.combineCombinersByKey(iter, context))
    }, preservesPartitioning = true)
  } else {
    // Don't apply map-side combiner.
    val values = new ShuffledRDD[K, V, (K, V)](self, partitioner).setSerializer(serializerClass)
    values.mapPartitionsWithContext((context, iter) => {
      new InterruptibleIterator(context, aggregator.combineValuesByKey(iter, context))
    }, preservesPartitioning = true)
  }
}

reduceByKey eventually calls combineByKey, in which the PairRDDFunctions is converted into a ShuffledRDD, and when mapPartitionsWithContext is called, the ShuffledRDD is converted into a MapPartitionsRDD.

The log output confirms this analysis:

res1: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[8] at reduceByKey at <console>:13
RDD Conversion Summary

To summarize, the entire RDD conversion chain is:

HadoopRDD -> MappedRDD -> FlatMappedRDD -> MappedRDD -> PairRDDFunctions -> ShuffledRDD -> MapPartitionsRDD

The whole conversion process is long, and all of it happens before the task is submitted.
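If you want to check this chain yourself (a small verification of my own, assuming a spark-shell session of the same Spark version), the lineage of the final RDD can be printed with toDebugString:

// Build the WordCount RDD again and print its lineage; the output lists the
// chain of RDD classes discussed above, from MapPartitionsRDD down to HadoopRDD.
val reduceJob = sc.textFile("README.md")
  .flatMap(line => line.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _)
println(reduceJob.toDebugString)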

Run-time process analysis

Classification of dataset operations

Before analyzing the function call relationships during a task run, let us first discuss a somewhat theoretical question: why do the transformations on an RDD look the way they do?

The answer is related to mathematics. From an abstract point of view, task processing boils down to "input -> processing -> output", where both input and output correspond to datasets.

On this basis, we can make a simple classification of the operations (a short illustration in spark-shell follows the list):

    1. One-to-one: a dataset is converted into another dataset of the same size, for example map
    2. One-to-one: a dataset is converted into another dataset, but the size changes; it can either grow or shrink. For example, flatMap is an operation that increases the size, while subtract decreases it
    3. Many-to-one: multiple datasets are merged into a single dataset, for example combine, join
    4. One-to-many: a dataset is split into multiple datasets, for example groupBy
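Here is one example per category (my own examples, runnable in spark-shell where sc is available):

val nums  = sc.parallelize(1 to 4)
val other = sc.parallelize(3 to 6)

val sameSize = nums.map(_ * 2)                                            // one-to-one, size unchanged
val grown    = nums.flatMap(n => Seq.fill(n)(n))                          // one-to-one, size grows
val shrunk   = nums.subtract(other)                                       // one-to-one, size shrinks
val joined   = nums.map(n => (n % 2, n)).join(other.map(n => (n % 2, n))) // many-to-one: join of two datasets
val grouped  = nums.groupBy(_ % 2)                                        // one-to-many: one dataset split into groups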
Function calls at task run time

For the task submission process, refer to the second article in this series. This section focuses on how, while a task is running, the call chain reaches each of the operations that act on the RDD (a toy model of the chain follows the list below).

    • TaskRunner.run
      • Task.run
        • Task.runTask (Task is a base class with two subclasses, ShuffleMapTask and ResultTask)
          • RDD.iterator
            • RDD.computeOrReadCheckpoint
              • RDD.compute
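To make the last three links of the chain concrete, here is a toy model (hypothetical classes of my own, not Spark's) of the recursive pull: calling iterator on the last RDD walks back through its parents, and only then does data flow forward.

abstract class ToyRDD[T] {
  def compute(): Iterator[T]
  // Spark's RDD.iterator would first check the cache and checkpoint here,
  // then fall back to compute(); the toy version goes straight to compute().
  final def iterator(): Iterator[T] = compute()
}
class SourceToyRDD(lines: Seq[String]) extends ToyRDD[String] {
  def compute(): Iterator[String] = lines.iterator
}
class MappedToyRDD[T, U](parent: ToyRDD[T], f: T => U) extends ToyRDD[U] {
  def compute(): Iterator[U] = parent.iterator().map(f)   // same shape as MappedRDD.compute below
}

val chain = new MappedToyRDD(new SourceToyRDD(Seq("a b", "c")), (s: String) => s.length)
println(chain.iterator().toList)   // List(3, 1)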

Perhaps even when looking at the definition of RDD.compute, you still wonder where the function f is actually called. Take the compute definition of MappedRDD as an example:

override def compute(split: Partition, context: TaskContext) =
  firstParent[T].iterator(split, context).map(f)

Note that the easiest place to be misled here is the map function: this map is not the map of RDD, but the member function map of Scala's Iterator. See http://www.scala-lang.org/api/2.10.4/index.html#scala.collection.Iterator for details.
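A quick demonstration (my own example, plain Scala) that Iterator.map is lazy, which is why f only runs when the task actually consumes its partition:

val source = Iterator("a", "bb", "ccc")
val mapped = source.map { s => println(s"applying f to $s"); s.length }
// Nothing has been printed yet: f has not been applied.
println("consuming the iterator now:")
println(mapped.toList)   // prints "applying f to ..." three times, then List(1, 2, 3)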

Stack output
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:111)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:154)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:149)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:64)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:...)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
at org.apache.spark.scheduler.Task.run(Task.scala:...)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211)
ResultTask

The compute process of ShuffleMapTask is more complicated and takes more detours; for ResultTask it is much more direct.

override def runTask(context: TaskContext): U = {
  metrics = Some(context.taskMetrics)
  try {
    func(context, rdd.iterator(split, context))
  } finally {
    context.executeOnCompleteCallbacks()
  }
}
Transfer of calculation results

The analysis above shows that after the WordCount job is finally submitted, the DAGScheduler divides it into two stages: the first stage runs ShuffleMapTasks, and the second stage runs ResultTasks.

So how does the result computed by ShuffleMapTask reach ResultTask? The process is outlined below.

    1. ShuffleMapTask wraps its computation status (note: not the actual data) as a MapStatus and returns it to the DAGScheduler
    2. The DAGScheduler saves the MapStatus to the MapOutputTrackerMaster
    3. When ResultTask executes down to the ShuffledRDD, it calls the fetch method of BlockStoreShuffleFetcher to fetch the data
      1. It first asks the MapOutputTrackerMaster for the locations of the data to be fetched
      2. Based on the returned results, it calls BlockManager.getMultiple to fetch the real data

Pseudocode of the fetch function of BlockStoreShuffleFetcher:

val blockManager = SparkEnv.get.blockManager
val startTime = System.currentTimeMillis
val statuses = SparkEnv.get.mapOutputTracker.getServerStatuses(shuffleId, reduceId)
logDebug("Fetching map output location for shuffle %d, reduce %d took %d ms".format(
  shuffleId, reduceId, System.currentTimeMillis - startTime))
val blockFetcherItr = blockManager.getMultiple(blocksByAddress, serializer)
val itr = blockFetcherItr.flatMap(unpackBlock)

Note getServerStatuses and getMultiple in the code above: the former queries the locations of the data, while the latter fetches the real data.

For a detailed explanation of shuffle, refer to "Detailed Investigation of Spark's Shuffle Implementation": http://jerryshao.me/architecture/2014/01/04/spark-shuffle-detail-investigation/
