Apache Spark Source Code Reading 3 -- Analysis of function call relationships during Task Runtime


You are welcome to repost this article; please indicate the source, huichiro.

Summary

This article mainly describes how the business logic of a task executed inside TaskRunner is invoked. It also tries to clarify where a running task obtains its input data, and where and how it returns its processing result.

Preparation
  1. Spark has been installed
  2. Spark runs in local mode or local-cluster mode
Local-cluster mode

The local-cluster mode is also known as pseudo-distributed mode. Start it with the following command:

MASTER=local-cluster[1,2,1024] bin/spark-shell

The three parameters [1,2,1024] denote the number of executors, the number of cores per executor, and the memory per executor in MB, respectively. The memory size should not be smaller than the default 512 MB.

Analysis of the driver program initialization process

The main source files involved in the initialization process:
  1. SparkContext.scala: the entry point of the entire initialization process
  2. SparkEnv.scala: creates the BlockManager, MapOutputTrackerMaster, ConnectionManager, and CacheManager
  3. DAGScheduler.scala: the entry point for job submission; splits a job into stages
  4. TaskSchedulerImpl.scala: decides how many tasks of each stage run on which executor
  5. SchedulerBackend
    1. For the simplest single-machine (local) running mode, see LocalBackend.scala
    2. For cluster mode, see the source file SparkDeploySchedulerBackend.scala
Detailed steps of initialization

Step 1: Generate a SparkConf from the initialization input parameters, then create the SparkEnv from that SparkConf. SparkEnv mainly contains the following key components: BlockManager, MapOutputTracker, ShuffleFetcher, and ConnectionManager.

  private[spark] val env = SparkEnv.create(
    conf,
    "",
    conf.get("spark.driver.host"),
    conf.get("spark.driver.port").toInt,
    isDriver = true,
    isLocal = isLocal)
  SparkEnv.set(env)

Step 2: Create the TaskScheduler, select the corresponding SchedulerBackend for the given master, and start the TaskScheduler. This step is critical.

  private[spark] var taskScheduler = SparkContext.createTaskScheduler(this, master, appName)
  taskScheduler.start()

TaskScheduler.start() starts the corresponding SchedulerBackend and, when speculation is enabled, starts a timer that periodically checks for speculatable tasks.

  override def start() {
    backend.start()
    if (!isLocal && conf.getBoolean("spark.speculation", false)) {
      logInfo("Starting speculative execution thread")
      import sc.env.actorSystem.dispatcher
      sc.env.actorSystem.scheduler.schedule(SPECULATION_INTERVAL milliseconds,
            SPECULATION_INTERVAL milliseconds) {
        checkSpeculatableTasks()
      }
    }
  }

Step 3: Use the TaskScheduler instance created in the preceding step as an input parameter to create the DAGScheduler, then start it.

  @volatile private[spark] var dagScheduler = new DAGScheduler(taskScheduler)
  dagScheduler.start()

Step 4: Start the Web UI

ui.start()
RDD conversion process

The simplest wordcount program is used as an example to describe the RDD conversion process.

sc.textFile("README.md").flatMap(line=>line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)

This single short line of code actually involves a rather complex series of RDD conversions. The following describes the conversion process and result of each step.

Step 1: val rawFile = sc.textFile("README.md")

textFile first creates a HadoopRDD and then, through a map operation, produces a MappedRDD. If you run the statement above in spark-shell, the output confirms this analysis.

scala> sc.textFile("README.md")
14/04/23 13:11:48 WARN SizeEstimator: Failed to check whether UseCompressedOops is set; assuming yes
14/04/23 13:11:48 INFO MemoryStore: ensureFreeSpace(119741) called with curMem=0, maxMem=311387750
14/04/23 13:11:48 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 116.9 KB, free 296.8 MB)
14/04/23 13:11:48 DEBUG BlockManager: Put block broadcast_0 locally took  277 ms
14/04/23 13:11:48 DEBUG BlockManager: Put for block broadcast_0 without replication took  281 ms
res0: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at <console>:13
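For reference, a simplified sketch of how textFile is defined in SparkContext.scala in the code base of this era (the exact signature may differ between versions): the hadoopFile call yields the HadoopRDD, and the trailing map yields the MappedRDD seen above.

  def textFile(path: String, minSplits: Int = defaultMinSplits): RDD[String] =
    hadoopFile(path, classOf[TextInputFormat], classOf[LongWritable], classOf[Text],
      minSplits).map(pair => pair._2.toString)   // this map produces MappedRDD[1]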
Step 2: val splittedText = rawFile.flatMap(line => line.split(" "))

flatMap converts the original MappedRDD into a FlatMappedRDD:

  def flatMap[U: ClassTag](f: T => TraversableOnce[U]): RDD[U] =
    new FlatMappedRDD(this, sc.clean(f))
Step 3: val wordCount = splittedText.map(word => (word, 1))

Each word is turned into a corresponding key-value pair, and the FlatMappedRDD of the previous step is converted into another MappedRDD.
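For reference, a sketch of how map is defined in RDD.scala in the code base of this era (simplified):

  def map[U: ClassTag](f: T => U): RDD[U] =
    new MappedRDD(this, sc.clean(f))   // wraps the parent RDD, deferring f until compute time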

Step 4: val reduceJob = wordCount.reduceByKey(_ + _); this step is the most complex.

The operations used in steps 2 and 3 are all defined in RDD.scala, but reduceByKey is not. The definition of reduceByKey appears in the source file PairRDDFunctions.scala.

A careful reader will surely ask: reduceByKey is neither an attribute nor a method of MappedRDD, so how can it be called on one? In fact, there is an implicit conversion at work behind the scenes, which converts the MappedRDD into a PairRDDFunctions:

  implicit def rddToPairRDDFunctions[K: ClassTag, V: ClassTag](rdd: RDD[(K, V)]) =
    new PairRDDFunctions(rdd)

This implicit conversion is a syntactic feature of Scala. To learn more, search for the keyword "Scala implicit conversion"; many articles explain it in detail.
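To see the mechanism in isolation, here is a minimal, self-contained sketch in plain Scala (not Spark code) of how an implicit conversion adds a reduceByKey-like method to a type that does not define it; the names here are made up for illustration.

  import scala.language.implicitConversions

  object ImplicitDemo extends App {
    // A wrapper class that adds a reduceByKey-like method to a plain Seq of pairs.
    class RichPairSeq[K, V](pairs: Seq[(K, V)]) {
      def reduceByKey(f: (V, V) => V): Map[K, V] =
        pairs.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).reduce(f) }
    }

    // The implicit conversion: when reduceByKey is called on a Seq[(K, V)],
    // the compiler silently wraps the Seq in a RichPairSeq first.
    implicit def seqToRichPairSeq[K, V](pairs: Seq[(K, V)]): RichPairSeq[K, V] =
      new RichPairSeq(pairs)

    // Seq has no reduceByKey of its own, yet this compiles and prints Map(a -> 3, b -> 3).
    println(Seq(("a", 1), ("a", 2), ("b", 3)).reduceByKey(_ + _))
  }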

Next, let's take a look at the definition of reduceByKey.

  def reduceByKey(func: (V, V) => V): RDD[(K, V)] = {
    reduceByKey(defaultPartitioner(self), func)
  }

  def reduceByKey(partitioner: Partitioner, func: (V, V) => V): RDD[(K, V)] = {
    combineByKey[V]((v: V) => v, func, func, partitioner)
  }

  def combineByKey[C](createCombiner: V => C,
      mergeValue: (C, V) => C,
      mergeCombiners: (C, C) => C,
      partitioner: Partitioner,
      mapSideCombine: Boolean = true,
      serializerClass: String = null): RDD[(K, C)] = {
    if (getKeyClass().isArray) {
      if (mapSideCombine) {
        throw new SparkException("Cannot use map-side combining with array keys.")
      }
      if (partitioner.isInstanceOf[HashPartitioner]) {
        throw new SparkException("Default partitioner cannot partition array keys.")
      }
    }
    val aggregator = new Aggregator[K, V, C](createCombiner, mergeValue, mergeCombiners)
    if (self.partitioner == Some(partitioner)) {
      self.mapPartitionsWithContext((context, iter) => {
        new InterruptibleIterator(context, aggregator.combineValuesByKey(iter, context))
      }, preservesPartitioning = true)
    } else if (mapSideCombine) {
      val combined = self.mapPartitionsWithContext((context, iter) => {
        aggregator.combineValuesByKey(iter, context)
      }, preservesPartitioning = true)
      val partitioned = new ShuffledRDD[K, C, (K, C)](combined, partitioner)
        .setSerializer(serializerClass)
      partitioned.mapPartitionsWithContext((context, iter) => {
        new InterruptibleIterator(context, aggregator.combineCombinersByKey(iter, context))
      }, preservesPartitioning = true)
    } else {
      // Don't apply map-side combiner.
      val values = new ShuffledRDD[K, V, (K, V)](self, partitioner).setSerializer(serializerClass)
      values.mapPartitionsWithContext((context, iter) => {
        new InterruptibleIterator(context, aggregator.combineValuesByKey(iter, context))
      }, preservesPartitioning = true)
    }
  }

reduceByKey eventually calls combineByKey. In this function, the PairRDDFunctions is turned into a ShuffledRDD; after mapPartitionsWithContext is called, the ShuffledRDD is converted into a MapPartitionsRDD.

The log output confirms this analysis:

res1: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[8] at reduceByKey at <console>:13
RDD conversion summary

A summary of the entire RDD conversion process:

HadoopRDD -> MappedRDD -> FlatMappedRDD -> MappedRDD -> PairRDDFunctions -> ShuffledRDD -> MapPartitionsRDD

The conversion chain is long, and all of these conversions happen before the task is submitted.
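To check this lineage yourself, a small spark-shell snippet can print the chain of RDDs behind the wordcount expression (the exact output format varies by Spark version):

  val reduced = sc.textFile("README.md")
    .flatMap(line => line.split(" "))
    .map(word => (word, 1))
    .reduceByKey(_ + _)
  println(reduced.toDebugString)   // prints the RDD and its parents, from MapPartitionsRDD back to HadoopRDD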

Running process analysis: classification of dataset operations

Before analyzing the function call relationships at task run time, let's first discuss a somewhat theoretical question: why do transformations act on RDDs?

The answer is related to mathematics. From an abstract point of view, any task processing can be reduced to "input -> processing -> output", and both the input and the output correspond to datasets.

On this basis, we can make a simple classification (a small illustration follows the list):

  1. One-to-one: the dataset remains a single dataset after the conversion and its size does not change, e.g. map.
  2. One-to-one with size change: the dataset remains a single dataset, but its size either expands or shrinks. For example, flatMap typically increases the size, while subtract decreases it.
  3. Many-to-one: multiple datasets are merged into one, e.g. combine and join.
  4. One-to-many: one dataset is split into multiple datasets, e.g. groupBy.
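A minimal spark-shell sketch of the four categories (illustrative only; the variable names and sample data are made up here, not taken from the original post):

  val lines = sc.parallelize(Seq("a b", "c d"))
  val left  = sc.parallelize(Seq(("a", 1), ("b", 2)))
  val right = sc.parallelize(Seq(("a", 3)))

  val sameSize   = lines.map(_.toUpperCase)       // 1. one-to-one, size unchanged
  val resized    = lines.flatMap(_.split(" "))    // 2. one-to-one, size expands (subtract would shrink it)
  val merged     = left.join(right)               // 3. many-to-one: several datasets merged into one
  val splitApart = resized.groupBy(word => word)  // 4. one-to-many: one dataset split into groups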
Function calls during Task Runtime

For more information about the task submission process, see the second article in this series. This section describes how a running task is invoked step by step, all the way down to the operations on the RDD; a sketch of the two RDD methods at the bottom of the chain follows the list below.

  • TaskRunner.run
    • Task.run
      • Task.runTask (Task is a base class with two subclasses: ShuffleMapTask and ResultTask)
        • RDD.iterator
          • RDD.computeOrReadCheckpoint
            • RDD.compute
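For reference, a simplified sketch of RDD.iterator and RDD.computeOrReadCheckpoint as they look in the source of this era (details may differ between versions):

  final def iterator(split: Partition, context: TaskContext): Iterator[T] = {
    if (storageLevel != StorageLevel.NONE) {
      // The RDD is cached: ask the CacheManager for the cached block, computing it if absent.
      SparkEnv.get.cacheManager.getOrCompute(this, split, context, storageLevel)
    } else {
      computeOrReadCheckpoint(split, context)
    }
  }

  private[spark] def computeOrReadCheckpoint(split: Partition, context: TaskContext): Iterator[T] = {
    if (isCheckpointed) firstParent[T].iterator(split, context) else compute(split, context)
  }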

Perhaps even after looking at the definition of RDD.compute we still feel that f is never called. Take the compute definition of MappedRDD as an example:

  override def compute(split: Partition, context: TaskContext) =
    firstParent[T].iterator(split, context).map(f)

Note: the map here is the easiest place to be misled. It is not the map of RDD; it is the member function map of Scala's Iterator. See http://www.scala-lang.org/api/2.10.4/index.html#scala.collection.Iterator
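A minimal, plain-Scala illustration (no Spark involved) of what Iterator.map does, and why chaining such iterators is enough to drive the computation lazily:

  // Iterator.map and Iterator.flatMap build lazily chained iterators: nothing runs
  // until the final iterator is consumed, which is how MappedRDD.compute pulls
  // records through its parent's iterator one element at a time.
  val lines = Iterator("a b", "c d")
  val pairs = lines.flatMap(_.split(" ")).map(word => (word, 1))
  println(pairs.toList)   // List((a,1), (b,1), (c,1), (d,1))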

Stack output
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:111)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:154)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:149)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:64)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
at org.apache.spark.scheduler.Task.run(Task.scala:53)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211)
ResultTask

The compute process for ShuffleMapTask is complex and takes many detours; for ResultTask it is much more direct:

  override def runTask(context: TaskContext): U = {
    metrics = Some(context.taskMetrics)
    try {
      func(context, rdd.iterator(split, context))
    } finally {
      context.executeOnCompleteCallbacks()
    }
  }
Transfer of computing results

The analysis above shows that once the wordcount job is submitted, the DAGScheduler divides it into two stages: the first stage runs ShuffleMapTasks, and the second runs ResultTasks.

So how does the ResultTask obtain the computation result of the ShuffleMapTask? The process is as follows:

  1. The ShuffleMapTask packs its computation status (not the actual data) into a MapStatus and returns it to the DAGScheduler.
  2. The DAGScheduler stores the MapStatus in the MapOutputTrackerMaster.
  3. When the ResultTask executes the ShuffledRDD, it calls the fetch method of BlockStoreShuffleFetcher to obtain the data:
    1. It first asks the MapOutputTrackerMaster for the location of the data it wants.
    2. Based on the returned result, it then calls blockManager.getMultiple to obtain the actual data.

Pseudo code of the fetch function in BlockStoreShuffleFetcher:

    val blockManager = SparkEnv.get.blockManager
    val startTime = System.currentTimeMillis
    val statuses = SparkEnv.get.mapOutputTracker.getServerStatuses(shuffleId, reduceId)
    logDebug("Fetching map output location for shuffle %d, reduce %d took %d ms".format(
      shuffleId, reduceId, System.currentTimeMillis - startTime))
    val blockFetcherItr = blockManager.getMultiple(blocksByAddress, serializer)
    val itr = blockFetcherItr.flatMap(unpackBlock)

Note the two calls getServerStatuses and getMultiple: the first queries where the data is located, the second fetches the actual data.

For a detailed description of shuffle, see "Exploring the shuffle implementation of Spark in detail": http://jerryshao.me/architecture/2014/01/04/spark-shuffle-detail-investigation/
