Spark Operator Execution Process Explained in Detail, Part Five

22. combineByKey

def combineByKey[C](createCombiner: V => C,
    mergeValue: (C, V) => C,
    mergeCombiners: (C, C) => C,
    partitioner: Partitioner,
    mapSideCombine: Boolean = true,
    serializer: Serializer = null): RDD[(K, C)] = self.withScope {
  require(mergeCombiners != null, "mergeCombiners must be defined") // required as of Spark 0.9.0
  if (keyClass.isArray) {
    if (mapSideCombine) {
      throw new SparkException("Cannot use map-side combining with array keys.")
    }
    if (partitioner.isInstanceOf[HashPartitioner]) {
      throw new SparkException("Default partitioner cannot partition array keys.")
    }
  }
  val aggregator = new Aggregator[K, V, C](
    self.context.clean(createCombiner),
    self.context.clean(mergeValue),
    self.context.clean(mergeCombiners))
  if (self.partitioner == Some(partitioner)) {
    // If the partitioner is the same, no shuffle is required; a single mapPartitions is enough
    self.mapPartitions(iter => {
      val context = TaskContext.get()
      new InterruptibleIterator(context, aggregator.combineValuesByKey(iter, context))
    }, preservesPartitioning = true)
  } else {
    // Otherwise a shuffle is required
    new ShuffledRDD[K, V, C](self, partitioner)
      .setSerializer(serializer)
      .setAggregator(aggregator)
      .setMapSideCombine(mapSideCombine)
  }
}

Let's trace its execution:

The combineByKey function mainly accepts three functions as parameters: createCombiner, mergeValue, and mergeCombiners. These three functions are enough to show what it does; once you understand them, you understand combineByKey.

To understand combineByKey(), first consider how it handles each element it processes. As combineByKey() iterates through the elements of a partition, each element's key has either not been seen before or is the same as a key it has already encountered. The processing flow of combineByKey() is as follows:

If this is a new key, combineByKey() uses createCombiner() to create the initial value of the accumulator for that key. (Note: this happens the first time a key appears in each partition, not the first time the key appears in the entire RDD.)

If this is a key that has already been encountered while processing the current partition, combineByKey() uses mergeValue() to merge the key's current accumulator value with the new value.

Because each partition is processed independently, there can be multiple accumulators for the same key. If two or more partitions hold an accumulator for the same key, the per-partition results are combined with the user-supplied mergeCombiners(). When mapSideCombine is true, this merging starts early, on the map output; when it is false, merging only happens on the reduce side. Merging early reduces the amount of data transferred during shuffle read and therefore speeds up the shuffle read.
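
As a quick illustration that is not part of the quoted source (the RDD contents below are made up), here is the classic per-key average computed with these three functions:

val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)))

val sumCount = pairs.combineByKey(
  (v: Int) => (v, 1),                                           // createCombiner: first value of a key in a partition
  (acc: (Int, Int), v: Int) => (acc._1 + v, acc._2 + 1),        // mergeValue: fold another value into the accumulator
  (a: (Int, Int), b: (Int, Int)) => (a._1 + b._1, a._2 + b._2)) // mergeCombiners: merge accumulators across partitions

val averages = sumCount.mapValues { case (sum, count) => sum.toDouble / count }
averages.collect()  // expected: Array((a,3.0), (b,3.0)) for this made-up input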

First, look at the dependencies inside ShuffledRDD:

class ShuffledRDD[K, V, C](
    @transient var prev: RDD[_ <: Product2[K, V]],
    part: Partitioner)
  extends RDD[(K, C)](prev.context, Nil) {
  ...
  override def getDependencies: Seq[Dependency[_]] = {
    // Its shuffle write and read are driven by the ShuffleDependency
    List(new ShuffleDependency(prev, part, serializer, keyOrdering, aggregator, mapSideCombine))
  }
  ...
}
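
As a hedged sketch (not from the quoted source; the data and partitioner choice are made up), the two branches of combineByKey can be observed by inspecting the resulting RDD's dependencies:

import org.apache.spark.HashPartitioner

val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
val part = new HashPartitioner(4)

// Not pre-partitioned: combineByKey builds a ShuffledRDD, so its dependency is a ShuffleDependency.
val shuffled = pairs.combineByKey(
  (v: Int) => v, (c: Int, v: Int) => c + v, (c1: Int, c2: Int) => c1 + c2, part)
shuffled.dependencies.head   // expected to be a ShuffleDependency

// Already partitioned with the same partitioner: the fast path is a mapPartitions, no shuffle.
val noShuffle = pairs.partitionBy(part).combineByKey(
  (v: Int) => v, (c: Int, v: Int) => c + v, (c1: Int, c2: Int) => c1 + c2, part)
noShuffle.dependencies.head  // expected to be a narrow (one-to-one) dependency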

The ShuffleDependency registers a shuffle read/write handle with the ShuffleManager; the default ShuffleManager is SortShuffleManager:

class ShuffleDependency[K, V, C](
    @transient _rdd: RDD[_ <: Product2[K, V]],
    val partitioner: Partitioner,
    val serializer: Option[Serializer] = None,
    val keyOrdering: Option[Ordering[K]] = None,
    val aggregator: Option[Aggregator[K, V, C]] = None,
    val mapSideCombine: Boolean = false)
  extends Dependency[Product2[K, V]] {

  override def rdd: RDD[Product2[K, V]] = _rdd.asInstanceOf[RDD[Product2[K, V]]]

  val shuffleId: Int = _rdd.context.newShuffleId()

  // Register the shuffle handle
  val shuffleHandle: ShuffleHandle = _rdd.context.env.shuffleManager.registerShuffle(
    shuffleId, _rdd.partitions.size, this)

  _rdd.sparkContext.cleaner.foreach(_.registerShuffleForCleanup(this))
}
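
For reference (an assumption about the Spark 1.x configuration, not something stated in the quoted source), which ShuffleManager implementation gets instantiated is controlled by spark.shuffle.manager, and "sort" is the default:

import org.apache.spark.SparkConf

// "sort" selects SortShuffleManager (the default since Spark 1.2); "hash" selects the
// older HashShuffleManager.
val conf = new SparkConf()
  .setAppName("combineByKey-walkthrough")
  .set("spark.shuffle.manager", "sort")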

The writer and reader handles are then obtained through getWriter and getReader:

private[spark] class SortShuffleManager(conf: SparkConf) extends ShuffleManager {

  private val indexShuffleBlockResolver = new IndexShuffleBlockResolver(conf)
  private val shuffleMapNumber = new ConcurrentHashMap[Int, Int]()

  /**
   * Register a shuffle with the manager and obtain a handle for it to pass to tasks.
   */
  override def registerShuffle[K, V, C](
      shuffleId: Int,
      numMaps: Int,
      dependency: ShuffleDependency[K, V, C]): ShuffleHandle = {
    new BaseShuffleHandle(shuffleId, numMaps, dependency)
  }

  /**
   * Get a reader for a range of reduce partitions (startPartition to endPartition-1, inclusive).
   * Called on executors by reduce tasks.
   */
  override def getReader[K, C](
      handle: ShuffleHandle,
      startPartition: Int,
      endPartition: Int,
      context: TaskContext): ShuffleReader[K, C] = {
    // We currently use the same block store shuffle fetcher as the hash-based shuffle.
    new HashShuffleReader(
      handle.asInstanceOf[BaseShuffleHandle[K, _, C]], startPartition, endPartition, context)
  }

  /** Get a writer for a given partition. Called on executors by map tasks. */
  override def getWriter[K, V](handle: ShuffleHandle, mapId: Int, context: TaskContext)
      : ShuffleWriter[K, V] = {
    val baseShuffleHandle = handle.asInstanceOf[BaseShuffleHandle[K, V, _]]
    shuffleMapNumber.putIfAbsent(baseShuffleHandle.shuffleId, baseShuffleHandle.numMaps)
    new SortShuffleWriter(
      shuffleBlockResolver, baseShuffleHandle, mapId, context)
  }

  /** Remove a shuffle's metadata from the ShuffleManager. */
  override def unregisterShuffle(shuffleId: Int): Boolean = {
    if (shuffleMapNumber.containsKey(shuffleId)) {
      val numMaps = shuffleMapNumber.remove(shuffleId)
      (0 until numMaps).map { mapId =>
        shuffleBlockResolver.removeDataByMap(shuffleId, mapId)
      }
    }
    true
  }

  override val shuffleBlockResolver: IndexShuffleBlockResolver = {
    indexShuffleBlockResolver
  }

  /** Shut down this ShuffleManager. */
  override def stop(): Unit = {
    shuffleBlockResolver.stop()
  }
}
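
To see where these two handles are used, here is a simplified, hedged paraphrase of Spark 1.x internals (not quoted from this article's source, and it only compiles inside Spark's own packages): a map task writes its output through getWriter, and ShuffledRDD.compute reads one reduce partition back through getReader.

// Map side (roughly what ShuffleMapTask.runTask does; error handling and metrics omitted):
val manager = SparkEnv.get.shuffleManager
val writer = manager.getWriter[Any, Any](dep.shuffleHandle, partitionId, context)
writer.write(rdd.iterator(partition, context).asInstanceOf[Iterator[_ <: Product2[Any, Any]]])
writer.stop(success = true).get  // the resulting MapStatus is reported back to the driver

// Reduce side (roughly what ShuffledRDD.compute does): ask for a reader covering exactly
// this task's own partition, i.e. [split.index, split.index + 1).
SparkEnv.get.shuffleManager
  .getReader(dep.shuffleHandle, split.index, split.index + 1, context)
  .read()
  .asInstanceOf[Iterator[(K, C)]]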

First look at the write handle, SortShuffleWriter:

private[spark] class SortShuffleWriter[K, V, C](
    shuffleBlockResolver: IndexShuffleBlockResolver,
    handle: BaseShuffleHandle[K, V, C],
    mapId: Int,
    context: TaskContext)
  extends ShuffleWriter[K, V] with Logging {

  private val dep = handle.dependency

  private val blockManager = SparkEnv.get.blockManager

  private var sorter: ExternalSorter[K, V, _] = null

  // Are we in the process of stopping? Because map tasks can call stop() with success = true
  // and then call stop() with success = false if they get an exception, we want to make sure
  // we don't try deleting files, etc twice.
  private var stopping = false

  private var mapStatus: MapStatus = null

  private val writeMetrics = new ShuffleWriteMetrics()
  context.taskMetrics.shuffleWriteMetrics = Some(writeMetrics)

  /** Write a bunch of records to this task's output */
  override def write(records: Iterator[Product2[K, V]]): Unit = {
    if (dep.mapSideCombine) { // aggregate on the map side
      require(dep.aggregator.isDefined, "Map-side combine without Aggregator specified!")
      sorter = new ExternalSorter[K, V, C](
        dep.aggregator, Some(dep.partitioner), dep.keyOrdering, dep.serializer)
      sorter.insertAll(records)
    } else {
      // No map-side aggregation: neither an aggregator nor dep.keyOrdering is passed down;
      // the map side only partitions records by the partition function and does nothing else.
      // In this case we pass neither an aggregator nor an ordering to the sorter, because we don't
      // care whether the keys get sorted in each partition; that will be done on the reduce side
      // if the operation being run is sortByKey.
      sorter = new ExternalSorter[K, V, V](None, Some(dep.partitioner), None, dep.serializer)
      sorter.insertAll(records)
    }

    // Don't bother including the time to open the merged output file in the shuffle write time,
    // because it just opens a single file, so is typically too fast to measure accurately
    // (see SPARK-3570).
    val outputFile = shuffleBlockResolver.getDataFile(dep.shuffleId, mapId)
    val blockId = ShuffleBlockId(dep.shuffleId, mapId, IndexShuffleBlockResolver.NOOP_REDUCE_ID)
    val partitionLengths = sorter.writePartitionedFile(blockId, context, outputFile)
    shuffleBlockResolver.writeIndexFile(dep.shuffleId, mapId, partitionLengths)

    mapStatus = MapStatus(blockManager.shuffleServerId, partitionLengths)
  }
}
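
The writer therefore leaves behind one data file plus one index file per map task. As a hedged illustration (the segment sizes below are made up, and the exact on-disk format is an implementation detail of IndexShuffleBlockResolver), the index file essentially records cumulative byte offsets derived from partitionLengths:

// Hypothetical illustration of what writeIndexFile stores: cumulative offsets of each
// reduce partition's segment inside the single map output data file.
val partitionLengths = Array[Long](120L, 0L, 340L, 75L)  // made-up segment sizes in bytes
val offsets = partitionLengths.scanLeft(0L)(_ + _)
// offsets: Array(0, 120, 120, 460, 535)
// A reducer fetching partition i reads the byte range [offsets(i), offsets(i + 1)) of the data file.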

Now look at insertAll in ExternalSorter:

def insertAll(records: Iterator[_ <: Product2[K, V]]): Unit = {
  // TODO: stop combining if we find that the reduction factor isn't high
  val shouldCombine = aggregator.isDefined

  if (shouldCombine) { // map-side aggregation: use mergeValue and createCombiner
    // Combine values in-memory first using our AppendOnlyMap
    val mergeValue = aggregator.get.mergeValue
    val createCombiner = aggregator.get.createCombiner
    var kv: Product2[K, V] = null
    val update = (hadValue: Boolean, oldValue: C) => {
      if (hadValue) mergeValue(oldValue, kv._2) else createCombiner(kv._2)
    }
    while (records.hasNext) {
      addElementsRead()
      kv = records.next()
      map.changeValue((getPartition(kv._1), kv._1), update)
      maybeSpillCollection(usingMap = true)
    }
  } else if (bypassMergeSort) { // otherwise, if the number of partitions is small, write one file per partition
    // SPARK-4479: Also bypass buffering if merge sort is bypassed to avoid defensive copies
    if (records.hasNext) {
      spillToPartitionFiles(
        WritablePartitionedIterator.fromIterator(records.map { kv =>
          ((getPartition(kv._1), kv._1), kv._2.asInstanceOf[C])
        })
      )
    }
  } else { // otherwise, when there are many partitions, buffer the records to avoid creating too many temporary files
    // Stick values into our buffer
    while (records.hasNext) {
      addElementsRead()
      val kv = records.next()
      buffer.insert(getPartition(kv._1), kv._1, kv._2.asInstanceOf[C])
      maybeSpillCollection(usingMap = false)
    }
  }
}
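
For context on the bypassMergeSort branch above, the condition is roughly the following (paraphrased from Spark 1.x ExternalSorter as an assumption; the threshold comes from spark.shuffle.sort.bypassMergeThreshold, 200 by default):

// Paraphrased condition: bypass the merge sort and write one file per partition only when
// there are few partitions and neither an aggregator nor an ordering is defined.
private val bypassMergeThreshold = conf.getInt("spark.shuffle.sort.bypassMergeThreshold", 200)
private val bypassMergeSort =
  numPartitions <= bypassMergeThreshold && aggregator.isEmpty && ordering.isEmpty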

Then look at the read handle, HashShuffleReader:

private[spark] class HashShuffleReader[K, C](
    handle: BaseShuffleHandle[K, _, C],
    startPartition: Int,
    endPartition: Int,
    context: TaskContext)
  extends ShuffleReader[K, C] {

  require(endPartition == startPartition + 1,
    "Hash shuffle currently only supports fetching one partition")

  private val dep = handle.dependency

  /** Read the combined key-values for this reduce task */
  override def read(): Iterator[Product2[K, C]] = {
    val ser = Serializer.getSerializer(dep.serializer)
    val iter = BlockStoreShuffleFetcher.fetch(handle.shuffleId, startPartition, context, ser)

    val aggregatedIter: Iterator[Product2[K, C]] = if (dep.aggregator.isDefined) {
      if (dep.mapSideCombine) {
        // Already combined on the map side, so aggregation here uses mergeCombiners
        new InterruptibleIterator(context, dep.aggregator.get.combineCombinersByKey(iter, context))
      } else {
        // Otherwise aggregate using createCombiner and mergeValue
        new InterruptibleIterator(context, dep.aggregator.get.combineValuesByKey(iter, context))
      }
    } else {
      require(!dep.mapSideCombine, "Map-side combine without aggregator specified!")

      // Convert the Product2s to pairs since this is what downstream RDDs currently expect
      iter.asInstanceOf[Iterator[Product2[K, C]]].map(pair => (pair._1, pair._2))
    }

    // Sort the output if there is a sort ordering defined.
    dep.keyOrdering match {
      case Some(keyOrd: Ordering[K]) =>
        // Sort the aggregated output with an ExternalSorter and return its iterator
        val sorter = new ExternalSorter[K, C, C](ordering = Some(keyOrd), serializer = Some(ser))
        sorter.insertAll(aggregatedIter)
        context.taskMetrics.incMemoryBytesSpilled(sorter.memoryBytesSpilled)
        context.taskMetrics.incDiskBytesSpilled(sorter.diskBytesSpilled)
        sorter.iterator
      case None =>
        aggregatedIter
    }
  }
}
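
Tying the write and read sides together, here is a hedged usage sketch (not from the quoted source; the data is made up). The same combineByKey call with mapSideCombine switched on or off decides whether the reader above lands in the combineCombinersByKey branch or the combineValuesByKey branch:

import org.apache.spark.HashPartitioner

val pairs = sc.parallelize(Seq(("a", 1), ("a", 2), ("b", 3)))
val part = new HashPartitioner(2)

// mapSideCombine = true (the default): partial combiners are built before the shuffle,
// so the reduce side merges them with mergeCombiners (combineCombinersByKey).
val combinedOnMap = pairs.combineByKey(
  (v: Int) => v, (c: Int, v: Int) => c + v, (c1: Int, c2: Int) => c1 + c2,
  part, mapSideCombine = true)

// mapSideCombine = false: raw values are shuffled, and the reduce side builds the combiners
// itself with createCombiner/mergeValue (combineValuesByKey).
val combinedOnReduce = pairs.combineByKey(
  (v: Int) => v, (c: Int, v: Int) => c + v, (c1: Int, c2: Int) => c1 + c2,
  part, mapSideCombine = false)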
