Hadoop-2.4.1 Study Map task source analysis (bottom)


In Map task source analysis (above), the code of the map phase was studied; this article examines the sort phase of the map task. When the number of reducers is not 0, a sort phase is also required, yet the earlier analysis found no call to sortPhase.complete() analogous to the mapPhase.complete() call made when the map phase finishes. So when does the sort phase run? A map task has input and output: input is handled by a RecordReader, output by a RecordWriter. When the number of reducers is not 0, the RecordWriter is a NewOutputCollector (a private inner class of MapTask). Since the sort phase processes the map's output, we can infer that the work of the sort phase is done by NewOutputCollector. The following verifies this inference by analyzing the source code of NewOutputCollector. The class extends RecordWriter and has the following fields:

private final MapOutputCollector<K,V> collector; // performs the actual output operation
private final org.apache.hadoop.mapreduce.Partitioner<K,V> partitioner; // partitions the key space
private final int partitions; // number of partitions, equal to the number of reducers
The constructor for this class is as follows:

collector = createSortingCollector(job, reporter);
partitions = jobContext.getNumReduceTasks();
if (partitions > 1) {
  partitioner = (org.apache.hadoop.mapreduce.Partitioner<K,V>)
      ReflectionUtils.newInstance(jobContext.getPartitionerClass(), job);
} else {
  partitioner = new org.apache.hadoop.mapreduce.Partitioner<K,V>() {
    @Override
    public int getPartition(K key, V value, int numPartitions) {
      return partitions - 1;
    }
  };
}
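In the branch where partitions > 1, the partitioner class configured for the job is instantiated by reflection; Hadoop's default is HashPartitioner, which masks off the sign bit of the key's hashCode and takes the result modulo the number of reducers. Below is a minimal stand-alone sketch of that default logic (the class name HashPartitionDemo is illustrative, not Hadoop's):

```java
// Sketch of the default HashPartitioner behavior: every key is mapped to a
// stable partition in [0, numReduceTasks), even when hashCode() is negative.
public class HashPartitionDemo {
    // mirrors org.apache.hadoop.mapreduce.lib.partition.HashPartitioner
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        for (String key : new String[] {"alpha", "beta", "gamma"}) {
            int p = getPartition(key, 4);
            System.out.println(key + " -> partition " + p);
            // the sign-bit mask guarantees a non-negative result
            assert p >= 0 && p < 4;
        }
    }
}
```

The mask (rather than Math.abs) avoids the corner case where hashCode() returns Integer.MIN_VALUE, whose absolute value is itself negative.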

The code contains no direct reference to the sort phase, but it does mention sorting, which suggests that the method createSortingCollector is the most likely candidate. Its source is as follows:

// Construct a MapOutputCollector based on the value of
// mapreduce.job.map.output.collector.class; when the parameter is not set,
// a MapOutputBuffer object is returned
MapOutputCollector<KEY, VALUE> collector = (MapOutputCollector<KEY, VALUE>)
    ReflectionUtils.newInstance(
        job.getClass(JobContext.MAP_OUTPUT_COLLECTOR_CLASS_ATTR,
                     MapOutputBuffer.class, MapOutputCollector.class), job);
LOG.info("Map output collector class = " + collector.getClass().getName());
MapOutputCollector.Context context =
    new MapOutputCollector.Context(this, job, reporter);
// by default this calls MapOutputBuffer's init method
collector.init(context);
return collector;

From the analysis above, the work of the sort phase is driven by NewOutputCollector, which delegates it to a MapOutputCollector; the work is ultimately done by MapOutputBuffer, the implementation class of that interface. MapOutputBuffer is an inner class of MapTask and accounts for more than half of the MapTask source (MapTask is about 2000 lines, MapOutputBuffer about 1100), which alone suggests the importance of this class. The first part of MapOutputBuffer's init method builds the cache according to the job settings; relevant comments have been added to the code:

// sanity checks
// when the kvbuffer cache fills to this fraction, the spill thread writes
// the cached contents to disk (default 0.8)
final float spillper =
    job.getFloat(JobContext.MAP_SORT_SPILL_PERCENT, (float) 0.8);
// size of the buffer used to sort the output, default 100 MB
final int sortmb = job.getInt(JobContext.IO_SORT_MB, 100);
// mapreduce.task.index.cache.limit.bytes, default 1024 * 1024 (1 MB)
indexCacheMemoryLimit = job.getInt(JobContext.INDEX_CACHE_MEMORY_LIMIT,
    INDEX_CACHE_MEMORY_LIMIT_DEFAULT);
if (spillper > (float) 1.0 || spillper <= (float) 0.0) {
  throw new IOException("Invalid \"" + JobContext.MAP_SORT_SPILL_PERCENT +
      "\": " + spillper);
}
// the maximum sortmb is 2047 MB (the lowest 11 bits, 111 1111 1111);
// a larger value would overflow when shifted left by 20 bits below
if ((sortmb & 0x7FF) != sortmb) {
  throw new IOException("Invalid \"" + JobContext.IO_SORT_MB + "\": " + sortmb);
}
// quick sort is used by default
sorter = ReflectionUtils.newInstance(job.getClass("map.sort.class",
    QuickSort.class, IndexedSorter.class), job);
// buffers and accounting
// sortmb << 20 converts sortmb to bytes (2^20 = 1024 * 1024)
int maxMemUsage = sortmb << 20;
// METASIZE = 16; round down to a multiple of the metadata record size
maxMemUsage -= maxMemUsage % METASIZE;
kvbuffer = new byte[maxMemUsage];
bufvoid = kvbuffer.length;
kvmeta = ByteBuffer.wrap(kvbuffer)
    .order(ByteOrder.nativeOrder()).asIntBuffer();
setEquator(0);
// equator: origin of both the metadata and the serialized data
// bufstart: start of the spill; bufend: start of the collected data
// bufindex: end of the collected data; all initialized to 0
bufstart = bufend = bufindex = equator;
// kvstart: origin of the spilled metadata; kvend: end of the spilled metadata
// kvindex: end of the last fully serialized record
kvstart = kvend = kvindex;
maxRec = kvmeta.capacity() / NMETA;
softLimit = (int) (kvbuffer.length * spillper);
bufferRemaining = softLimit;
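To make the sizing arithmetic concrete, here is a small stand-alone sketch (the class name is illustrative) of how the default settings translate into buffer and spill-threshold sizes, including the 0x7FF guard against integer overflow:

```java
// Sketch of MapOutputBuffer.init's buffer sizing: sortmb is converted to
// bytes with a 20-bit left shift, rounded down to a multiple of the 16-byte
// metadata record, and the soft spill limit is spillper of the buffer.
public class SortBufferSizing {
    static final int METASIZE = 16; // one metadata record: 4 ints

    public static void main(String[] args) {
        int sortmb = 100;      // mapreduce.task.io.sort.mb default
        float spillper = 0.8f; // mapreduce.map.sort.spill.percent default

        // values above 2047 would overflow an int when shifted left 20 bits,
        // hence the check that sortmb fits in its lowest 11 bits
        if ((sortmb & 0x7FF) != sortmb) {
            throw new IllegalArgumentException("Invalid io.sort.mb: " + sortmb);
        }

        int maxMemUsage = sortmb << 20;        // 100 MB expressed in bytes
        maxMemUsage -= maxMemUsage % METASIZE; // already a multiple of 16 here
        int softLimit = (int) (maxMemUsage * spillper);

        System.out.println("buffer = " + maxMemUsage + " bytes");
        System.out.println("spill threshold = " + softLimit + " bytes");
        assert maxMemUsage == 104857600; // 100 * 2^20
        assert softLimit == 83886080;    // 80% of the buffer
    }
}
```

With the defaults, the spill thread therefore wakes up once roughly 80 MB of the 100 MB buffer is occupied, letting collection continue in the remaining 20 MB while the spill runs.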

The second part of MapOutputBuffer's init method starts the SpillThread thread, which performs the work of the sort phase and is responsible for spilling the data in the cache. The thread's run method calls sortAndSpill; as the name suggests, this method sorts the map output and spills it to disk. The sorting portion of its source is as follows:

final int mstart = kvend / NMETA;
// kvend is a valid record, hence the + 1; when kvstart has wrapped past
// kvend, the buffer capacity is added before dividing
final int mend = 1 + (kvstart >= kvend
    ? kvstart
    : kvmeta.capacity() + kvstart) / NMETA;
// sort the metadata in the specified range, using QuickSort by default
sorter.sort(MapOutputBuffer.this, mstart, mend, reporter);
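Because kvmeta is a circular buffer of NMETA-int records, the range arithmetic above must account for wrap-around. The following stand-alone sketch (class name and sample values are illustrative) isolates just that computation:

```java
// Sketch of the mstart/mend arithmetic in sortAndSpill: when kvstart sits
// numerically before kvend, the record range has wrapped around the end of
// the circular metadata buffer, so the capacity is added back in.
public class SortRangeDemo {
    static final int NMETA = 4; // ints of metadata per record

    static int[] sortRange(int kvstart, int kvend, int capacity) {
        int mstart = kvend / NMETA;
        int mend = 1 + (kvstart >= kvend
            ? kvstart
            : capacity + kvstart) / NMETA;
        return new int[] { mstart, mend };
    }

    public static void main(String[] args) {
        int capacity = 40; // room for 10 metadata records

        // no wrap-around: kvstart (32) lies after kvend (8)
        int[] r = sortRange(32, 8, capacity);
        System.out.println("no wrap: sort records " + r[0] + ".." + r[1]);

        // wrap-around: kvstart (4) lies before kvend (24)
        r = sortRange(4, 24, capacity);
        System.out.println("wrapped: sort records " + r[0] + ".." + r[1]);
    }
}
```

In the wrapped case the logical range extends past the physical end of the buffer, which is why the in-buffer sort indexes records modulo maxRec elsewhere in the class.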

After the data has been sorted, it must be written to a file; the source is as follows:

int spindex = mstart;
// records the startOffset, rawLength, and partLength of one index entry
final IndexRecord rec = new IndexRecord();
// inner class that encapsulates the byte representation of a value
final InMemValBytes value = new InMemValBytes();
for (int i = 0; i < partitions; ++i) {
  // responsible for writing the map output to the intermediate file
  IFile.Writer<K, V> writer = null;
  try {
    long segmentStart = out.getPos();
    writer = new Writer<K, V>(job, out, keyClass, valClass, codec,
        spilledRecordsCounter);
    if (combinerRunner == null) {
      // spill directly
      DataInputBuffer key = new DataInputBuffer();
      while (spindex < mend &&
             kvmeta.get(offsetFor(spindex % maxRec) + PARTITION) == i) {
        final int kvoff = offsetFor(spindex % maxRec);
        int keystart = kvmeta.get(kvoff + KEYSTART);
        int valstart = kvmeta.get(kvoff + VALSTART);
        key.reset(kvbuffer, keystart, valstart - keystart);
        getVBytesForOffset(kvoff, value);
        writer.append(key, value);
        ++spindex;
      }
    } else {
      int spstart = spindex;
      while (spindex < mend &&
             kvmeta.get(offsetFor(spindex % maxRec) + PARTITION) == i) {
        ++spindex;
      }
      // Note: we would like to avoid the combiner if we've fewer
      // than some threshold of records for a partition
      if (spstart != spindex) {
        combineCollector.setWriter(writer);
        RawKeyValueIterator kvIter =
            new MRResultIterator(spstart, spindex);
        combinerRunner.combine(kvIter, combineCollector);
      }
    }

    // close the writer
    writer.close();

    // record offsets
    rec.startOffset = segmentStart;
    rec.rawLength = writer.getRawLength();
    rec.partLength = writer.getCompressedLength();
    spillRec.putIndex(rec, i);

    writer = null;
  } finally {
    if (null != writer) writer.close();
  }
}
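The hand-off between the collecting thread and the SpillThread described earlier can be illustrated with a much-simplified, stand-alone sketch. Hadoop actually shares one circular buffer guarded by a ReentrantLock and condition variables; the BlockingQueue below is only a stand-in for that coordination, and all names are illustrative:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of the SpillThread pattern: the collector hands a full buffer to a
// background thread, which sorts and "spills" it while collection continues.
public class SpillThreadSketch {
    public static void main(String[] args) throws Exception {
        BlockingQueue<List<Integer>> fullBuffers = new ArrayBlockingQueue<>(1);

        Thread spillThread = new Thread(() -> {
            try {
                List<Integer> buffer = fullBuffers.take();
                Collections.sort(buffer);                 // the "sort" step
                System.out.println("spilled: " + buffer); // the "spill" step
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        spillThread.start();

        // collector side: the buffer reached its soft limit, hand it off
        fullBuffers.put(new ArrayList<>(List.of(3, 1, 2)));
        spillThread.join(); // prints "spilled: [1, 2, 3]"
    }
}
```

The point of the pattern is overlap: while the background thread sorts and writes one batch, the map function keeps filling the remaining buffer space, which is exactly why the spill threshold is set below 100% of the buffer.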

When the cache that holds the spill indices exceeds its limit, the index is written to a file; the source is as follows:

if (totalIndexCacheMemory >= indexCacheMemoryLimit) {
  // create spill index file
  // MAP_OUTPUT_INDEX_RECORD_LENGTH is 24, the size in bytes of each
  // record in the index file
  Path indexFilename =
      mapOutputFile.getSpillIndexFileForWrite(numSpills, partitions
          * MAP_OUTPUT_INDEX_RECORD_LENGTH);
  spillRec.writeToFile(indexFilename, job);
} else {
  indexCacheList.add(spillRec);
  totalIndexCacheMemory +=
      spillRec.size() * MAP_OUTPUT_INDEX_RECORD_LENGTH;
}
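The 24-byte figure comes from IndexRecord's three long fields (startOffset, rawLength, partLength). The following stand-alone sketch (class name and the reducer count of 16 are illustrative) shows how this accounting determines when index records stop fitting in memory:

```java
// Sketch of the index-cache accounting: each spill adds one 24-byte index
// record per partition; once the running total reaches the 1 MB limit,
// subsequent spill indices are written to files instead.
public class IndexCacheAccounting {
    static final int MAP_OUTPUT_INDEX_RECORD_LENGTH = 24;  // 3 longs * 8 bytes
    static final int INDEX_CACHE_MEMORY_LIMIT_DEFAULT = 1024 * 1024; // 1 MB

    public static void main(String[] args) {
        int partitions = 16; // assumed reducer count for this example
        int perSpill = partitions * MAP_OUTPUT_INDEX_RECORD_LENGTH;

        // simulate spills until the in-memory index cache limit is reached
        int totalIndexCacheMemory = 0;
        int spillsInMemory = 0;
        while (totalIndexCacheMemory < INDEX_CACHE_MEMORY_LIMIT_DEFAULT) {
            totalIndexCacheMemory += perSpill;
            spillsInMemory++;
        }
        System.out.println("index entries stay in memory for " +
            spillsInMemory + " spills (" + perSpill + " bytes each)");
    }
}
```

With a modest number of reducers the index cache is rarely exhausted, so spill index files appear mainly in jobs with many partitions or very many spills.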

Combining the above analysis: when context.write is called in the map method, data is written to the cache, and when the cached data reaches the preset threshold, the background SpillThread sorts the data and spills it into the map task's intermediate output files.

