Partitioner Partitioning Process Analysis


"Partition" means to divide into regions (分区 in Chinese), and this phase is the third phase of the whole MapReduce process. Its task is to take each key returned by map and, through some partitioning algorithm, assign it to a fixed partition to be handled by a particular reducer, thereby achieving load balancing.

Partitioning actually happens during the collect process mentioned in the previous article: when the user's map function is called on an input key, the intermediate result it emits is partitioned. Although this step may not look important, there is still plenty here worth learning. Hadoop's default algorithm is HashPartitioner, which simply takes the key's hashCode modulo the number of reduce tasks — very simple:

/** Partition keys by their {@link Object#hashCode()}. */
public class HashPartitioner<K2, V2> implements Partitioner<K2, V2> {

  public void configure(JobConf job) {}

  /** Use {@link Object#hashCode()} to partition. */
  public int getPartition(K2 key, V2 value,
                          int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }
}
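To see what this arithmetic produces, here is a minimal, self-contained sketch (plain Java, not Hadoop's class; the key strings and partition count are made up for illustration):

import java.util.Arrays;

public class HashPartitionDemo {
  // Same arithmetic as HashPartitioner: mask off the sign bit, then modulo
  static int getPartition(String key, int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }

  public static void main(String[] args) {
    for (String k : Arrays.asList("apple", "banana", "cherry", "date")) {
      // Each key maps to a fixed but effectively random partition in [0, 3)
      System.out.println(k + " -> partition " + getPartition(k, 3));
    }
  }
}

Each distinct key always lands in the same partition, which is what makes grouping in reduce work; the distribution is even on average, but it knows nothing about key order.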
However, while this guarantees a roughly even, random distribution of keys, it cannot guarantee global ordering, and some applications require the partitions themselves to be ordered: every key in the first partition must be smaller than every key in the second, every key in the second smaller than every key in the third, and so on. A hash algorithm is only ordered locally, within a partition; a key in the first partition may well be greater than one in the second. This is where the TotalOrderPartitioner class comes in, and it is the focus of this article.

First, look at the class hierarchy related to this partitioner.


As you can see, TotalOrderPartitioner is fairly involved.

TotalOrderPartitioner's role is to guarantee total ordering. It divides the key space using several sampled keys as split points. For example, with the 4 split points "2, 4, 6, 8", the key space is divided into 5 intervals; a key with value 5 falls between 4 and 6, i.e. into the third interval. So the function of this class is: given a set of split points, find the interval number a key belongs to. Whether you do that with binary search or some other algorithm is up to you — you're the boss. A concrete sketch of this lookup follows below.
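Here is that interval lookup made concrete, using plain java.util.Arrays.binarySearch over the example split points above (a toy demo, not Hadoop's code, though it uses the same index arithmetic you will see later in BinarySearchNode):

import java.util.Arrays;

public class SplitPointDemo {
  // Returns the interval index of a key, given sorted split points.
  static int findPartition(int[] splitPoints, int key) {
    int pos = Arrays.binarySearch(splitPoints, key) + 1;
    // On a miss, binarySearch returns -(insertionPoint) - 1, so pos is
    // -insertionPoint; negate it to recover the interval index.
    return (pos < 0) ? -pos : pos;
  }

  public static void main(String[] args) {
    int[] splitPoints = {2, 4, 6, 8};  // 4 split points -> 5 intervals
    System.out.println(findPartition(splitPoints, 5)); // 2: the third interval
    System.out.println(findPartition(splitPoints, 1)); // 0: the first interval
    System.out.println(findPartition(splitPoints, 9)); // 4: the last interval
  }
}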

Good — step one: obtain the split points. They are actually stored in a separate partition file; the configuration only keeps its path:

public void configure(JobConf job) {
    try {
      // Get the partition file
      String parts = getPartitionFile(job);
      final Path partFile = new Path(parts);
      final FileSystem fs = (DEFAULT_PATH.equals(parts))
        ? FileSystem.getLocal(job)     // assume in DistributedCache
        : partFile.getFileSystem(job);

      Class<K> keyClass = (Class<K>)job.getMapOutputKeyClass();
      // Read the split points from the partition file
      K[] splitPoints = readPartitions(fs, partFile, keyClass, job);
      ....
These splitPoints will play a key role later on.
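For context, that path is normally written into the configuration by the job driver before submission. A minimal sketch against the Hadoop 1.x (old mapred) API — the file path here is a made-up example:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.TotalOrderPartitioner;

public class PartitionFileSetup {
  public static void setup(JobConf job) {
    // Only the *path* of the partition file goes into the configuration;
    // the split points themselves live in that file.
    TotalOrderPartitioner.setPartitionFile(job, new Path("/tmp/_partitions.lst"));
    // At task startup, configure() calls getPartitionFile(job) to recover
    // the path, then readPartitions(...) loads the split points.
  }
}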

Then comes the key decision. If your key type is not BinaryComparable (a binary-comparable type) — for example a numeric type whose values can be compared directly — a binary-search node is created directly, with the comparator you supplied passed in:

      ....
      RawComparator<K> comparator =
        (RawComparator<K>) job.getOutputKeyComparator();
      for (int i = 0; i < splitPoints.length - 1; ++i) {
        if (comparator.compare(splitPoints[i], splitPoints[i+1]) >= 0) {
          throw new IOException("Split points are out of order");
        }
      }
      boolean natOrder =
        job.getBoolean("total.order.partitioner.natural.order", true);
      // Check whether the key is a BinaryComparable type; if so, build a trie
      if (natOrder && BinaryComparable.class.isAssignableFrom(keyClass)) {
        partitions = buildTrie((BinaryComparable[])splitPoints, 0,
            splitPoints.length, new byte[0],
            job.getInt("total.order.partitioner.max.trie.depth", 2));
      } else {
        // Otherwise build a BinarySearchNode that uses the supplied comparator
        partitions = new BinarySearchNode(splitPoints, comparator);
      }
On to the point — how the partition number is actually obtained. In this case it is a direct binary search:
  /**
   * For types that are not {@link org.apache.hadoop.io.BinaryComparable} or
   * where disabled by <tt>total.order.partitioner.natural.order</tt>,
   * search the partition keyset with a binary search.
   */
  class BinarySearchNode implements Node<K> {
    // the split points to compare against
    private final K[] splitPoints;
    // the comparator
    private final RawComparator<K> comparator;

    BinarySearchNode(K[] splitPoints, RawComparator<K> comparator) {
      this.splitPoints = splitPoints;
      this.comparator = comparator;
    }

    /** Binary search using the supplied comparator */
    public int findPartition(K key) {
      final int pos = Arrays.binarySearch(splitPoints, key, comparator) + 1;
      return (pos < 0) ? -pos : pos;
    }
  }

But if the key type is BinaryComparable (you can think of it as a string type), a trie tree is built instead.

The trie contains 2 kinds of nodes, InnerTrieNode and LeafTrieNode, both inheriting from TrieNode. LeafTrieNode is the leaf node; at the bottom it holds the split points just mentioned. InnerTrieNode sits above the leaves. The principle of the trie is to scan downward from the top, level by level, until a leaf node is reached, which returns the partition number.

It has a bit of a binary-search-tree feel to it.

Each inner node keeps 256 child pointers, one for each possible byte value:

  /**
   * An inner trie node that contains 256 children based on the next
   * character.
   */
  class InnerTrieNode extends TrieNode {
    private TrieNode[] child = new TrieNode[256];

    InnerTrieNode(int level) {
      super(level);
    }
    ...
So the final structure looks something like the figure below, which shows only the 26 letters a–z; in reality each node has 256 children:



You can imagine the tree fully expanded — this is a standard space-for-time trade-off. The trie is built recursively until the maximum depth is reached, at which point the leaf nodes are created and the tree is complete. See the implementation:

  private TrieNode buildTrie(BinaryComparable[] splits, int lower,
      int upper, byte[] prefix, int maxDepth) {
    final int depth = prefix.length;
    if (depth >= maxDepth || lower == upper) {
      // maximum depth reached: create a leaf node
      return new LeafTrieNode(depth, splits, lower, upper);
    }
    InnerTrieNode result = new InnerTrieNode(depth);
    byte[] trial = Arrays.copyOf(prefix, prefix.length + 1);
    // append an extra byte on to the prefix
    int currentBound = lower;
    // each parent node has children for every byte value
    for (int ch = 0; ch < 255; ++ch) {
      trial[depth] = (byte) (ch + 1);
      lower = currentBound;
      while (currentBound < upper) {
        if (splits[currentBound].compareTo(trial, 0, trial.length) >= 0) {
          break;
        }
        currentBound += 1;
      }
      trial[depth] = (byte) ch;
      // recursively create the child node
      result.child[0xff & ch] = buildTrie(splits, lower, currentBound, trial,
                                          maxDepth);
    }
    // pick up the rest
    trial[depth] = 127;
    result.child[255] = buildTrie(splits, currentBound, upper, trial,
                                  maxDepth);
    return result;
  }
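One practical note: the recursion is cut off by the total.order.partitioner.max.trie.depth setting quoted in the configure() code earlier (default 2). Raising it pushes the space-for-time trade further — more inner nodes, but each leaf then covers a narrower slice of split points, so the final binary search inside a leaf gets shorter. A hypothetical tuning example:

// trade memory (a deeper trie) for shorter binary searches in the leaves
job.setInt("total.order.partitioner.max.trie.depth", 3);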
The steps above are only the initialization process, not the actual lookup that maps a key to its partition. The flow of the build process is shown below:

The next step is the lookup process: given an input key, find its partition. The non-binary-comparable case is easy — a single binary search with the comparator you supplied gives the answer directly. Let's look instead at how the trie lookup works for string-like keys. From the build process above, we know it is a level-by-level search. Say you are looking up "aad": you first follow the child for the first character 'a', then from that node follow the child for the next character 'a', and so on until you reach a leaf node. Within the leaf, Hadoop again uses binary search; since each leaf covers only a small, already-sorted slice of the split points, this final search is cheap.

Here's the code. First the inner node, which branches on a character:

    ....
    /**
     * Non-leaf node lookup
     */
    public int findPartition(BinaryComparable key) {
      // get the current depth
      int level = getLevel();

      if (key.getLength() <= level) {
        return child[0].findPartition(key);
      }

      // continue the search in the child corresponding to the key's
      // byte at this level: key.getBytes()[level]
      return child[0xff & key.getBytes()[level]].findPartition(key);
    }
Once the bottom layer, a LeafTrieNode, is reached, its own method is called:

    ....
    // in the leaf node, binary search for the partition number
    public int findPartition(BinaryComparable key) {
      final int pos = Arrays.binarySearch(splitPoints, lower, upper, key) + 1;
      return (pos < 0) ? -pos : pos;
    }
What comes back is, again, the partition number, and with that the algorithm is complete. It is a standard space-for-time algorithm, but for it to work well, the choice of split points matters enormously: they must be representative enough to keep the partitions both globally ordered and evenly loaded.

Hadoop provides 3 sampler classes for collecting split points (nested in InputSampler):

SplitSampler: samples the first n records
RandomSampler: traverses all the data, sampling at random
IntervalSampler: samples at fixed intervals
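Putting the pieces together, a typical total-order job wires a sampler and the partitioner roughly like this (a sketch against the Hadoop 1.x mapred API; the path, key/value types, and sampler parameters are illustrative assumptions):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.InputSampler;
import org.apache.hadoop.mapred.lib.TotalOrderPartitioner;

public class TotalSortSetup {
  public static void configure(JobConf job) throws Exception {
    job.setPartitionerClass(TotalOrderPartitioner.class);

    // where the sampled split points will be written
    TotalOrderPartitioner.setPartitionFile(job, new Path("/tmp/_partitions.lst"));

    // sample ~10% of records, at most 10000 samples from at most 10 splits
    InputSampler.Sampler<Text, Text> sampler =
        new InputSampler.RandomSampler<Text, Text>(0.1, 10000, 10);

    // run the sampler over the job's input and write the split points;
    // TotalOrderPartitioner.configure() reads them back at task startup
    InputSampler.writePartitionFile(job, sampler);
  }
}

If the sampled points are representative, the resulting partitions come out both globally ordered and reasonably balanced.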


Even this small partitioning step hides quite a few interesting algorithms — the MapReduce source code really is a rare treasure to read.

Copyright notice: this is an original blog article and may not be reproduced without the author's consent.
