Learning Log---partitioner and samplers

Last Update:2015-09-28 Source: Internet

Author: User


In MapReduce:

The shuffle phase sits between map and reduce, and it can be customized with a custom sort order, a custom partitioner and custom grouping.

The map output is a set of key-value pairs, and by default HashPartitioner decides which reducer each pair is sent to.
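The default hash partitioning rule can be sketched in plain Java without any Hadoop dependency. The class name HashPartitionSketch is made up for illustration, and ordinary Java objects stand in for Hadoop key types such as Text (whose hashCode differs from String's), so the concrete partition numbers are only indicative.

```java
// Plain-Java sketch of hash partitioning; HashPartitionSketch is a hypothetical name.
class HashPartitionSketch {

    // Clear the sign bit of the hash code, then take the remainder by the
    // number of reduce tasks, so the result is always in [0, numReduceTasks).
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "cherry"};
        for (String key : keys) {
            System.out.println(key + " -> partition " + getPartition(key, 3));
        }
    }
}
```

Equal keys always land in the same partition, but neighbouring keys are scattered, so the concatenated reducer outputs are not globally sorted.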

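Hash partitioning is not the only option. TotalOrderPartitioner partitions by key range, and it relies on a sampler to choose the range boundaries. InputSampler provides three samplers:

InputSampler.RandomSampler<Text, Text> sampler = new InputSampler.RandomSampler<Text, Text>(0.5, 3000);
InputSampler.IntervalSampler<Text, Text> sampler2 = new InputSampler.IntervalSampler<Text, Text>(0.333, 10);
InputSampler.SplitSampler<Text, Text> sampler3 = new InputSampler.SplitSampler<Text, Text>(reduceNumber);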
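To give a feel for what a sampler does, here is a minimal plain-Java sketch of interval-style sampling: a key is kept whenever the ratio of kept keys to seen keys has dropped below the requested frequency. It is modelled on the idea behind IntervalSampler and is not Hadoop's code; the class and method names are hypothetical, and String keys stand in for Text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of interval-style sampling over an in-memory key list.
class IntervalSamplingSketch {

    // Keep a key whenever kept/seen is below freq, which spreads the
    // samples evenly over the input instead of picking them at random.
    static List<String> sample(List<String> keys, double freq) {
        List<String> samples = new ArrayList<>();
        long seen = 0;
        long kept = 0;
        for (String key : keys) {
            ++seen;
            if ((double) kept / seen < freq) {
                samples.add(key);
                ++kept;
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("a", "b", "c", "d", "e", "f");
        System.out.println(sample(keys, 0.5)); // prints [a, c, e]
    }
}
```

A random sampler picks keys with a given probability instead, and a split sampler simply takes the first keys of each input split, which is cheap but only representative when the input is not already sorted.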

Implementation and details

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.partition.InputSampler;
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;

public class TotalSortMR {

    @SuppressWarnings("deprecation")
    public static int runTotalSortJob(String[] args) throws Exception {
        Path inputPath = new Path(args[0]);
        Path outputPath = new Path(args[1]);
        Path partitionFile = new Path(args[2]);
        int reduceNumber = Integer.parseInt(args[3]);

        // Three types of sampler; only the first one is used below
        InputSampler.RandomSampler<Text, Text> sampler =
                new InputSampler.RandomSampler<Text, Text>(1, 3000, 10);
        InputSampler.IntervalSampler<Text, Text> sampler2 =
                new InputSampler.IntervalSampler<Text, Text>(0.333, 10);
        InputSampler.SplitSampler<Text, Text> sampler3 =
                new InputSampler.SplitSampler<Text, Text>(reduceNumber);

        // Task initialization
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);

        job.setJobName("Total-Sort");
        job.setJarByClass(TotalSortMR.class);
        job.setInputFormatClass(KeyValueTextInputFormat.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setNumReduceTasks(reduceNumber);

        // The input and output paths of the job.
        // The input path must be set before the sampler reads the input.
        FileInputFormat.setInputPaths(job, inputPath);
        FileOutputFormat.setOutputPath(job, outputPath);
        outputPath.getFileSystem(conf).delete(outputPath, true);

        // Set the partitioner class
        job.setPartitionerClass(TotalOrderPartitioner.class);
        // Partition file that the partitioner reads. It is set on the job's own
        // configuration, because Job.getInstance(conf) works on a copy of conf.
        TotalOrderPartitioner.setPartitionFile(job.getConfiguration(), partitionFile);
        // Which sampler the partitioner uses: sample the input and write the partition file
        InputSampler.writePartitionFile(job, sampler);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(runTotalSortJob(args));
    }
}
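What the partition file and TotalOrderPartitioner do together can be sketched in plain Java: sort the sampled keys, pick (number of reducers - 1) split points, and route each key to a range by binary search. This is a simplified illustration under stated assumptions, not Hadoop's implementation: the names are hypothetical, String keys stand in for Text, and duplicate split points are not handled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of range partitioning driven by sampled split points.
class TotalOrderSketch {

    // Choose numPartitions - 1 split points at evenly spaced positions
    // of the sorted sample (this plays the role of the partition file).
    static List<String> splitPoints(List<String> samples, int numPartitions) {
        List<String> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < numPartitions; i++) {
            splits.add(sorted.get(sorted.size() * i / numPartitions));
        }
        return splits;
    }

    // Partition index = number of split points that are <= key.
    static int getPartition(String key, List<String> splits) {
        int pos = Collections.binarySearch(splits, key);
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    public static void main(String[] args) {
        List<String> samples = List.of("d", "a", "c", "b", "f", "e", "h", "g");
        List<String> splits = splitPoints(samples, 4);
        System.out.println(splits);                    // prints [c, e, g]
        System.out.println(getPartition("d", splits)); // prints 1
    }
}
```

Because every key in partition i sorts before every key in partition i + 1, and each reducer sorts its own keys, concatenating the reducer output files in order yields a globally sorted result.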

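The default input format of a job is TextInputFormat. It also produces key-value pairs: the key is the byte offset of each line in the file, and the value is the content of that line. The input format can be changed with:

job.setInputFormatClass(...);

In general, the map output key and value classes should also be set explicitly, because later steps depend on them. For example, InputSampler.writePartitionFile uses the map output key class when it writes the partition file.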
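The job above uses KeyValueTextInputFormat, which by default splits each line at the first tab into a key and a value; a line without a tab becomes the key, with an empty value. The following plain-Java sketch illustrates only that splitting rule. The class and method names are hypothetical, and the separator is passed in explicitly rather than read from the job configuration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical sketch of how a text line is turned into a key-value pair.
class KeyValueLineSketch {

    // Split at the first occurrence of the separator; everything before it
    // is the key and everything after it is the value.
    static Map.Entry<String, String> split(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            // No separator: the whole line is the key, the value is empty.
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, pos), line.substring(pos + 1));
    }

    public static void main(String[] args) {
        Map.Entry<String, String> kv = split("user42\tlogin\t2015-09-28", '\t');
        System.out.println(kv.getKey());   // prints user42
        System.out.println(kv.getValue()); // prints login, a tab, then 2015-09-28
    }
}
```

With this input format the sampled keys are the actual record keys, which is what a total sort needs; with TextInputFormat the keys would be byte offsets instead.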


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
