Introduction to the Commented MapReduce WordCount Example

Last Update: 2015-12-18 Source: Internet

Author: User



MapReduce version: 0.2.0 and earlier

Description

These comments come from an article I found while studying MapReduce earlier. Now that I have gotten started with it, I have revised and extended them here.

Because of the version difference, the code does not run in a cluster environment; treat it only as a reference for understanding MapReduce.

Remember that this is the 0.2.0 version of the API. Do not confuse it with later versions!

Body:

    package org.apache.hadoop.examples;

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.StringTokenizer;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.hadoop.mapred.TextInputFormat;
    import org.apache.hadoop.mapred.TextOutputFormat;

    public class WordCount {

        // The Map class extends MapReduceBase and implements the Mapper interface, which is a generic type.
        // It takes 4 type parameters, which specify the map's input key type, input value type,
        // output key type, and output value type, respectively.
        public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            // Implement the map method to process the input value
            // (here it splits each line into words on whitespace).
            public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    word.set(tokenizer.nextToken());
                    output.collect(word, one);
                }
            }
        }

        /*
         * The Reduce class also extends MapReduceBase and implements the Reducer interface.
         * It takes the output of the map as its input, so its input type is <Text, IntWritable>.
         * Its output is a word and that word's count, so its output type is <Text, IntWritable>.
         * The Reduce class implements the reduce method, which uses the input key as the output key
         * and adds up all the values received for that key to produce the output value.
         */
        public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {

            public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
                int sum = 0;
                while (values.hasNext()) {
                    sum += values.next().get();
                }
                output.collect(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            // Step 1: initialize a MapReduce job with the JobConf class.
            JobConf conf = new JobConf(WordCount.class);
            // Call setJobName() to name the job.
            conf.setJobName("WordCount");

            // Step 2: set the key and value types of the job's output <key, value> pairs.
            // The result is <word, count>, so the key is set to Text,
            // which plays the role of Java's String.
            conf.setOutputKeyClass(Text.class);
            // The value is set to IntWritable, which plays the role of Java's int.
            conf.setOutputValueClass(IntWritable.class);

            // Step 3: specify the job's Mapper, Combiner, and Reducer.
            // Set the Mapper (splitting) for the job.
            conf.setMapperClass(Map.class);
            // Set the Combiner (merging of intermediate results). The Reduce class is reused here
            // to merge the intermediate results produced by each map, which reduces the amount of
            // data sent over the network. This setting is optional; the job also runs without it.
            conf.setCombinerClass(Reduce.class);
            // Set the Reducer (merging) for the job.
            conf.setReducerClass(Reduce.class);

            // Specify the input and output paths. In the IDE they can be configured by right-clicking
            // the project -> Run As -> Run Configurations -> Arguments -> Program arguments;
            // these values become the String[] args of main(String[] args).
            // Input path, e.g. hdfs://master:9000/input1/
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            // Output path, e.g. a directory under hdfs://master:9000/ that does not exist yet
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));

            JobClient.runJob(conf);
        }
    }
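
To make the data flow described in the comments concrete, here is a minimal sketch in plain Java, with no Hadoop dependency, that imitates the map, combine, shuffle, and reduce steps on in-memory strings. The class name LocalWordCount and all of its methods are hypothetical and exist only for illustration. Treating each input line as one mapper's input is an assumption of this sketch, not how Hadoop actually splits data.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical illustration class: simulates the WordCount data flow in memory.
// It does not use or reproduce any Hadoop API.
final class LocalWordCount {

    // Map step: split one line on whitespace and emit a (word, 1) pair per token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            out.add(new SimpleEntry<>(tokenizer.nextToken(), 1));
        }
        return out;
    }

    // Shuffle step: group the emitted values by key (TreeMap keeps keys sorted).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: add up all values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Combine step: apply the same reduce logic to one mapper's local output,
    // so fewer pairs have to travel to the final reduce.
    static List<Map.Entry<String, Integer>> combine(List<Map.Entry<String, Integer>> mapOutput) {
        List<Map.Entry<String, Integer>> partial = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(mapOutput).entrySet()) {
            partial.add(new SimpleEntry<>(group.getKey(), reduce(group.getValue())));
        }
        return partial;
    }

    // Full pipeline; each line is treated as one mapper's input (an assumption of this sketch).
    static Map<String, Integer> run(List<String> lines, boolean useCombiner) {
        List<Map.Entry<String, Integer>> shuffled = new ArrayList<>();
        for (String line : lines) {
            List<Map.Entry<String, Integer>> mapOutput = map(line);
            shuffled.addAll(useCombiner ? combine(mapOutput) : mapOutput);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(shuffled).entrySet()) {
            counts.put(group.getKey(), reduce(group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello world hello", "hello hadoop");
        System.out.println(run(lines, true)); // prints {hadoop=1, hello=3, world=1}
    }
}
```

Calling run with useCombiner set to true or false yields the same counts. The combiner only shrinks the intermediate data, which is why the job above can reuse Reduce as its combiner: summation gives the same total no matter how the partial sums are grouped.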


