Introduction to MapReduce: the annotated WordCount example

Source: Internet
Author: User
Tags: map, class

MapReduce version: 0.2.0 and earlier

Description

This annotated example is based on an article I came across while studying earlier. After getting started with MapReduce, I have only revised it and added my own comments.

Because of the version difference, the code has not been run in a cluster environment. Treat it only as a reference for understanding MapReduce.

Remember that this code targets version 0.2.0; do not confuse it with later versions!

Body:

  

package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class WordCount {

    // The Map class extends MapReduceBase and implements the Mapper interface,
    // which is a generic type. Its four type parameters specify the map's
    // input key type, input value type, output key type and output value type.
    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Implement the map method to process each input value: the line is
        // split into tokens on whitespace and (word, 1) is emitted per token.
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    /*
     * The Reduce class also extends MapReduceBase and implements the Reducer
     * interface. It takes the output of the map as its input, so its input
     * type is <Text, IntWritable>. The output of reduce is a word and its
     * count, so its output type is also <Text, IntWritable>.
     * The reduce method uses the input key as the output key and adds up the
     * values received for that key to produce the output value.
     */
    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        // Step 1: initialize a MapReduce job with the JobConf class.
        JobConf conf = new JobConf(WordCount.class);
        // Call setJobName() to name the job.
        conf.setJobName("WordCount");

        // Step 2: set the key and value types of the job output <key, value>.
        // The result is <word, count>, so the key is set to Text, which
        // corresponds to the String type in Java.
        conf.setOutputKeyClass(Text.class);
        // The value is set to IntWritable, which corresponds to int in Java.
        conf.setOutputValueClass(IntWritable.class);

        // Step 3: specify the job's Mapper, Combiner and Reducer.
        // Set the Mapper (the split phase).
        conf.setMapperClass(Map.class);
        // Set the Combiner, which merges intermediate results. The Reduce class
        // is reused here to merge the intermediate results produced by the map,
        // which reduces the amount of data sent over the network.
        // Setting a combiner is optional; the job also runs without one.
        conf.setCombinerClass(Reduce.class);
        // Set the Reducer (the merge phase).
        conf.setReducerClass(Reduce.class);

        // Specify the input and output paths. They can be configured on the
        // project by right-clicking -> Run As -> Run Configurations ->
        // Arguments -> Program arguments, which supplies the String[] args
        // passed to main(String[] args).
        // Specify the input path, e.g. hdfs://master:9000/input1/
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        // Specify the output path in the same hdfs://master:9000/ form;
        // the output directory must not already exist.
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
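Since the code above needs a Hadoop installation to run, the data flow it describes can also be followed with a small plain-Java sketch. This is only an illustration under assumptions: the class name WordCountSketch and its methods map, combine, shuffle, reduce and countWords are hypothetical and are not part of Hadoop, each input line stands in for one map task, and the grouping that the framework performs between map and reduce is imitated with a TreeMap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; it does not use any Hadoop classes.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(Map.entry(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle (imitated): group all values by key, with keys kept sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: add up all values received for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Combine: apply the reduce logic to the output of a single map call,
    // so fewer pairs have to be passed on to the final reduce.
    static List<Map.Entry<String, Integer>> combine(List<Map.Entry<String, Integer>> pairs) {
        List<Map.Entry<String, Integer>> combined = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(pairs).entrySet()) {
            combined.add(Map.entry(e.getKey(), reduce(e.getKey(), e.getValue())));
        }
        return combined;
    }

    // Whole job: map and combine each line, then shuffle and reduce.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> intermediate = new ArrayList<>();
        for (String line : lines) {
            intermediate.addAll(combine(map(line)));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(intermediate).entrySet()) {
            counts.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {hadoop=1, hello=2, world=1}
        System.out.println(countWords(List.of("hello world", "hello hadoop")));
    }
}
```

The combine step can reuse the reduce logic only because addition gives the same total whether counts are summed in one pass or as partial sums first, which is the same reason the Reduce class can serve as the combiner in the job above.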

  
