Requirement: compute each student's average score across multiple courses.
Sample input:

Math.txt
Zhangsan 90
Lisi 88
Wanghua 80

China.txt
Zhangsan 80
Lisi 90
Wanghua 88

Expected output:

Zhangsan 85
Lisi 89
Wanghua 84
Analysis:

Mapper stage:
1. <K1,V1>: K1 is the position (byte offset) of the line in the input file; V1 is one line of text.
2. <K2,V2>: K2 is the student's name; V2 is a single score.

Reducer stage:
3. <K3,V3>: K3 is the name (all records with the same key are grouped together); V3 is the list of that student's scores.
4. Output <K4,V4>: K4 is the name; V4 is the average score.
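To make the arithmetic in step 4 concrete, here is a minimal standalone sketch of the averaging logic outside Hadoop (the `AverageDemo` class and `average` method names are illustrative, not part of the job):

```java
import java.util.Arrays;
import java.util.List;

public class AverageDemo {
    // Same arithmetic as the reducer: sum all scores for one key,
    // then integer-divide by the number of scores.
    static int average(List<Integer> scores) {
        int sum = 0;
        for (int s : scores) {
            sum += s;
        }
        return sum / scores.size(); // integer division truncates, e.g. 169 / 2 = 84
    }

    public static void main(String[] args) {
        // Zhangsan: Math 90, China 80 -> (90 + 80) / 2 = 85
        System.out.println(average(Arrays.asList(90, 80))); // prints 85
    }
}
```

Note that because the sum is divided as an int, averages are truncated rather than rounded.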
Program section:
AverageMapper class:

package com.cn.average;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class AverageMapper extends Mapper<Object, Text, Text, IntWritable> {
    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split a line such as "Zhangsan 90" into name and score
        String[] strings = new String[2];
        int i = 0;
        String line = value.toString();
        StringTokenizer tokenizerVal = new StringTokenizer(line);
        while (tokenizerVal.hasMoreElements()) {
            strings[i] = tokenizerVal.nextToken();
            i++;
        }
        // Emit <name, score>
        context.write(new Text(strings[0]), new IntWritable(Integer.parseInt(strings[1])));
    }
}
AverageReduce class:

package com.cn.average;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class AverageReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum all scores for this student and count them
        int sum = 0;
        int i = 0;
        for (IntWritable value : values) {
            sum += value.get();
            i++;
        }
        // Emit <name, average> (integer division)
        context.write(key, new IntWritable(sum / i));
    }
}
DataAverage class:

package com.cn.average;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

/**
 * Average
 * @author root
 */
public class DataAverage {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: DataAverage <in> <out>");
            System.exit(2);
        }
        // Create a job
        Job job = new Job(conf, "Data Average");
        job.setJarByClass(DataAverage.class);
        // Set the input and output paths of the files
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        // Set the mapper and reducer classes
        job.setMapperClass(AverageMapper.class);
        job.setReducerClass(AverageReduce.class);
        // Set the output key/value data types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Submit the job and wait for it to complete
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
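A sketch of how the job might be packaged and run; the jar name and HDFS paths below are assumptions, so adjust them to your environment:

```shell
# Hypothetical jar name and HDFS paths -- adapt to your setup.
hdfs dfs -mkdir -p /user/root/average/input
hdfs dfs -put Math.txt China.txt /user/root/average/input

# Run the driver class; the output directory must not already exist.
hadoop jar average.jar com.cn.average.DataAverage \
    /user/root/average/input /user/root/average/output

# Inspect the result (a single reducer writes part-r-00000 by default).
hdfs dfs -cat /user/root/average/output/part-r-00000
```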
Sum up a little every day, and there is always something new to gain.