1. Program code
Map:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.util.StringUtils;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split the input line on spaces; each token is one word.
        String[] words = StringUtils.split(value.toString(), ' ');
        for (String word : words) {
            // Emit (word, 1) for every occurrence.
            context.write(new Text(word), new IntWritable(1));
        }
    }
}
Reduce:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        // Sum the 1s emitted by the mapper for this word.
        for (IntWritable i : values) {
            sum += i.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
Main:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RunJob {
    public static void main(String[] args) {
        Configuration config = new Configuration();
        try {
            FileSystem fs = FileSystem.get(config);
            Job job = Job.getInstance(config);
            job.setJobName("WordCount");
            job.setJarByClass(RunJob.class);
            job.setMapperClass(WordCountMapper.class);
            job.setReducerClass(WordCountReducer.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path("/usr/input/"));
            Path outPath = new Path("/usr/output/wc/");
            // Delete the output directory if it already exists;
            // otherwise the job fails on startup.
            if (fs.exists(outPath)) {
                fs.delete(outPath, true);
            }
            FileOutputFormat.setOutputPath(job, outPath);
            boolean result = job.waitForCompletion(true);
            if (result) {
                System.out.println("Job is complete!");
            } else {
                System.out.println("Job failed!");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
2. Packaging procedure
Package the Java program into a JAR and upload it to the Hadoop server (any running NameNode node will do), for example as sketched below.
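A minimal command-line sketch, assuming the compiled classes live under com/raphael/wc and using a hypothetical user and host (adapt the names and paths to your cluster):
# Bundle the compiled classes into a JAR (here named wc.jar).
$ jar cf wc.jar com/raphael/wc/*.class
# Copy the JAR to a node of the cluster (hypothetical user/host).
$ scp wc.jar hadoop@namenode1:~/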
3. Data source
The data source is as follows:
hadoop java text hdfs
tom jack java text
job hadoop abc lusi
hdfs tom text
Put the content in a .txt file and place it under /usr/input in HDFS (an HDFS path, not a Linux path). You can upload it with the Eclipse plugin, or from the command line as sketched below:
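A command-line alternative to the Eclipse plugin, assuming the file is named wc.txt (a hypothetical name):
# Create the input directory in HDFS and upload the file.
$ hdfs dfs -mkdir -p /usr/input
$ hdfs dfs -put wc.txt /usr/input/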
4. Execute the JAR package
# hadoop jar <jar path> <fully qualified class name>  (requires the Hadoop environment variables to be configured)
$ hadoop jar wc.jar com.raphael.wc.RunJob
After execution, a new output directory /usr/output/wc is created in HDFS.
To view the execution results:
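A sketch using the HDFS shell, assuming the default name of the first reduce output file, part-r-00000:
$ hdfs dfs -cat /usr/output/wc/part-r-00000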
abc	1
hadoop	2
hdfs	2
jack	1
java	2
job	1
lusi	1
text	3
tom	2
This completes the word-count statistics.