First, the data-processing (driver) class
package com.css.hdfs;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Map.Entry;
import java.util.Properties;
import java.util.Set;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

/**
 * Requirement: given a file (e.g. "hello world hello teacher hello john tom"),
 * count the number of occurrences of each word.
 * The input data lives in HDFS, and the statistics are written back to HDFS.
 *
 * Background: Google's 2004-era papers (GFS / BigTable / MapReduce):
 *   1. storage of massive data     -> HDFS
 *   2. computation over massive data -> MapReduce
 *
 * Expected output:
 *   hello  3
 *   world  1
 *   ...
 *
 * The framework is driven entirely by the user:
 *   - the user supplies the input data path
 *   - the user supplies the processing logic (a Mapper implementation)
 *   - the user specifies where the result data is stored
 */
public class HdfsWordCount {

    public static void main(String[] args) throws IOException, ClassNotFoundException,
            InstantiationException, IllegalAccessException, InterruptedException, URISyntaxException {
        // Load the job configuration file from the classpath
        Properties pro = new Properties();
        pro.load(HdfsWordCount.class.getClassLoader().getResourceAsStream("job.properties"));
        Path inPath = new Path(pro.getProperty("in_path"));
        Path outPath = new Path(pro.getProperty("out_path"));
        Class<?> mapperClass = Class.forName(pro.getProperty("mapper_class"));

        // Instantiate the user-supplied mapper by reflection
        Mapper mapper = (Mapper) mapperClass.newInstance();
        Context context = new Context();

        // Build the HDFS client object
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.146.132:9000/"), conf, "root");

        // Read the user's input files
        RemoteIterator<LocatedFileStatus> iter = fs.listFiles(inPath, false);
        while (iter.hasNext()) {
            LocatedFileStatus file = iter.next();
            // Open the path to get an input stream
            FSDataInputStream in = fs.open(file.getPath());
            BufferedReader br = new BufferedReader(new InputStreamReader(in, "utf-8"));
            String line = null;
            while ((line = br.readLine()) != null) {
                // Call the map method to run the user's business logic
                mapper.map(line, context);
            }
            // Close resources
            br.close();
            in.close();
        }

        // If the result directory does not exist yet, create it
        Path out = new Path("/wc/out/");
        if (!fs.exists(out)) {
            fs.mkdirs(out);
        }

        // Write the cached results to HDFS
        HashMap<Object, Object> contextMap = context.getContextMap();
        FSDataOutputStream out1 = fs.create(outPath);
        // Traverse the HashMap
        Set<Entry<Object, Object>> entrySet = contextMap.entrySet();
        for (Entry<Object, Object> entry : entrySet) {
            // Write one "word \t count" line per entry
            out1.write((entry.getKey().toString() + "\t" + entry.getValue() + "\n").getBytes());
        }
        // Close resources
        out1.close();
        fs.close();
        System.out.println("Word count finished; results written to HDFS ...");
    }
}
Second, the Mapper interface
package com.css.hdfs;

/**
 * The user-facing processing interface: the framework calls map() once
 * for every line of input.
 */
public interface Mapper {
    // called by the framework for each line
    public void map(String line, Context context);
}
Third, the data-transmission (Context) class
package com.css.hdfs;

import java.util.HashMap;

/**
 * Idea: a data-transmission class that encapsulates and collects
 * <word, count> pairs.
 */
public class Context {
    // Data encapsulation
    private HashMap<Object, Object> contextMap = new HashMap<>();

    // Write data: put the pair into the map
    public void write(Object key, Object value) {
        contextMap.put(key, value);
    }

    // Get a value by key
    public Object get(Object key) {
        return contextMap.get(key);
    }

    // Get the whole map contents
    public HashMap<Object, Object> getContextMap() {
        return contextMap;
    }
}
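A minimal usage sketch of the Context class: because values are stored as Object, callers must cast on read before arithmetic (exactly what the word-count mapper does). The nested Context here is a copy of the class above so the file compiles on its own; the class name ContextDemo is hypothetical.

```java
import java.util.HashMap;

public class ContextDemo {
    // Stand-in with the same write/get semantics as the Context class above
    static class Context {
        private final HashMap<Object, Object> contextMap = new HashMap<>();
        public void write(Object key, Object value) { contextMap.put(key, value); }
        public Object get(Object key) { return contextMap.get(key); }
        public HashMap<Object, Object> getContextMap() { return contextMap; }
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        ctx.write("hello", 1);
        // Values come back as Object, so cast to int before incrementing;
        // write() overwrites the old count with the new one
        ctx.write("hello", (int) ctx.get("hello") + 1);
        System.out.println(ctx.get("hello")); // prints 2
    }
}
```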
Fourth, the word-count mapper class
package com.css.hdfs;

/**
 * Idea: implement the map method to split each line into words and
 * increment the count stored under the matching key.
 */
public class WordCountMapper implements Mapper {

    @Override
    public void map(String line, Context context) {
        // Split this slice of data into words
        String[] words = line.split(" ");
        // Increment the count for each word, e.g. hello 1, world 1
        for (String word : words) {
            Object value = context.get(word);
            if (null == value) {
                context.write(word, 1);
            } else {
                // Existing count: cast and increment
                int v = (int) value;
                context.write(word, v + 1);
            }
        }
    }
}
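The mapper and context can be exercised locally without an HDFS cluster, which makes the counting logic easy to verify. The sketch below inlines simplified copies of the two classes above into one file (the class name LocalWordCountDemo is hypothetical) and feeds the mapper two lines of text by hand, standing in for the driver's line-reading loop.

```java
import java.util.HashMap;

public class LocalWordCountDemo {

    // Simplified stand-in for the Context class above
    static class Context {
        private final HashMap<Object, Object> contextMap = new HashMap<>();
        public void write(Object key, Object value) { contextMap.put(key, value); }
        public Object get(Object key) { return contextMap.get(key); }
        public HashMap<Object, Object> getContextMap() { return contextMap; }
    }

    // Same counting logic as WordCountMapper above
    static void map(String line, Context context) {
        for (String word : line.split(" ")) {
            Object value = context.get(word);
            if (value == null) {
                context.write(word, 1);
            } else {
                context.write(word, (int) value + 1);
            }
        }
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        // The driver would call map() once per line read from HDFS
        map("hello world hello teacher", ctx);
        map("hello john tom", ctx);
        System.out.println("hello=" + ctx.get("hello") + " world=" + ctx.get("world"));
        // prints hello=3 world=1
    }
}
```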
Fifth, the configuration file job.properties
in_path=/wc/in
out_path=/wc/out/rs.txt
mapper_class=com.css.hdfs.WordCountMapper
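The mapper_class entry is what makes the framework pluggable: the driver never names a concrete mapper, it loads whatever class the properties point at via reflection. The sketch below isolates just that step; the properties are built in memory instead of being read from job.properties, and the class names (ReflectionDemo, UpperCaseMapper) are hypothetical stand-ins.

```java
import java.util.Properties;

public class ReflectionDemo {
    public interface Mapper { void map(String line); }

    // A stand-in user mapper; any class implementing Mapper would do
    public static class UpperCaseMapper implements Mapper {
        public void map(String line) { System.out.println(line.toUpperCase()); }
    }

    public static void main(String[] args) throws Exception {
        Properties pro = new Properties();
        // In the real job this value comes from the mapper_class line
        // of job.properties; here it is set in memory
        pro.setProperty("mapper_class", UpperCaseMapper.class.getName());

        // Same reflection steps as the driver: forName + newInstance
        Class<?> mapperClass = Class.forName(pro.getProperty("mapper_class"));
        Mapper mapper = (Mapper) mapperClass.newInstance();
        mapper.map("hello world"); // prints HELLO WORLD
    }
}
```

Swapping the mapper_class value to a different implementation changes the job's logic without recompiling the driver.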
A hand-written MapReduce-style word-count framework on HDFS