Next "Ubuntu Kylin system Installation Hadoop2.6.0"
In the previous article, Hadoop Pseudo-distributed is basically well-equipped.
The next step is to run a mapreduce program, taking WordCount as an example:
1. Create the implementation class:
cd /usr/local/hadoop
mkdir workspace
cd workspace
gedit WordCount.java
Copy and paste the following code:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
A detailed analysis of the code is given in the next article.
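Before the full analysis, the core logic is easy to see without Hadoop: the mapper tokenizes each line of text, and the reducer sums one count per occurrence of each word. A minimal plain-Java sketch of that idea (the class name WordCountSketch and the count helper are illustrative, not part of the Hadoop code above):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class WordCountSketch {
    // Mirrors the two phases of the Hadoop job in ordinary Java:
    // tokenizing each line corresponds to the map step, and summing
    // a count of 1 per token corresponds to the combine/reduce step.
    static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
            count(new String[] { "hello world", "hello hadoop" });
        System.out.println(counts.get("hello")); // prints 2
        System.out.println(counts.get("world")); // prints 1
    }
}
```

Hadoop distributes exactly this work: map tasks emit (word, 1) pairs in parallel, and reduce tasks receive all pairs for one word and add them up.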
2. Compiling
(1) Set JAVA_HOME:
export JAVA_HOME=/usr/lib/jvm/java-8u5-sun
If you have forgotten where the JDK is installed, check the current value with:
echo $JAVA_HOME
(2) Add the bin folder under the JDK directory to the PATH:
export PATH=$JAVA_HOME/bin:$PATH
(3) Add HADOOP_CLASSPATH to the environment:
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
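Note that these export lines only last for the current terminal session. As a convenience, one way to make them permanent is to append them to ~/.bashrc (a sketch; the JDK path is the one assumed earlier in this article and must match your actual installation):

```shell
# Persist the environment variables from steps (1)-(3) by appending
# them to ~/.bashrc so they apply to every new shell session.
# NOTE: the JDK path below is the one used in this article -- replace
# it with the path of the JDK actually installed on your machine.
cat >> ~/.bashrc <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java-8u5-sun
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
EOF
source ~/.bashrc   # reload so the current shell picks them up too
```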
Compile the WordCount.java file:
../bin/hadoop com.sun.tools.javac.Main WordCount.java
Here com.sun.tools.javac.Main is the entry point of the JDK's built-in Java compiler (it lives in tools.jar, which is why HADOOP_CLASSPATH was set above).
The command generates three class files: WordCount.class, WordCount$TokenizerMapper.class, and WordCount$IntSumReducer.class.
Package the three classes into a jar:
jar cf WordCount.jar WordCount*.class
This produces WordCount.jar.
3. Running
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/hadoop
Construct the input:
bin/hdfs dfs -put etc/hadoop input
Here etc/hadoop is used as sample input and can be replaced with any other files.
Run the job (the output directory must not already exist; WordCount is the main class compiled above):
bin/hadoop jar workspace/WordCount.jar WordCount input output
View the run results:
bin/hdfs dfs -cat output/*
4. Stopping Hadoop
sbin/stop-dfs.sh