One. Eclipse Hadoop environment configuration
1. Configure environment variables: right-click My Computer → Properties → Advanced system settings → Environment Variables, then set:
JAVA_HOME=D:\ProgramFiles\Java\jdk1.7.0_67
HADOOP_HOME=D:\tedp_software\hadoop-2.4.0
Path=.;%JAVA_HOME%\bin;%HADOOP_HOME%\bin;
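Before moving on, it is worth confirming that the JVM actually sees the variables just configured (a freshly opened shell is required after editing them). A minimal sketch; the class and helper names are my own, only the variable names JAVA_HOME and HADOOP_HOME come from the setup above:

```java
// EnvCheck: print whether the variables configured above are visible to the JVM.
public class EnvCheck {

    // Returns the value of an environment variable, or a fallback if it is unset/empty.
    static String envOrDefault(String name, String fallback) {
        String value = System.getenv(name);
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        System.out.println("JAVA_HOME   = " + envOrDefault("JAVA_HOME", "<not set>"));
        System.out.println("HADOOP_HOME = " + envOrDefault("HADOOP_HOME", "<not set>"));
    }
}
```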
2. Install the hadoop-eclipse-kepler-plugin-2.2.0.jar plugin in Eclipse and configure the Hadoop server.
Two. The WordCount program
1. Prepare the test files:
[[email protected] hadoop]# mkdir file
[[email protected] hadoop]# cd file
[[email protected] file]# ls
[[email protected] file]# echo "Hello World" > file1.txt
[[email protected] file]# echo "Hello Hadoop" > file2.txt
2. Upload the files to HDFS:
Create the folder: hadoop fs -mkdir /user
Set permissions: hadoop fs -chmod -R 777 /user
Create the input folder: hadoop fs -mkdir /user/input
List the folders: hadoop fs -ls /
Upload the files: hadoop fs -put ~/file/file*.txt /user/input
Error 1: java.net.NoRouteToHostException: No route to host (or, in Hive: could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.)
Cause: the firewall has not been turned off. Switch to root on each host and run: service iptables stop
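The "No route to host" error above is an ordinary java.net failure, so it can be reproduced outside Hadoop: try a plain TCP connection from the client machine to the NameNode port (9000 in this setup) before digging into Hadoop itself. A minimal sketch; the class name is my own, and the host/port in main are the values used elsewhere in this article:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// PortCheck: attempt a plain TCP connection to host:port within timeoutMillis.
public class PortCheck {

    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;  // connection accepted: no routing/firewall problem on this port
        } catch (IOException e) {
            return false; // NoRouteToHostException, ConnectException, timeout, ...
        }
    }

    public static void main(String[] args) {
        System.out.println(canConnect("192.168.1.200", 9000, 2000)
                ? "NameNode port reachable"
                : "NameNode port NOT reachable -- check iptables on every host");
    }
}
```

If this prints "NOT reachable" while the cluster is up, the problem is the network or firewall, not the Hadoop configuration.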
3. Create a new MapReduce project and copy the attached WordCount.java into it. Right-click the WordCount class → Run As → Run Configurations, and enter the following program arguments:
hdfs://192.168.1.200:9000/user/input hdfs://192.168.1.200:9000/user/output
4. Run on Hadoop
(1) Exception 1: Exception in thread "main" java.lang.NullPointerException
Workaround: according to posts found via Baidu, this is a bug in Hadoop on Windows; it does not occur on Linux. Download hadoop-common-2.2.0-bin-master.zip, extract it, and replace the files in .\hadoop-2.4.0\bin with the files from its bin directory. Then copy hadoop.dll from that bin into C:\Windows\System32 and restart the computer.
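Both halves of this fix come down to native binaries existing in the right place, so a quick pre-flight check saves a confusing stack trace later. A minimal sketch; the class and helper names are my own, and the path in main is an assumption matching the HADOOP_HOME configured earlier:

```java
import java.io.File;

// PreFlight: verify the Windows native binaries the job needs are in place.
public class PreFlight {

    // True if dir\bin contains every named file.
    static boolean binContains(String dir, String... names) {
        for (String name : names) {
            if (!new File(new File(dir, "bin"), name).isFile()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String hadoopHome = "D:\\tedp_software\\hadoop-2.4.0"; // assumption: matches HADOOP_HOME above
        if (!binContains(hadoopHome, "winutils.exe", "hadoop.dll")) {
            System.err.println("Missing native binaries in " + hadoopHome + "\\bin -- apply fix (1) first");
        }
    }
}
```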
(2) Exception 2: 14/12/02 21:01:01 ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
Workaround: configure the local environment variable HADOOP_HOME=D:\soft\linux\hadoop-2.4.0 (requires a reboot).
If you do not want to reboot, add this to the code instead: System.setProperty("hadoop.home.dir", "D:\\soft\\linux\\hadoop-2.4.0");
(3) Exception 3: Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://192.168.1.200:9000/user/output already exists
Workaround: the output folder already exists; change the output folder, or delete it between runs.
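The no-reboot workaround for exception (2) works because hadoop.home.dir is an ordinary JVM system property read at runtime; the call just has to run before any Hadoop class inspects it. A minimal sketch of that ordering; the class and helper names are my own, only the property name and path come from the workaround above:

```java
// HomeDirSetup: set hadoop.home.dir programmatically, before any Hadoop code runs.
public class HomeDirSetup {

    // Sets the property only if nothing (e.g. an earlier call) provided it already,
    // and returns the effective value.
    static String ensureHadoopHome(String path) {
        if (System.getProperty("hadoop.home.dir") == null) {
            System.setProperty("hadoop.home.dir", path);
        }
        return System.getProperty("hadoop.home.dir");
    }

    public static void main(String[] args) {
        ensureHadoopHome("D:\\soft\\linux\\hadoop-2.4.0");
        // ... only now construct the JobConf and submit the job ...
    }
}
```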
(4) Exception 4: the console printed [97;97;98;99;13p[0m and then stopped responding (this error occurred later, when creating a second Hadoop program).
Workaround: in Run Configurations → Main, the Main class was set to jline.ANSIBuffer; change it to WordCount, then click Run.
Note: if you run via Run As → Run on Hadoop, type or select WordCount when the Select Type dialog pops up.
5. The run succeeds; the result:
Hello 2
Hadoop 1
World 1
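The result follows directly from the job's logic: the mapper splits each line with StringTokenizer and emits each token with a count of 1, and the reducer sums the counts per word. The same logic can be checked without a cluster; a minimal in-memory sketch (the class and method names are my own):

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// LocalWordCount: the same tokenize-then-sum logic as the MapReduce job, in memory.
public class LocalWordCount {

    static Map<String, Integer> count(String... lines) {
        Map<String, Integer> counts = new TreeMap<String, Integer>(); // sorted keys, like the job output
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                String word = tokenizer.nextToken();
                Integer previous = counts.get(word);
                counts.put(word, previous == null ? 1 : previous + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The contents of the two test files created earlier.
        for (Map.Entry<String, Integer> entry : count("Hello World", "Hello Hadoop").entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```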
6. Annex: the WordCount.java file

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        // System.setProperty("hadoop.home.dir", "D:\\soft\\linux\\hadoop-2.4.0");
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
This article references: http://www.cnblogs.com/xia520pi/archive/2012/05/16/2504205.html
Finish
First Hadoop program (Hadoop 2.4.0 cluster + Eclipse environment)