Pre-work
My Eclipse is installed on Windows. To connect from Eclipse to Hadoop, the virtual machine's address and the Windows host's address must be on the same network segment. The virtual machine's address was described earlier in the article. If you need to change the Windows host's IP address, open "Network and Sharing Center", click "Change adapter settings" in the left menu, select the appropriate connection, and edit its IPv4 properties. My virtual machine's address is 192.168.3.137.
Preparatory work
After the addresses are configured, install the Hadoop plugin for Eclipse (you can also build it yourself from the plugin source).
Open the Eclipse installation path ...\eclipse\plugins and put hadoop-eclipse-plugin-1.1.2.jar in this directory.
Create a new directory on Windows (mine is E:\hadoopMapReduceDir) and copy all the jars from the Linux Hadoop installation package into it for later use.
Build a Hadoop environment on Ubuntu 13.04 http://www.linuxidc.com/Linux/2013-06/86106.htm
Ubuntu 12.10 + Hadoop 1.2.1 cluster configuration http://www.linuxidc.com/Linux/2013-09/90600.htm
Build a Hadoop environment on Ubuntu (standalone mode + pseudo-distributed mode) http://www.linuxidc.com/Linux/2013-01/77681.htm
Configuring the Hadoop environment under Ubuntu http://www.linuxidc.com/Linux/2012-11/74539.htm
Detailed illustrated tutorial for a standalone Hadoop environment http://www.linuxidc.com/Linux/2012-02/53927.htm
Build a Hadoop environment (two virtual Ubuntu systems under Windows) http://www.linuxidc.com/Linux/2011-12/48894.htm
Configuration work
Open Eclipse, click Window > Show View > Other in the menu bar to open the view selection window, as follows:
Locate the elephant icon under Map/Reduce Tools and drag it to the bottom pane of the Eclipse window (next to the Console view).
Open the New Hadoop Location configuration window and set the connection information: the location name is whatever you like; fill in the two ports and the username as shown in the figure (the Hadoop defaults).
Click Window > Preferences in the Eclipse menu bar, find Hadoop Map/Reduce, and on the right select the path from which to import the Hadoop jars; after this, new Hadoop projects will load the jar packages from that path automatically. The path set here is the directory prepared above:
Create a project
Click File > New > Other to open the New dialog and create a Map/Reduce Project named Newhadooptest.
Once created, you will see that the jar packages are automatically added to the project.
Copy the example Java sources from the src directory of the Hadoop installation on Linux into the src directory of the project just created.
Start Hadoop on Linux (verify with jps that it is running). After Eclipse connects, the HDFS directory tree is displayed as follows:
Modify Code
Now that setup is basically complete, the next step is to modify WordCount.java and configure its HDFS paths.
Open WordCount.java (it may initially show compile errors) and change the main method as follows:
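The screenshot of the modified main method is missing here; below is a sketch of the standard Hadoop 1.x WordCount driver for reference. It reads the input and output paths from the program arguments (supplied later via Run Configurations, e.g. HDFS URLs pointing at the VM address 192.168.3.137); the port in such URLs must match fs.default.name in your core-site.xml, which is an assumption of this sketch. It requires the Hadoop jars (e.g. hadoop-core-1.1.2.jar) on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Strip generic Hadoop options; what remains should be <in> and <out>,
    // e.g. hdfs://192.168.3.137:9000/user/hadoop/input and .../output
    // (the port must match fs.default.name in core-site.xml).
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
        System.err.println("Usage: wordcount <in> <out>");
        System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    // Blocks until the job finishes; exit code reflects success or failure.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
```

This is a fragment of the WordCount class, not a standalone program; it assumes the TokenizerMapper and IntSumReducer inner classes from the Hadoop example are present.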
If GenericOptionsParser cannot be resolved, add hadoop-core-1.1.2.jar to the project (or to the configured jar path). Then create a new file a.txt, enter some content, and save it.
Under the hadoop user, create an input folder in the /user/hadoop directory of HDFS and upload a.txt to it; the process is as follows. (Do not create the output directory in advance, or execution will fail with an error; if it already exists in HDFS, delete it on Linux with the command hadoop fs -rmr /output.)
Select WordCount.java, right-click Run As > Run Configurations, open the Arguments tab, and fill in the input and output paths (note: separate the input and output paths with a space). Here the job counts the occurrences of each word across all files under input.
Right-click WordCount.java and choose Run As > Run on Hadoop to execute.
After execution finishes, refresh DFS Locations/user/hadoop.
The output folder is generated automatically; click it to see the execution results. The final result is stored in part-r-00000; double-click it to view.
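For reference, part-r-00000 contains one word-TAB-count line per distinct word, sorted by key (the reducer output). The small stdlib-only sketch below reproduces that layout locally without Hadoop; the class name WordCountLocal and the sample text are illustrative, not part of the original tutorial.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountLocal {
    // Mimic what the MapReduce job computes: tokenize on whitespace,
    // count each word, and emit "word<TAB>count" lines sorted by key,
    // the same layout found in part-r-00000.
    public static List<String> count(String text) {
        TreeMap<String, Integer> counts = new TreeMap<>(); // sorted keys
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            lines.add(e.getKey() + "\t" + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Prints: hadoop 1, hello 2, world 1 (tab-separated, one per line)
        for (String line : count("hello hadoop hello world")) {
            System.out.println(line);
        }
    }
}
```

This mirrors the combined effect of TokenizerMapper and IntSumReducer on a single input file, which is useful for sanity-checking the cluster output.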
Eclipse executes Hadoop WordCount