My development environment:
Operating system: CentOS 5.5, with one NameNode and two DataNodes
Hadoop version: hadoop-0.20.203.0
Eclipse version: eclipse-java-helios-SR2-linux-gtk.tar.gz (Eclipse 3.7 kept crashing for me, which was frustrating)
Step 1: Start the Hadoop daemons
See http://www.cnblogs.com/flyoung2008/archive/2011/11/29/2268302.html for details
Step 2: Install the hadoop plug-in on Eclipse
1. Copy hadoop-0.20.203.0-eclipse-plugin.jar from the Hadoop installation directory's contrib/eclipse-plugin/ folder into the Eclipse installation directory's plugins/ folder.
2. Restart Eclipse and configure the Hadoop installation directory.
If the plugin installed successfully, open Window --> Preferences and you will find a Hadoop Map/Reduce option. There, set the Hadoop installation directory, then exit the dialog.
3. Configure Map/Reduce locations.
Open the Map/Reduce Locations view via Window --> Show View.
Create a new Hadoop location in the Map/Reduce Locations view: right-click --> New Hadoop location. In the dialog that pops up, configure the location name (for example, hadoop), the Map/Reduce Master, and the DFS Master. The host and port here are the addresses and ports you configured in mapred-site.xml and core-site.xml, respectively. For example:
Map/Reduce Master: host 192.168.1.101, port 9001
DFS Master: host 192.168.1.101, port 9000
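For reference, these host/port pairs come from your cluster configuration files. A sketch of the relevant entries, assuming the JobTracker and NameNode both run on 192.168.1.101 as in the example above (substitute your own host and ports):

```xml
<!-- mapred-site.xml: the Map/Reduce Master (JobTracker) address -->
<property>
  <name>mapred.job.tracker</name>
  <value>192.168.1.101:9001</value>
</property>

<!-- core-site.xml: the DFS Master (NameNode) address -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.1.101:9000</value>
</property>
```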
Finish after configuration. Expand DFS Locations --> hadoop. If folders are displayed (with a count such as "(2)"), the configuration is correct. If "no connection" is displayed, check your configuration.
Step 3: Create a project.
File --> New --> Other --> Map/Reduce Project
The project name can be anything you like, for example WordCount.
Copy src/examples/org/apache/hadoop/examples/WordCount.java from the Hadoop installation directory into the project you just created.
Step 4: Upload sample input data.
To run the program, we need an input folder on HDFS and an output folder (which the job will create).
Create a new file word.txt locally with the following content:
java c++ python c
java c++ javascript
helloworld hadoop
mapreduce java hadoop hbase
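The WordCount example simply tokenizes each line and sums the occurrences of each word. As a minimal plain-Java sketch of that map/shuffle/reduce flow (no Hadoop dependencies, and not the actual example's code; the class and method names here are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {
    // "Map" phase: emit a (word, 1) pair for every token on every line.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines)
            for (String word : line.trim().split("\\s+"))
                pairs.add(Map.entry(word, 1));
        return pairs;
    }

    // "Shuffle + reduce" phase: group pairs by word and sum their counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by key, like part-r-00000
        for (Map.Entry<String, Integer> p : pairs)
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "java c++ python c",
            "java c++ javascript",
            "helloworld hadoop",
            "mapreduce java hadoop hbase");
        reduce(map(lines)).forEach((w, n) -> System.out.println(w + "\t" + n));
    }
}
```

Running this on the word.txt content above reproduces the per-word counts the real job will write to its output file.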
Use the hadoop command to create a /tmp/wordcount directory on HDFS: bin/hadoop fs -mkdir /tmp/wordcount
Copy word.txt to HDFS with the copyFromLocal command: bin/hadoop fs -copyFromLocal /home/grid/word.txt /tmp/wordcount/word.txt
Step 5: Run the project
1. In the new project, click WordCount.java, then right-click --> Run As --> Run Configurations.
2. In the Run Configurations dialog that pops up, select Java Application, right-click --> New, and a new configuration named WordCount will be created.
3. Configure the run arguments: select the Arguments tab and, under Program arguments, enter the input file you want to pass to the program and the folder where it should save its results, for example:
hdfs://centos1:9000/tmp/wordcount/word.txt hdfs://centos1:9000/tmp/wordcount/out
4. If you hit java.lang.OutOfMemoryError: Java heap space, configure VM arguments (in the box below Program arguments):
-Xms512m -Xmx1024m -XX:MaxPermSize=256m
5. Click Run to run the program. After a while the job completes. To view the results, list the example's output with: bin/hadoop fs -ls /tmp/wordcount/out — you will find two folders and a file there. View the part-r-00000 file with: bin/hadoop fs -cat /tmp/wordcount/out/part-r-00000
c 1
c++ 2
hadoop 2
hbase 1
helloworld 1
java 3
javascript 1
mapreduce 1
python 1