We can use Eclipse as the development platform for Hadoop programs.
1) Download Eclipse
Download Address: http://www.eclipse.org/downloads/
Select the appropriate version for your operating system, then download and install it.
2) Download and compile the Eclipse plugin for Hadoop
The Eclipse plugin for Hadoop 1.x can be downloaded directly from the web, but it is not compatible with Hadoop 2.2 and cannot be used.
The Eclipse plugin for Hadoop 2.2 is still in development, so you need to download the source code and compile it yourself. I ran into some Ant configuration errors during compilation that prevented it from building; after fixing the Ant configuration, it compiled successfully. For everyone's convenience, I provide the compiled .jar file directly for download:
Source code download address: https://github.com/winghc/hadoop2x-eclipse-plugin
Compiled plugin download address: http://download.csdn.net/detail/zythy/6735167
3) Configure the Hadoop plugin
Place the downloaded hadoop-eclipse-plugin-2.2.0.jar file in Eclipse's dropins directory and restart Eclipse; the plugin should then take effect.
Open the Map/Reduce perspective from the Open Perspective menu.
Select the elephant icon, right-click, and choose Edit Hadoop location to edit the Hadoop configuration information.
Fill in the correct Map/Reduce and HDFS information (this depends on your configuration).
4) New Simple MapReduce Project
Create a new Map/Reduce project from the wizard. During this process, configure the installation path of Hadoop.
5) Access DFS through Eclipse
Open the resource view to see DFS.
At this point, you can work with DFS directly, for example uploading local files to HDFS.
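The same kind of upload can also be done programmatically through Hadoop's FileSystem API, which is what the plugin uses under the hood. A minimal sketch (the local and HDFS paths here are hypothetical; the URI matches the fs.default.name value from core-site.xml below, and a running cluster is required):

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUpload {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Connect to the NameNode configured in core-site.xml.
        FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);
        // Hypothetical source and destination paths -- adjust to your environment.
        fs.copyFromLocalFile(new Path("/tmp/sample.txt"),
                             new Path("/user/hadoop/sample.txt"));
        fs.close();
    }
}
```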
At this point, the Hadoop development environment is roughly configured. In the next section, we'll see how to write a simple MapReduce program and run it on the Hadoop cluster.
Below are several configuration files from my own local environment for your reference; corrections are welcome, thank you.
1) core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
2) hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/var/data/hadoop/hdfs/nn</value>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>file:/var/data/hadoop/hdfs/snn</value>
</property>
<property>
<name>fs.checkpoint.edits.dir</name>
<value>file:/var/data/hadoop/hdfs/snn</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/var/data/hadoop/hdfs/dn</value>
</property>
</configuration>
3) mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
4) yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
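All four files share the same flat structure: a configuration element containing name/value property pairs. As an illustration of the format only (using the JDK's built-in DOM parser rather than Hadoop's own Configuration class, which additionally handles defaults and variable expansion), such a file can be read like this:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ConfParser {
    // Parse a Hadoop-style configuration XML string into a name -> value map.
    static Map<String, String> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new LinkedHashMap<>();
        NodeList list = doc.getElementsByTagName("property");
        for (int i = 0; i < list.getLength(); i++) {
            Element p = (Element) list.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        String coreSite =
            "<configuration>"
            + "<property>"
            + "<name>fs.default.name</name>"
            + "<value>hdfs://localhost:9000</value>"
            + "</property>"
            + "</configuration>";
        // Prints the NameNode URI from the sample core-site.xml above.
        System.out.println(parse(coreSite).get("fs.default.name"));
    }
}
```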