0. Preface
This article draws on the blog: http://www.51itong.net/eclipse-hadoop2-7-0-12448.html
Before setting up the development environment, we built a pseudo-distributed Hadoop installation; see the previous post:
http://blog.csdn.net/xummgg/article/details/51173072
1. Download and install Eclipse
Download URL: http://www.eclipse.org/downloads/
Since we are running under Ubuntu, download the Linux 64-bit build (Java EE edition); by default the file is saved to the current user's Downloads directory.
Unzip the archive into /usr/local with the following command:
After decompression, the eclipse directory can be seen under /usr/local.
Because we will later add a new jar package into Eclipse, raise the permissions on the eclipse folder.
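The unzip-and-chmod step can be sketched as below. The archive name is an assumption (match it to the file you actually downloaded), and the real commands against /usr/local need sudo; the runnable lines rehearse the same tar/chmod pattern in a throwaway directory:

```shell
# Real commands (archive name assumed -- adjust to your download):
#   cd ~/Downloads
#   sudo tar -zxvf eclipse-jee-mars-2-linux-gtk-x86_64.tar.gz -C /usr/local
#   sudo chmod -R 777 /usr/local/eclipse
# Rehearsal of the same pattern in a temporary directory:
work=$(mktemp -d)
mkdir -p "$work/eclipse/plugins"                      # stand-in for the unpacked Eclipse tree
tar -czf "$work/eclipse.tar.gz" -C "$work" eclipse    # stand-in for the downloaded archive
rm -rf "$work/eclipse"
tar -zxf "$work/eclipse.tar.gz" -C "$work"            # extract, as with the real archive
chmod -R 777 "$work/eclipse"                          # open permissions so jars can be added later
stat -c %a "$work/eclipse"
```

The 777 mode is what the original tutorial uses; a narrower mode that still lets your user write to eclipse/plugins would also work.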
2. Download the Hadoop plugin
For Hadoop 2.6.4, use the plugin hadoop-eclipse-plugin-2.6.4.jar, available here:
http://download.csdn.net/download/tondayong1981/9437360
When the download is complete, copy the plugin into the eclipse/plugins directory.
The copy needs sudo, so enter your user password when prompted.
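The copy itself is a one-liner; the destination matches the Eclipse install location used above. The runnable lines rehearse the same copy in a temporary directory, since the real one needs sudo:

```shell
# Real command, run from the folder containing the downloaded jar:
#   sudo cp hadoop-eclipse-plugin-2.6.4.jar /usr/local/eclipse/plugins/
# Rehearsal in a temporary directory:
work=$(mktemp -d)
mkdir -p "$work/eclipse/plugins"
touch "$work/hadoop-eclipse-plugin-2.6.4.jar"   # stand-in for the downloaded plugin
cp "$work/hadoop-eclipse-plugin-2.6.4.jar" "$work/eclipse/plugins/"
ls "$work/eclipse/plugins"
```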
3. Set up Eclipse
Run Eclipse
Open Window -> Preferences.
You will see a Hadoop Map/Reduce entry; set it to the local Hadoop installation directory. Mine is /usr/local/hadoop/hadoop-2.6.4/, as shown:
4. Configure Map/Reduce Locations
Note: before configuring, start the pseudo-distributed Hadoop DFS and YARN daemons in the background; refer to the previous blog.
In Eclipse, open Window -> Open Perspective -> Other.
Select Map/Reduce and click OK.
The Map/Reduce Locations view appears at the lower right, as shown.
Click the Map/Reduce Locations tab, then click the blue elephant icon on the right to open the Hadoop location configuration window.
Enter a Location Name (any name will do). Configure Map/Reduce Master and DFS Master: the Host and Port must match the settings in core-site.xml. For example:
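As a reference point, a pseudo-distributed core-site.xml usually looks like the fragment below (the hdfs://localhost:9000 address matches the paths used later in this tutorial); the DFS Master Host and Port in Eclipse must agree with it:

```xml
<!-- core-site.xml (pseudo-distributed setup) -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```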
Click the "Finish" button to close the window.
Click DFS Locations -> myhadoop (the location name from the previous step) on the left; if you can see the user folder, the setup succeeded. Eclipse is now connected to the distributed file system, which can be browsed inside Eclipse for convenient programming.
5. New WordCount Project
Click File -> Project:
Select Map/reduce Project and click Next to proceed to the next step:
Enter the project name WordCount, click Finish to finish:
In the WordCount project, right-click src -> New -> Class; set the package name to com.xxm (name it yourself) and the class name to WordCount:
The code is as follows:
package com.xxm; // change to your own package name

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
6. Running
Before running, make sure there are files in the input directory of the distributed file system. If HDFS has just been formatted, refer again to the "Build Hadoop pseudo-distributed" tutorial to create the input directory and upload the input files.
In the WordCount code area, right-click and choose Run As -> Run Configurations, then set the run arguments, i.e. the input and output folder addresses:
hdfs://localhost:9000/user/xxm/input hdfs://localhost:9000/user/xxm/output/wordcount3
As shown below:
Click Run.
The results can be viewed by reconnecting myhadoop, opening the output directory, and double-clicking the result file. You can also fetch them with the HDFS get command. To reconnect myhadoop: in the project explorer window, right-click the blue elephant and choose Reconnect.
The Eclipse development environment for Hadoop has been built.
7. Associated Hadoop Source code
Alternatively, download the source code from the Hadoop releases site:
http://hadoop.apache.org/releases.html
Download it and unzip to the /usr/hadoop folder:
As shown, select Hadoop's IntWritable class and right-click to view its source:
The source cannot be found, so we need to attach it; click Attach Source:
Associate the hadoop-2.6.4-src source directory and click OK:
After the editor refreshes, the IntWritable source code is visible.
With that, the source is associated successfully.
Xianming
Hadoop Learning Notes (4): Building a Hadoop 2.6.4 Development Environment under Eclipse