Build a Hadoop 2.4.0 Development Environment under Eclipse
1. Install Eclipse
Download Eclipse (e.g. 4.3.1: http://pan.baidu.com/s/1gd29RPp) and extract it, for example to /usr/local, giving /usr/local/eclipse.
2. Install the Hadoop plug-in in Eclipse
1. Download the Hadoop plug-in: http://pan.baidu.com/s/1gd29RPp
This zip file contains the source code, but we can use the precompiled jar: after unzipping, hadoop-eclipse-kepler-plugin-2.2.0.jar in the release folder is the compiled plug-in.
2. Put the plug-in under the eclipse/plugins directory
3. Restart Eclipse and configure the Hadoop installation directory
If the plug-in is installed successfully, open Window > Preferences and a Hadoop Map/Reduce option appears on the left side of the window. Click it and set the Hadoop installation path on the right.
4. Configure Map/Reduce Locations
Open Window > Open Perspective > Other.
Select Map/Reduce and click OK.
Click the Map/Reduce Locations tab, and click the elephant icon on the right to open the Hadoop location configuration window.
Enter a Location Name (any name will do). Configure Map/Reduce Master and DFS Master so that Host and Port are consistent with the settings in core-site.xml.
Click "Finish" to close the window.
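For reference, the DFS Master Host and Port should match the HDFS address in core-site.xml. A typical pseudo-distributed fragment looks like this (hdfs://localhost:9000 is an assumption that matches the paths used later in this article; older configurations use the key fs.default.name instead of fs.defaultFS):

```xml
<configuration>
  <!-- Address the DFS Master in the plug-in must point at -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```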
Expand DFS Locations > myhadoop (the location name configured in the previous step) on the left. If the HDFS directory tree (e.g. the user folder) is displayed, the installation succeeded.
If it fails, check whether Hadoop is started and whether the Eclipse configuration is correct.
3. Create a WordCount Project
File > New > Project, select Map/Reduce Project, and enter WordCount as the project name.
Create a class named WordCount in the WordCount project, with the following code:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Split each input line into whitespace-delimited tokens
            // and emit (word, 1) for every token.
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum the counts for each word.
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
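Before running on the cluster, the map/reduce logic above can be sanity-checked with plain Java collections. The sketch below (the class name WordCountLocal is ours, not part of the tutorial) reproduces the same tokenize-then-sum flow without any Hadoop dependency:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Local, Hadoop-free re-implementation of the WordCount logic:
// the "map" phase tokenizes, the "reduce" phase sums per key.
public class WordCountLocal {

    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Same tokenization as TokenizerMapper: whitespace-delimited.
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            // merge() plays the role of IntSumReducer, summing the emitted 1s.
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("hello world hello hadoop"));
    }
}
```

If the counts look right here, any problem on the cluster is in the configuration rather than the algorithm.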
4. Run
1. Create an input directory on HDFS
hadoop fs -mkdir input
2. Copy README.txt from the local filesystem to the HDFS input directory
hadoop fs -copyFromLocal /usr/local/hadoop/README.txt input
3. Right-click WordCount.java and choose Run As > Run Configurations to set the program arguments, that is, the input and output directories:
hdfs://localhost:9000/user/hadoop/input hdfs://localhost:9000/user/hadoop/output
Click the Run button to Run the program.
4. View the result after the job completes.
Method 1:
hadoop fs -ls output
There are two output files: _SUCCESS and part-r-00000.
Run hadoop fs -cat output/* to print the results.
Method 2:
Expand DFS Locations and double-click part-r-00000 to open it and view the results.
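Each line of part-r-00000 is a key, a tab character, and a value, which is the default layout written by TextOutputFormat. A small sketch parsing such lines locally (the sample lines are hypothetical, shaped like WordCount output, not actual job results):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Parses lines in the default TextOutputFormat layout: "<word>\t<count>".
public class PartFileParser {

    public static Map<String, Integer> parse(String[] lines) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.indexOf('\t');               // key and value are tab-separated
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1));
            result.put(word, count);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines in the part-r-00000 format.
        String[] sample = { "hadoop\t3", "hello\t2", "world\t1" };
        System.out.println(parse(sample));
    }
}
```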