Installing and Configuring Hadoop under Eclipse (with a WordCount Test Program)


Here I'll show you how to set up and configure Hadoop development in Eclipse under Windows, and test it with a WordCount program.

1. Preparation

What you need:

1. Eclipse, preferably the Java EE edition, so that you can switch perspectives.

2. The connector between Hadoop and Eclipse: hadoop-eclipse-plugin-1.2.1.jar (this is the one I use; choose the version that matches your Hadoop release).

3. The Hadoop package itself (downloading the latest release is fine).

Copy the plugin jar (hadoop-eclipse-plugin-1.2.1.jar in my case) into the Eclipse/plugins directory and restart Eclipse.


2. Open the Map/Reduce perspective

Window > Open Perspective > Other..., then select Map/Reduce (its icon is a blue elephant).

Click it, and the whole interface switches to the Map/Reduce perspective.

3. Add a MapReduce environment

At the bottom of Eclipse there will be a tab next to the Console called "Map/Reduce Locations". Right-click in the blank area below it and select "New Hadoop location...".

In the dialog box that pops up, fill in:

Location name (any name you like)
Map/Reduce Master (the Job Tracker's IP and port, matching mapred.job.tracker in mapred-site.xml)
DFS Master (the NameNode's IP and port, matching fs.default.name in core-site.xml)
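These two fields map one-to-one onto your cluster's configuration properties. A minimal sketch that sets and prints them via the Hadoop API, assuming a single-node cluster on localhost with ports 9000/9001 (the host and ports are assumptions; use whatever your core-site.xml and mapred-site.xml actually say):

import org.apache.hadoop.conf.Configuration;

public class ShowMasters {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Assumed values for illustration; substitute your own host and ports.
        conf.set("fs.default.name", "hdfs://localhost:9000");  // DFS Master
        conf.set("mapred.job.tracker", "localhost:9001");      // Map/Reduce Master
        System.out.println("DFS Master:        " + conf.get("fs.default.name"));
        System.out.println("Map/Reduce Master: " + conf.get("mapred.job.tracker"));
    }
}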





4. Use Eclipse to modify HDFS content

After the previous step, the HDFS tree should appear under "Project Explorer" on the left. Right-click on it to create folders, delete folders, upload files, download files, delete files, and so on.

Note: changes are not displayed immediately after each operation; you have to refresh the view in Eclipse.
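If you prefer to perform the same operations in code, here is a minimal sketch using the Hadoop FileSystem API; the NameNode address and all paths are assumptions for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsOps {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://localhost:9000"); // assumption: your NameNode
        FileSystem fs = FileSystem.get(conf);

        // Create a folder on HDFS
        fs.mkdirs(new Path("/mapreduce/wordcount/input"));
        // Upload a local file (the local path is just an example)
        fs.copyFromLocalFile(new Path("/tmp/a.txt"),
                new Path("/mapreduce/wordcount/input/a.txt"));
        // List the folder's contents
        for (FileStatus s : fs.listStatus(new Path("/mapreduce/wordcount/input"))) {
            System.out.println(s.getPath());
        }
        // Delete a file: fs.delete(new Path("/mapreduce/wordcount/input/a.txt"), false);
    }
}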


5. Create a MapReduce project

5.1 Configure the Hadoop path

Window > Preferences, select "Hadoop Map/Reduce", and click "Browse..." to choose the path of your Hadoop installation folder.
This step has nothing to do with the runtime environment; it only allows the plugin to automatically import all the jars from the Hadoop root and lib directories when a new project is created.

5.2 Create a project

File > New > Project..., select Map/Reduce Project, then enter a project name to create the project. The plugin automatically imports all the jars from the Hadoop root and lib directories.

5.3 Create a Mapper or Reducer

File > New > Mapper creates a Mapper class that automatically extends MapReduceBase from the mapred package and implements the Mapper interface.
Note: the plugin generates code against the old mapred API; if you want a new-style org.apache.hadoop.mapreduce Mapper, you have to write it yourself.

Creating a Reducer works the same way.
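For reference, a skeleton of the kind the plugin generates with the old API looks roughly like this (the class name and type parameters are just an example):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class MyMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // Map logic goes here; emit pairs with
        // output.collect(new Text("key"), new IntWritable(1));
    }
}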

6. Run the WordCount program in Eclipse

6.1 Import WordCount



WordCount.java:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: splits each input line into tokens and emits (word, 1)
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums the counts for each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: wordcount <input path> <output path>");
            System.exit(2);
        }

        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(IntSumReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
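A note on the API: this WordCount uses the new org.apache.hadoop.mapreduce classes (Job, and Mapper/Reducer with a Context), not the old org.apache.hadoop.mapred API that the plugin's wizards generate, so the file is written by hand rather than through the wizards in section 5.3.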


6.2 Configure run parameters

Run As > Open Run Dialog... (Run Configurations... in newer Eclipse versions), select the WordCount program, and set the program arguments on the Arguments tab:

/mapreduce/wordcount/input /mapreduce/wordcount/output/1

These are the input and output directories on HDFS: the input directory should already contain a few text files, and the output directory must not exist yet.
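Because the job aborts if the output directory already exists, it is handy to delete it before each run. A minimal sketch, again assuming the NameNode address and paths used above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CleanOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://localhost:9000"); // assumption: your NameNode
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path("/mapreduce/wordcount/output/1");
        if (fs.exists(out)) {
            fs.delete(out, true); // true = delete recursively
        }
    }
}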

6.3 Run

Run As > Run on Hadoop, select the previously configured MapReduce runtime environment, and click "Finish".

The console will print the job's progress and other run information.



7. View the results

In the output directory /mapreduce/wordcount/output/1 you can see the WordCount program's output files as well as the logs. For learning Hadoop, learning to read the logs is an important skill.
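To print the result without leaving Java, here is a minimal sketch; part-r-00000 is the file a single reducer produces, and the paths and NameNode address are the ones assumed above:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PrintOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://localhost:9000"); // assumption: your NameNode
        FileSystem fs = FileSystem.get(conf);
        Path result = new Path("/mapreduce/wordcount/output/1/part-r-00000");
        BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(result)));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line); // each line: word <TAB> count
        }
        reader.close();
    }
}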



