Installing and Configuring Hadoop under Eclipse (with a WordCount Test Program)


Here I'll show you how to set up and configure Hadoop development in Eclipse under Windows, and test it with a WordCount program.

1. Preparation

What you need:

1. Eclipse, preferably the Java EE edition, so that you can switch perspectives.

2. The connector between Hadoop and Eclipse: hadoop-eclipse-plugin-1.2.1.jar (this is the one I use; choose the version that matches your Hadoop release).

3. The Hadoop package itself (downloading the latest release is fine).

Copy the plugin jar (hadoop-eclipse-plugin-1.2.1.jar in my case) into the Eclipse/plugins directory and restart Eclipse.


2. Open the Map/Reduce perspective

Window > Open Perspective > Other..., then select Map/Reduce (its icon is a blue elephant).

Click it, and the whole interface switches to the Map/Reduce perspective.

3. Add a MapReduce environment

At the bottom of Eclipse there will be a tab next to the Console called "Map/Reduce Locations". Right-click in the blank area below it and select "New Hadoop location...".

In the dialog box that pops up, fill in:

Location name (any name you like)
Map/Reduce Master (the Job Tracker's IP and port, matching mapred.job.tracker in mapred-site.xml)
DFS Master (the NameNode's IP and port, matching fs.default.name in core-site.xml)
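These two fields map one-to-one onto your cluster's configuration properties. A minimal sketch that sets and prints them via the Hadoop API, assuming a single-node cluster on localhost with ports 9000/9001 (the host and ports are assumptions; use whatever your core-site.xml and mapred-site.xml actually say):

import org.apache.hadoop.conf.Configuration;

public class ShowMasters {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Assumed values for illustration; substitute your own host and ports.
        conf.set("fs.default.name", "hdfs://localhost:9000");  // DFS Master
        conf.set("mapred.job.tracker", "localhost:9001");      // Map/Reduce Master
        System.out.println("DFS Master:        " + conf.get("fs.default.name"));
        System.out.println("Map/Reduce Master: " + conf.get("mapred.job.tracker"));
    }
}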





4. Use Eclipse to modify HDFS content

After the previous step, the HDFS tree should appear under "Project Explorer" on the left. Right-click on it to create folders, delete folders, upload files, download files, delete files, and so on.

Note: changes are not displayed immediately after each operation; you have to refresh the view in Eclipse.
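If you prefer to perform the same operations in code, here is a minimal sketch using the Hadoop FileSystem API; the NameNode address and all paths are assumptions for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsOps {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://localhost:9000"); // assumption: your NameNode
        FileSystem fs = FileSystem.get(conf);

        // Create a folder on HDFS
        fs.mkdirs(new Path("/mapreduce/wordcount/input"));
        // Upload a local file (the local path is just an example)
        fs.copyFromLocalFile(new Path("/tmp/a.txt"),
                new Path("/mapreduce/wordcount/input/a.txt"));
        // List the folder's contents
        for (FileStatus s : fs.listStatus(new Path("/mapreduce/wordcount/input"))) {
            System.out.println(s.getPath());
        }
        // Delete a file: fs.delete(new Path("/mapreduce/wordcount/input/a.txt"), false);
    }
}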


5. Create a MapReduce project

5.1 Configure the Hadoop path

Window > Preferences, select "Hadoop Map/Reduce", and click "Browse..." to choose the path of your Hadoop installation folder.
This step has nothing to do with the runtime environment; it only allows the plugin to automatically import all the jars from the Hadoop root and lib directories when a new project is created.

5.2 Create a project

File > New > Project..., select Map/Reduce Project, then enter a project name to create the project. The plugin automatically imports all the jars from the Hadoop root and lib directories.

5.3 Create a Mapper or Reducer

File > New > Mapper creates a Mapper class that automatically extends MapReduceBase from the mapred package and implements the Mapper interface.
Note: the plugin generates code against the old mapred API; if you want a new-style org.apache.hadoop.mapreduce Mapper, you have to write it yourself.

Creating a Reducer works the same way.
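For reference, a skeleton of the kind the plugin generates with the old API looks roughly like this (the class name and type parameters are just an example):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class MyMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // Map logic goes here; emit pairs with
        // output.collect(new Text("key"), new IntWritable(1));
    }
}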

6. Run the WordCount program in Eclipse

6.1 Import WordCount



WordCount.java:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: splits each input line into tokens and emits (word, 1)
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums the counts for each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: wordcount <input path> <output path>");
            System.exit(2);
        }

        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(IntSumReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
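A note on the API: this WordCount uses the new org.apache.hadoop.mapreduce classes (Job, and Mapper/Reducer with a Context), not the old org.apache.hadoop.mapred API that the plugin's wizards generate, so the file is written by hand rather than through the wizards in section 5.3.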


6.2 Configure run parameters

Run As > Open Run Dialog... (Run Configurations... in newer Eclipse versions), select the WordCount program, and set the program arguments on the Arguments tab:

/mapreduce/wordcount/input /mapreduce/wordcount/output/1

These are the input and output directories on HDFS: the input directory should already contain a few text files, and the output directory must not exist yet.
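Because the job aborts if the output directory already exists, it is handy to delete it before each run. A minimal sketch, again assuming the NameNode address and paths used above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CleanOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://localhost:9000"); // assumption: your NameNode
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path("/mapreduce/wordcount/output/1");
        if (fs.exists(out)) {
            fs.delete(out, true); // true = delete recursively
        }
    }
}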

6.3 Run

Run As > Run on Hadoop, select the previously configured MapReduce runtime environment, and click "Finish".

The console will print the job's progress and other run information.



7. View the results

In the output directory /mapreduce/wordcount/output/1 you can see the WordCount program's output files as well as the logs. For learning Hadoop, learning to read the logs is an important skill.
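To print the result without leaving Java, here is a minimal sketch; part-r-00000 is the file a single reducer produces, and the paths and NameNode address are the ones assumed above:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PrintOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://localhost:9000"); // assumption: your NameNode
        FileSystem fs = FileSystem.get(conf);
        Path result = new Path("/mapreduce/wordcount/output/1/part-r-00000");
        BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(result)));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line); // each line: word <TAB> count
        }
        reader.close();
    }
}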



