Here I'll show you how to configure Hadoop development with Eclipse under Windows and test it with a program.
The plug-ins you need and their setup:
1. Preparation
1. Eclipse, preferably the Java EE edition, though other editions also work since you can switch perspectives.
2. The connector plugin between Hadoop and Eclipse:
hadoop-eclipse-plugin-1.2.1.jar (this is the one I use; pick the version that matches your Hadoop release)
3. The Hadoop package itself (the latest release will do).
Copy the plugin jar (hadoop-0.20.2-eclipse-plugin.jar here; the file name matches your plugin version) to the eclipse/plugins directory and restart Eclipse.
2. Open the Map/Reduce perspective
Window --- Open Perspective --- Other ..., then select Map/Reduce (its icon is a blue elephant).
After you click it, the whole interface switches to Map/Reduce mode.
3. Add a MapReduce environment
At the bottom of Eclipse there will be a tab next to the Console called "Map/Reduce Locations". Right-click in the blank area below it and select "New Hadoop location ...":
In the dialog box that pops up, fill in the following:
Location name (take a name)
Map/Reduce Master (the JobTracker's IP and port, matching mapred.job.tracker in mapred-site.xml)
DFS Master (the NameNode's IP and port, matching fs.default.name in core-site.xml)
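For reference, those two values come from the cluster's own configuration files. A typical pairing might look like the fragment below (the host name "master" and the ports 9001/9000 are placeholders; use whatever your cluster actually configures):

```xml
<!-- mapred-site.xml: JobTracker address, goes into "Map/Reduce Master" -->
<property>
  <name>mapred.job.tracker</name>
  <value>master:9001</value>
</property>

<!-- core-site.xml: NameNode address, goes into "DFS Master" -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:9000</value>
</property>
```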
4. Use Eclipse to modify HDFS content
After the previous step, the HDFS location you configured should appear in the "Project Explorer" on the left. Right-click it to create folders, delete folders, upload files, download files, delete files, and so on.
Note: changes do not show up immediately after each operation in Eclipse; you must refresh the view.
5. Create a MapReduce project
5.1 Configuring the Hadoop path
Window --- Preferences, select "Hadoop Map/Reduce", and click "Browse ..." to select the path of your Hadoop folder.
This step has nothing to do with the runtime environment; it only lets the plugin automatically import all the jar packages from the Hadoop root and lib directories when a new project is created.
5.2 Creating a project
File --- New --- Project, select Map/Reduce Project, then enter a project name to create it. The plugin automatically imports all the jar packages from the Hadoop root and lib directories.
5.3 Creating mapper or reducer
File --- New --- Mapper creates a Mapper class that automatically extends MapReduceBase from the old mapred package and implements its Mapper interface.
Note: the plugin generates classes against the old mapred API; a Mapper for the new mapreduce API you have to write yourself.
Creating a Reducer works the same way.
6. Run the WordCount program in Eclipse
6.1 Importing WordCount
WordCount.java:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(IntSumReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
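The map method above relies on java.util.StringTokenizer, which by default splits only on whitespace, so punctuation stays attached to words. A quick standalone check of that behavior (the class and method names here are just for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizeDemo {
    // Mirrors the mapper's tokenization: split a line on whitespace only.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            tokens.add(itr.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "Hadoop," keeps its comma: StringTokenizer does not strip punctuation.
        System.out.println(tokenize("Hello Hadoop,  hello world."));
    }
}
```

This is why WordCount counts "hello" and "hello." as different words; cleaning punctuation would need an extra normalization step in the mapper.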
6.2 Configuring Run Parameters
Run As --- Open Run Dialog ..., select the WordCount program, and fill in the program arguments under Arguments: /mapreduce/wordcount/input /mapreduce/wordcount/output/1
These are the input and output directories on HDFS; the input directory should contain a few text files, and the output directory must not already exist.
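To sanity-check what the job should produce before running it on the cluster, the map / shuffle / reduce pipeline of WordCount can be simulated in plain Java, with a TreeMap standing in for the shuffle phase. This is only a local sketch under my own naming (LocalWordCount, count), not part of the Hadoop job:

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCount {
    // Counts words the way TokenizerMapper + IntSumReducer would:
    // map emits (word, 1); the TreeMap plays the role of shuffle + reduce,
    // grouping by key and summing, with keys kept in sorted order.
    static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] input = { "hello hadoop", "hello world" };
        // Prints one "word<TAB>count" pair per line, sorted by word,
        // which is the same shape as the job's part-r-00000 output.
        count(input).forEach((w, c) -> System.out.println(w + "\t" + c));
    }
}
```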
6.3 Run
Run As --- Run on Hadoop, select the previously configured MapReduce location, and click "Finish".
The console outputs the job's progress information.
7. Viewing the results
In the output directory /mapreduce/wordcount/output/1 you can see the WordCount program's output files as well as the logs. For learning Hadoop, learning to read the logs is an important skill.