Build a Hadoop 2.4.0 Development Environment under Eclipse
1. Install Eclipse
Download Eclipse (e.g. 4.3.1: http://pan.baidu.com/s/1gd29RPp) and extract it, for example to /usr/local, giving /usr/local/eclipse.
2. Install the Hadoop plug-in in Eclipse
1. Download the Hadoop plug-in: http://pan.baidu.com/s/1gd29RPp
This zip file contains the source code, but we can use the precompiled jar: after unzipping, hadoop-eclipse-kepler-plugin-2.2.0.jar in the release folder is the compiled plug-in.
2. Put the plug-in under the eclipse/plugins directory
3. Restart Eclipse and configure the Hadoop installation directory
If the plug-in is installed successfully, open Window > Preferences and a Hadoop Map/Reduce option appears on the left side of the window. Click it and set the Hadoop installation path on the right.
4. Configure Map/Reduce Locations
Open Window > Open Perspective > Other.
Select Map/Reduce and click OK.
Click the Map/Reduce Locations tab, and click the elephant icon on the right to open the Hadoop location configuration window.
Enter a Location Name (any name will do). Configure Map/Reduce Master and DFS Master so that Host and Port are consistent with the settings in core-site.xml.
Click "Finish" to close the window.
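For reference, the DFS Master Host and Port should match the HDFS address in core-site.xml. A typical pseudo-distributed fragment looks like this (hdfs://localhost:9000 is an assumption that matches the paths used later in this article; older configurations use the key fs.default.name instead of fs.defaultFS):

```xml
<configuration>
  <!-- Address the DFS Master in the plug-in must point at -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```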
Expand DFS Locations > myhadoop (the location name configured in the previous step) on the left. If the HDFS directory tree (e.g. the user folder) is displayed, the installation succeeded.
If it fails, check whether Hadoop is started and whether the Eclipse configuration is correct.
3. Create a WordCount Project
File > New > Project, select Map/Reduce Project, and enter WordCount as the project name.
Create a class named WordCount in the WordCount project, with the following code:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Split each input line into whitespace-delimited tokens
            // and emit (word, 1) for every token.
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum the counts for each word.
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
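Before running on the cluster, the map/reduce logic above can be sanity-checked with plain Java collections. The sketch below (the class name WordCountLocal is ours, not part of the tutorial) reproduces the same tokenize-then-sum flow without any Hadoop dependency:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Local, Hadoop-free re-implementation of the WordCount logic:
// the "map" phase tokenizes, the "reduce" phase sums per key.
public class WordCountLocal {

    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Same tokenization as TokenizerMapper: whitespace-delimited.
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            // merge() plays the role of IntSumReducer, summing the emitted 1s.
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("hello world hello hadoop"));
    }
}
```

If the counts look right here, any problem on the cluster is in the configuration rather than the algorithm.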
4. Run
1. Create an input directory on HDFS
hadoop fs -mkdir input
2. Copy README.txt from the local filesystem to the HDFS input directory
hadoop fs -copyFromLocal /usr/local/hadoop/README.txt input
3. Right-click WordCount.java and choose Run As > Run Configurations to set the program arguments, that is, the input and output directories:
hdfs://localhost:9000/user/hadoop/input hdfs://localhost:9000/user/hadoop/output
Click the Run button to Run the program.
4. View the result after the job completes.
Method 1:
hadoop fs -ls output
There are two output files: _SUCCESS and part-r-00000.
Run hadoop fs -cat output/* to print the results.
Method 2:
Expand DFS Locations and double-click part-r-00000 to open it and view the results.
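Each line of part-r-00000 is a key, a tab character, and a value, which is the default layout written by TextOutputFormat. A small sketch parsing such lines locally (the sample lines are hypothetical, shaped like WordCount output, not actual job results):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Parses lines in the default TextOutputFormat layout: "<word>\t<count>".
public class PartFileParser {

    public static Map<String, Integer> parse(String[] lines) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.indexOf('\t');               // key and value are tab-separated
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1));
            result.put(word, count);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines in the part-r-00000 format.
        String[] sample = { "hadoop\t3", "hello\t2", "world\t1" };
        System.out.println(parse(sample));
    }
}
```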