Compiling the Hadoop 2.x hadoop-eclipse-plugin on Windows and Using Eclipse with Hadoop

1. Introduction

Hadoop 2.x no longer ships with an Eclipse plug-in, so we cannot debug MapReduce code in Eclipse: the written Java code has to be packaged into a jar and run on Linux, which makes debugging inconvenient. We therefore compile an Eclipse plug-in ourselves so that we can debug locally. Compared with hadoop 1.x, compiling the Eclipse plug-in for hadoop 2.x is much easier. Next, we compile the hadoop-eclipse-plugin and use it to develop Hadoop programs in Eclipse.

2. Install and configure the software

 

1. JDK Configuration

1) Install the JDK.

2) Configure Environment Variables

JAVA_HOME, CLASSPATH, PATH, and other settings are not described here.
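For reference, a typical Windows configuration looks like the following (the JDK install path is illustrative, not from the original article):

JAVA_HOME = C:\Program Files\Java\jdk1.7.0_79

CLASSPATH = .;%JAVA_HOME%\lib\dt.jar;%JAVA_HOME%\lib\tools.jar

Append %JAVA_HOME%\bin to PATH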

2. Eclipse

1) Download eclipse-jee-juno-SR2.rar.

2) Decompress the package to a local disk.

3. Ant

1) Download apache-ant-1.9.4-bin.zip from http://ant.apache.org/bindownload.cgi

2) Decompress the package to a disk.

3) Environment variable configuration

Create ANT_HOME = E:\ant\apache-ant-1.9.4-bin\apache-ant-1.9.4

Append %ANT_HOME%\bin to PATH

4) Run cmd and execute ant -version to check whether the configuration is correct; it should print the installed Ant version (1.9.4 here).

 

4. Hadoop

1) Download the hadoop package: hadoop-2.6.0.tar.gz

2) Decompress the package to a local disk.

 

5. Download the hadoop2x-eclipse-plugin source code

1) The eclipse-plugin source code for hadoop2 is currently hosted on GitHub at https://github.com/winghc/hadoop2x-eclipse-plugin. Click the Download ZIP link on the right side of the page to download it.


2) Download hadoop2x-eclipse-plugin-master.zip and decompress it to a local disk.

3. Compile the hadoop-eclipse-plugin plug-in

1. With hadoop2x-eclipse-plugin-master unzipped to the E: drive, open a command prompt (cmd) and switch to the E:\hadoop\hadoop2x-eclipse-plugin-master\src\contrib\eclipse-plugin directory, as shown below.
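For example, with the path above (an illustrative cmd session):

cd /d E:\hadoop\hadoop2x-eclipse-plugin-master\src\contrib\eclipse-plugin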

2. Execute ant jar

ant jar -Dversion=2.6.0 -Declipse.home=F:\tool\eclipse-jee-juno-SR2\eclipse-jee-juno-SR2 -Dhadoop.home=E:\hadoop\hadoop-2.6.0\hadoop-2.6.0

where -Declipse.home points to the Eclipse installation directory and -Dhadoop.home points to the decompressed Hadoop directory.



3. The successfully compiled hadoop-eclipse-plugin-2.6.0.jar is in the E:\hadoop\hadoop2x-eclipse-plugin-master\build\contrib\eclipse-plugin directory.

4. Configure the hadoop-eclipse-plugin in Eclipse

1. Copy hadoop-eclipse-plugin-2.6.0.jar to the F:\tool\eclipse-jee-juno-SR2\eclipse-jee-juno-SR2\plugins directory and restart Eclipse; you should then see DFS Locations.

2. Open Window --> Preferences; you will see a Hadoop Map/Reduce option. Click it and set the Hadoop installation directory to the decompressed hadoop-2.6.0 folder.

3. Configure Map/Reduce Locations

1) Click Window --> Show View --> MapReduce Tools and select Map/Reduce Locations.

2) Click the Map/Reduce Locations tab, then click the elephant icon on the right to open the Hadoop Location configuration window. Enter any Location Name, then configure the Map/Reduce Master and DFS Master: the Host and Port must be consistent with the settings in core-site.xml and hdfs-site.xml, as in the example below.
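For example (the host and ports below are illustrative, not values from the original article, and must match your cluster's configuration files): if core-site.xml sets fs.defaultFS to hdfs://192.168.1.100:9000, enter 192.168.1.100 as the DFS Master Host and 9000 as its Port; the Map/Reduce Master port is whatever your cluster is configured with (9001 in many tutorials).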


4. Check whether the connection is successful.

5. Create and run a WordCount project

1. Right-click in the Project Explorer and choose New --> Map/Reduce Project.

2. Create WordCount.java:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: splits each input line into tokens and emits (word, 1) pairs.
  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sums the counts for each word.
  public static class IntSumReducer
       extends Reducer<Text,IntWritable,Text,IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: args[0] is the input path, args[1] is the output path.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
3. Create the text to be counted in the hdfs input directory

1) If there are no input/output directories yet, create the folders on hdfs first:

# bin/hdfs dfs -mkdir -p /user/root/input

# bin/hdfs dfs -mkdir -p /user/root/output
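You can then list the directories to confirm they were created (a standard hdfs command, not shown in the original):

# bin/hdfs dfs -ls /user/root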

2) Upload the text to be counted to the hdfs input directory.

# bin/hdfs dfs -put /usr/local/hadoop/hadoop-2.6.0/test/* /user/root/input    // upload the test/file01 file to hdfs /user/root/input

3) View the uploaded file:

# bin/hdfs dfs -cat /user/root/input/file01


4. Click WordCount.java, right-click --> Run As --> Run Configurations, and set the input and output directory paths as program arguments.
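For example, the two program arguments might look like this (host and port are illustrative and must match the DFS Master configured earlier; the job's output directory must not already exist on hdfs):

hdfs://192.168.1.100:9000/user/root/input
hdfs://192.168.1.100:9000/user/root/output/count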

5. Click WordCount.java and right-click --> Run As --> Run on Hadoop.

After the job finishes, the output/count directory contains the result file; if the word counts are displayed correctly, the configuration is successful.

6. Notes

In this article, we introduced how to connect Eclipse on Windows to Hadoop running on a Linux virtual machine and develop Hadoop programs in Eclipse, and how to deal with exceptions such as org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z.
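A common workaround for that NativeIO$Windows exception (a sketch based on widely shared community fixes, not part of the original article) is to place winutils.exe and hadoop.dll under the local Hadoop directory's bin folder and point hadoop.home.dir at that directory before the job starts:

public class WordCountLauncher {
  public static void main(String[] args) throws Exception {
    // Hypothetical launcher: hadoop.home.dir must contain bin\winutils.exe;
    // hadoop.dll should also be on the PATH (or in C:\Windows\System32).
    System.setProperty("hadoop.home.dir", "E:\\hadoop\\hadoop-2.6.0\\hadoop-2.6.0");
    // Delegate to the WordCount job defined above.
    WordCount.main(args);
  }
}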





