Use Eclipse to build hadoop2.4 execution environment under Ubuntu system


When programming MapReduce jobs for Hadoop, most of us want to work in an IDE. This article describes how to set up Eclipse for Hadoop development.

If your cluster is not ready yet, you can refer to my previous article on building a pseudo-distributed cluster with Hadoop 2.4.

First, install Eclipse

Method one: install Eclipse directly from the Ubuntu Software Center.


Method two: download the Eclipse archive and install it from the command line. Download link: http://pan.baidu.com/s/1mgiHFok

      sudo tar -zxvf eclipse-dsl-juno-sr1-linux-gtk.tar.gz

That is all it takes to install Eclipse.


Second, install the Hadoop-eclipse-plugin

Download hadoop2x-eclipse-plugin and take the hadoop-eclipse-kepler-plugin-2.2.0.jar from its release folder. Although it is labeled 2.2.0, it works fine with 2.4.1, and it should work with any 2.x version; you may occasionally see warnings that something is deprecated, but for learners I think those details can be ignored for now. Copy the jar into the plugins folder of your Eclipse installation directory and restart Eclipse with ./eclipse -clean. If you installed Eclipse from the Software Center, its default installation folder is /usr/lib/eclipse; if you used method two, it is wherever you unpacked the archive. My folder is /usr/local/eclipse.

The ./eclipse -clean command must be run from inside the Eclipse installation folder. After Eclipse opens, you should be able to see the HDFS file system.
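Putting the plugin installation together, it looks something like this (a sketch; the download location and Eclipse path are assumptions, so adjust them to your machine):

    cp ~/Downloads/hadoop-eclipse-kepler-plugin-2.2.0.jar /usr/local/eclipse/plugins/
    cd /usr/local/eclipse
    ./eclipse -clean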


The plugin needs to be further configured.

Step one: select Preferences under the Window menu. In the window that pops up there is a Hadoop Map/Reduce option on the left side; click it and set the Hadoop installation folder (for example /usr/local/hadoop).


Step two: switch to the Map/Reduce working perspective. Choose Open Perspective under the Window menu; in the window that pops up, select the Map/Reduce option to switch.


Step three: click the Map/Reduce Locations tab, then click the icon on the right to open the Hadoop location configuration form:


Enter a Location Name; any name will do. Then configure the Map/Reduce Master and the DFS Master: the host and port must match the settings in core-site.xml.
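For reference, in a pseudo-distributed setup the DFS Master host and port usually come from the fs.defaultFS property (fs.default.name in older configurations) in core-site.xml. A minimal sketch, assuming HDFS listens on localhost:9000 as in the run arguments later in this article:

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>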



Click "Finish" button. Close the form.

On the left side, expand DFS Locations -> MapReduceProject (the location name from the previous step). If you can see the user folder, the installation was successful.

Suppose this error occurs: Call From mylinux/127.0.1.1 to localhost:9090 failed on connection exception: java.net.ConnectException: Connection refused.

First make sure that Hadoop has actually been started.

I hit this myself because I had not started Hadoop, and it took me a long time to find the problem; I hope this saves you the trouble.
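A quick way to check, assuming a typical Hadoop 2.x layout under /usr/local/hadoop (adjust the path to your own installation):

    /usr/local/hadoop/sbin/start-dfs.sh
    jps

If jps lists the NameNode and DataNode processes, HDFS is running.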

For a few more possible causes, you can refer to my other blog post summarizing the connection problems I ran into when using Eclipse to connect to Hadoop on Ubuntu.

Third, create the WordCount example

Choose File -> New -> Project, select Map/Reduce Project, enter the name WordCount, and keep the defaults for the rest.

Create a new class named WordCount in the WordCount project, with code such as the following:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  // Mapper: emits (word, 1) for every token in each input line.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sums the counts for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
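As an aside, you do not have to run the job from Eclipse. A sketch of the command-line route, assuming you have exported the project as wordcount.jar (the jar name is just an example):

    hadoop jar wordcount.jar WordCount input output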

1. Create a folder named input on HDFS

    hadoop fs -mkdir input

This creates the folder from the command line. You can also create it in Eclipse by right-clicking the Hadoop location (the name depends on your configuration).
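One caveat: a relative path such as input resolves to /user/<your username>/input on HDFS, and the command above can fail if that home directory does not exist yet. Assuming the user is hadoop, the whole path can be created in one step with the -p flag:

    hadoop fs -mkdir -p /user/hadoop/input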


2. Copy the local README.txt to input on HDFS

    hadoop fs -copyFromLocal /usr/local/hadoop/README.txt input

Likewise, we can right-click input, select Upload file, and upload files through the visual interface.

3. Click WordCount.java, right-click it, and choose Run As -> Run Configurations to configure the execution parameters, that is, the input and output directories.

Of course, we can also write the paths directly in the code; once you really understand the file system, you will find there are many ways to do this, and it only takes a small change to the Java code, as shown in the sketch below.
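For example, a minimal sketch of that variant: replace the two lines in main() that read the paths from otherArgs with hard-coded ones (the URIs below simply mirror the run-configuration values used in this article; adjust them to your cluster):

    // Instead of reading the input/output paths from otherArgs:
    FileInputFormat.addInputPath(job, new Path("hdfs://localhost:9000/user/hadoop/input"));
    FileOutputFormat.setOutputPath(job, new Path("hdfs://localhost:9000/user/hadoop/output"));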

The run-configuration arguments correspond to this line in the code:

String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();

Enter the input and output directories as the two program arguments:

hdfs://localhost:9000/user/hadoop/input hdfs://localhost:9000/user/hadoop/output


4. After the run completes, view the execution results.

The first approach is to view the results directly from the terminal with the command line:

    hadoop fs -ls output

You can see that there are two outputs: _SUCCESS and part-r-00000.

Run

    hadoop fs -cat output/*

The other way is to view the results directly in Eclipse. First, remember to refresh the file system.

Expand DFS Locations and double-click to open part-r-00000 and view the results.



