Installing and Configuring the Hadoop Plug-in for MyEclipse and Eclipse on Windows/Linux


I recently wanted to write a test program, maxmappertemper, on Windows, but had no server at hand, so I decided to set everything up on my Win7 machine.

It worked, so I am writing down my notes here in the hope that they help someone else.

My setup: MyEclipse 8.5, with the hadoop-1.2.2-eclipse-plugin.jar plug-in.

The installation and configuration steps are as follows:

1. Install the Hadoop development plug-in. The Hadoop installation package ships a plug-in, hadoop-1.2.2-eclipse-plugin.jar, in its contrib/ directory; copy it into the dropins directory under the MyEclipse root directory.

2. Start MyEclipse and open the perspective: Window --> Open Perspective --> Other... --> Map/Reduce --> OK.

3. Open the view: Window --> Show View --> Other... --> MapReduce Tools --> Map/Reduce Locations --> OK.

4. Add a Hadoop Location:
   - Location name: I used localhost.
   - Map/Reduce Master box: Host is the cluster machine on which the JobTracker runs (localhost here); Port is the JobTracker port (9999 here). These two values are the IP and port from mapred.job.tracker in mapred-site.xml.
   - DFS Master box: Host is the machine on which the NameNode runs (localhost here); Port is the NameNode port (8888 here). These two values are the IP and port from fs.default.name in core-site.xml. (A sketch of the matching site-file entries is shown after step 5.)
   - Use M/R Master host: when this check box is selected, the DFS Master defaults to the same host as the Map/Reduce Master box; when it is not selected, you can enter the host yourself. Here the JobTracker and the NameNode are on the same machine, so the box is ticked.
   - User name: the user name used to connect to Hadoop. I installed Hadoop as the user lsq and created no other users, so I entered lsq. The remaining fields are optional.

   Click the Finish button; a record now appears in this view. Restart MyEclipse and re-edit the connection record you just created, this time on the Advanced Parameters tab. (The restart is needed because, when the connection is first created, some properties on this tab are not displayed and therefore cannot be set; you have to restart Eclipse and edit the connection again to see them.) Most of the properties here are filled in automatically; they are some of the configuration properties from core-default.xml, hdfs-default.xml and mapred-default.xml. Because the site configuration files were changed when Hadoop was installed, the same changes have to be made here. The main properties to check are:
   - fs.default.name: already set on the General tab.
   - mapred.job.tracker: also already set on the General tab.
   - dfs.replication: defaults to 3; because I set it to 1 in hdfs-site.xml, I set it to 1 here as well.
   - hadoop.job.ugi: enter lsq,Tardis. Before the comma is the user that connects to Hadoop; after the comma, Tardis is a fixed value. (This property did not show up for me at first; I am not sure why.)

   Then click Finish and connect (start the sshd service and the Hadoop processes first).

5. Create a new Map/Reduce project: File --> New --> Project... --> Map/Reduce --> Map/Reduce Project --> Project name: WordCount --> Configure Hadoop install directory... --> installation directory: d:\cygwin\home\lsq\hadoop-0.20.2 --> OK --> Next --> check "Allow output folders for source folders" --> Finish.
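For reference, here is a minimal sketch of what the corresponding entries in the Hadoop site files would look like for the values used above. The host name localhost, the ports 8888 and 9999, and the replication factor 1 are just the values from this walkthrough, not required settings; substitute the values from your own cluster.

    <!-- core-site.xml: fs.default.name matches the DFS Master host and port used above (assumed values) -->
    <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:8888</value>
    </property>

    <!-- mapred-site.xml: mapred.job.tracker matches the Map/Reduce Master host and port used above (assumed values) -->
    <property>
      <name>mapred.job.tracker</name>
      <value>localhost:9999</value>
    </property>

    <!-- hdfs-site.xml: single-machine setup, so replication is lowered from the default of 3 to 1 -->
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>

Each <property> block goes inside the <configuration> element of its respective file.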
    
6. Create a new WordCount class and add the source code. It is the example that ships with Hadoop, in d:\cygwin\home\lsq\hadoop-1.2.2/src/examples/org/apache/hadoop/examples/WordCount.java:

package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  // Mapper: emits (word, 1) for every token in the input.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sums the counts for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

7. Upload the sample data folder. To run the program you need an input folder and an output folder. The output folder is generated automatically when the program finishes, so you only have to provide the input folder.

(1) In the current directory (for example the Hadoop installation directory), create a new folder named input, and in it create two files, file1 and file2, with the following contents:
    file1: Hello World Bye World
    file2: Hello Hadoop Goodbye Hadoop

(2) Upload the input folder to the distributed file system. After starting the Hadoop daemons, cd into the Hadoop installation directory in a terminal and run:
    bin/hadoop fs -put input in

8. Configure the run parameters:
① In the new WordCount project, right-click WordCount.java --> Run As --> Run Configurations.
② In the Run Configurations dialog that pops up, click Java Application, then right-click --> New; this creates a new application named WordCount.
③ Configure the run parameters: on the Arguments tab, enter in Program arguments the input folder you want to pass to the application and the folder in which you want the program to save its results. (If the run throws java.lang.OutOfMemoryError: Java heap space, set VM arguments, below Program arguments, to -Xms512m -Xmx1024m -XX:MaxPermSize=256m.)

9. Click Run to run the program. After a while the run completes. Once it has finished, check in the terminal whether the output folder was generated:
    bin/hadoop fs -ls
and view the contents of the generated files with:
    bin/hadoop fs -cat output/*
If the output looks like the following, the first MapReduce program has run successfully under MyEclipse:
    Bye 1
    Goodbye 1
    Hadoop 2
    Hello 2
    World 2
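As a quick sanity check, these counts follow directly from the two sample files uploaded in step 7: Hello appears once in file1 and once in file2, World appears twice in file1, Hadoop appears twice in file2, and Bye and Goodbye each appear once.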

