Windows Eclipse Remote Connection Hadoop cluster development MapReduce


Reprinted; please indicate the source, thank you. 2017-10-22 17:14:09

Before developing MapReduce programs in Python, today we first build a development environment with Eclipse and Java under Windows. This post summarizes that process in the hope that it helps anyone who needs it. With the Hadoop Eclipse plugin you can browse and manage HDFS, have a template file for your MapReduce program created automatically, and, best of all, run the program on Hadoop directly.

1. Install the plugin

Download hadoop-eclipse-plugin-1.2.1.jar and put it in the F:\eclipse\plugins directory.

2. Configure and use the plugin

2.1 Specify the Hadoop source directory.

2.2 Open the Map/Reduce perspective: "Window" -> "Open Perspective" -> "Other" -> "Map/Reduce", then "Window" -> "Show View" -> "Other" -> "MapReduce Tools" -> "Map/Reduce Locations".
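In step 2.3 below you will fill in a Map/Reduce location dialog; the host and ports entered there have to match the cluster's own configuration. For reference, a sketch of the relevant cluster-side settings in Hadoop 1.x (the host name "master" and the ports 9000/9001 are assumptions; use whatever your cluster actually defines):

<!-- core-site.xml: the DFS Master host and port -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:9000</value>
</property>

<!-- mapred-site.xml: the Map/Reduce Master host and port -->
<property>
  <name>mapred.job.tracker</name>
  <value>master:9001</value>
</property>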

A "DFS Locations" entry appears in the upper-left corner; once Eclipse has connected to the Hadoop cluster normally, the HDFS directory structure is displayed under it.

2.3 Create a new Map/Reduce location

Click the elephant icon (the red box in the screenshot) or right-click and select "New Hadoop location", then fill in the Hadoop cluster information in the dialog that appears. Take care filling in these fields. Because I am using Eclipse on Windows to connect remotely to a fully distributed Hadoop cluster, the host here is the IP address of the master node; for a pseudo-distributed Hadoop, localhost can be filled in. "User name" is the user name of the Windows machine; you can change it via right-click "My Computer" -> "Manage" -> "Local Users and Groups".

After completing the steps above, the Eclipse interface should look as shown. Note that I created the Example1 project myself, mainly to verify that Eclipse can connect remotely to the Hadoop cluster and develop a MapReduce program. Also, operating on HDFS (creating and deleting files) from Eclipse's HDFS view produces the same results as the corresponding operations on the command line.

3. Develop the MapReduce program

3.1 Create a new MapReduce project

Here the benefit of developing with the plugin shows: after completing this step, a MapReduce project template appears in the project view, and we do not have to import the Hadoop jar packages ourselves. The red box is the empty template generated for the new MapReduce project; all we need to do is create a package under the src folder and develop the Java program there.

3.2 Upload the input file

In the remote terminal, upload via the command line:

hadoop fs -put test.txt /input/

or upload the input file /input/test.txt through Eclipse's HDFS view. Its content is as follows:
Liang Ni hao mawo hen haohaqweasasaxcxc vbv xxxx aaa eee
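If you use the command line, the upload can be verified from the remote terminal with the standard Hadoop shell commands (a sketch; the -mkdir line is only needed if /input does not exist yet):

hadoop fs -mkdir /input           # create the target directory if it does not exist yet
hadoop fs -put test.txt /input/   # copy the local file into HDFS
hadoop fs -ls /input              # confirm the file arrived
hadoop fs -cat /input/test.txt    # print its content back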
3.3 The WordCount.java program
package com.hadoop.example1;

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class WordCount {

    // Mapper: emit (word, 1) for every token in the input line
    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reducer (also used as combiner): sum the counts for each word
    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("WordCount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
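This program uses the old org.apache.hadoop.mapred API, which is the natural choice for Hadoop 1.2.1. Incidentally, the job does not have to be started from Eclipse: you can also export the project as a jar and submit it on the master node. A sketch, assuming the exported jar is named wordcount.jar (the jar name is hypothetical):

hadoop jar wordcount.jar com.hadoop.example1.WordCount /input /output_wordcount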

3.4 Run the Example1 project

Note that with this style of development the program is run with "Run on Hadoop": right-click the project -> "Run As" -> "Run on Hadoop". If a dialog pops up asking you to choose an application, it means the selected Java application is not the right one for the project. In that case, right-click the project -> "Run As" -> "Run Configurations", and fill in the program arguments, namely the input file path and the output directory path.
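For example, the two entries in the "Program arguments" box could look like this (a sketch: the host, port, and directory names are assumptions that must match your cluster; note that the output directory must not exist yet, or Hadoop refuses to start the job):

hdfs://master:9000/input hdfs://master:9000/output_wordcount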

If you develop with Eclipse on Linux, the program runs correctly once the steps above succeed. Under Eclipse on Windows, however, the following error appears, because the Hadoop source code checks file permissions in a way that fails on Windows; we therefore have to modify the Hadoop source code.
14/05/29 13:49:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/05/29 13:49:16 ERROR security.UserGroupInformation: PriviledgedActionException as:iscas cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-iscas\mapred\staging\iscas1655603947\.staging to 0700
Exception in thread "main" java.io.IOException: Failed to set permissions of path: \tmp\hadoop-iscas\mapred\staging\iscas1655603947\.staging to 0700
    at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)
    at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
    at org.apache.hadoop.examples.WordCount.main(WordCount.java:82)
3.5 Modify the Hadoop source code so that MapReduce programs can be developed from Eclipse on Windows

The code causing the problem is in hadoop-1.2.1\src\core\org\apache\hadoop\fs\FileUtil.java. Modify the method as follows, commenting out the file-permission check.
private static void checkReturnValue(boolean rv, File p, FsPermission permission)
        throws IOException {
    /* Comment out the following to disable the permission check:
    if (!rv) {
        throw new IOException("Failed to set permissions of path: " + p +
                " to " + String.format("%04o", permission.toShort()));
    }
    */
}
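To apply the change, the modified class has to be recompiled and packed back into the jar. A minimal sketch, run from the hadoop-1.2.1 root directory (assumptions: a Unix-like shell, where the classpath separator is ":" rather than the ";" used on Windows, and that the jars under lib/ resolve all of FileUtil.java's dependencies):

mkdir build
# compile the patched class against the existing Hadoop jars
javac -classpath "hadoop-core-1.2.1.jar:lib/*" -d build src/core/org/apache/hadoop/fs/FileUtil.java
# update the jar in place with every class produced by the compile
jar uf hadoop-core-1.2.1.jar -C build org/apache/hadoop/fs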
Then recompile the modified file as sketched above, pack the .class file back into hadoop-core-1.2.1.jar, and refresh the project. To make this easier for everyone, I provide the modified jar; if needed, you can click to download it and replace the original jar, located in the hadoop-1.2.1 root directory. Repeat step 3.4, and this time the run succeeds.

3.6 View the results

After refreshing the HDFS view you can see the generated output_wordcount folder; the resulting part-00000 file is in that directory and holds the word counts.
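The result can also be printed from the command line (the directory name must match the output path you passed to the job):

hadoop fs -cat /output_wordcount/part-00000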
