Eclipse Remote Debugging Hadoop


Environment requirements: OS: Windows; Eclipse version: Mars; Hadoop version: 2.6.0

Resource requirements: an extracted Hadoop 2.6.0 directory (download the original archive yourself).

  A few caveats before we start:

  For every step below, launch Eclipse by right-clicking it and choosing "Run as administrator"!

  The MapReduce project you create needs log4j configured (at DEBUG level); otherwise the debugging output will not be printed and it will be hard to track down the cause of errors. Configuring log4j is very simple; a quick online search will turn up the relevant configuration.
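  For reference, a minimal log4j.properties sketch that sends DEBUG output to the console (placing the file on the project classpath, e.g. under src/, and the key names follow the standard log4j 1.x conventions; adjust the pattern to taste):

    log4j.rootLogger=DEBUG, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n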

1) First you need to compile the hadoop-eclipse-plugin yourself with Ant (you can also search online and download a prebuilt one). I prefer not to rely on other people's builds, so I compiled my own; you can refer to my other blog post, "Using Apache Ant to compile hadoop-2.6.0-eclipse-plugin", to learn how to compile it yourself.

2) Put the compiled Hadoop plugin into the plugins directory of your Eclipse installation and restart Eclipse.

3) Open Window-->Preferences-->Hadoop Map/Reduce and set the Hadoop installation directory.

4) Open Window-->Show View, find Map/Reduce Locations under MapReduce Tools, and click OK.

5) You will then see the Map/Reduce Locations view in Eclipse's main window.

6) Create a new Hadoop location, set the host and port of the HDFS master node and of YARN, and click Finish.
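  The host and port you enter here should match your cluster configuration. As an illustration (the host name and port are assumptions, chosen to match the hdfs://master:8020 paths used in the WordCount code later), the DFS Master entry corresponds to fs.defaultFS in the cluster's core-site.xml:

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
    </property>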

  

7) At this point you will see the HDFS directory structure under "DFS Locations" in Eclipse's Project Explorer.

  Note: you may run into permission problems (permission denied) when you open this directory tree. This happens because permissions are not configured in Hadoop's hdfs-site.xml (dfs.permissions.enabled defaults to true, which means the HDFS directory tree cannot be accessed by nodes outside the cluster). Set it to false, restart the HDFS service, and then refresh the DFS directory shown above:

    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>

8) Next we create a Map/Reduce project and write a WordCount program. I uploaded Hadoop's README.txt to the /tmp/mrchor/ directory on HDFS and renamed it to readme; the output path is /tmp/mrchor/out.

package com.mrchor.HadoopDev.hadoopDev;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountApp {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, WordCountApp.class.getSimpleName());
        job.setJarByClass(com.mrchor.HadoopDev.hadoopDev.WordCountApp.class);

        // TODO: specify a mapper
        job.setMapperClass(MyMapper.class);
        // TODO: specify a reducer
        job.setReducerClass(MyReducer.class);

        // TODO: specify output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // TODO: specify input and output directories (not files)
        FileInputFormat.setInputPaths(job, new Path("hdfs://master:8020/tmp/mrchor/readme"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:8020/tmp/mrchor/out"));

        if (!job.waitForCompletion(true)) {
            return;
        }
    }

    public static class MyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        Text k2 = new Text();
        LongWritable v2 = new LongWritable();

        @Override
        protected void map(LongWritable key, Text value,
                Mapper<LongWritable, Text, Text, LongWritable>.Context context)
                throws IOException, InterruptedException {
            // Split each line into words and emit (word, 1) pairs.
            String[] split = value.toString().split(" ");
            for (String word : split) {
                k2.set(word);
                v2.set(1);
                context.write(k2, v2);
            }
        }
    }

    public static class MyReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text k2, Iterable<LongWritable> v2s,
                Reducer<Text, LongWritable, Text, LongWritable>.Context context)
                throws IOException, InterruptedException {
            // Sum the counts for each word; sum is local so it resets for every key.
            long sum = 0;
            for (LongWritable one : v2s) {
                sum += one.get();
            }
            context.write(k2, new LongWritable(sum));
        }
    }
}

9) Right-click the class and choose Run As-->Run on Hadoop:

a) Note: this may report the following error:

java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.

This is because the machine where you run Eclipse does not have the Hadoop environment variable configured; you need to set it (an alternative using a JVM system property is sketched after these steps):

i) Right-click "My Computer" (or "This PC"), select Properties, then go to Advanced system settings-->Advanced-->Environment Variables-->System variables.

Create a new HADOOP_HOME variable and point it at the extracted Hadoop-2.6.0 directory.

ii) Restart Eclipse (run as administrator).
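  If you prefer not to set a machine-wide variable, the exception above also names hadoop.home.dir: Hadoop's Shell utility checks that JVM system property before falling back to the HADOOP_HOME environment variable. A minimal sketch (the local path is an assumption; point it at your own extracted directory) is to set the property at the very top of main():

    // Assumption: Hadoop 2.6.0 was extracted to D:/hadoop-2.6.0 on this Windows machine.
    // org.apache.hadoop.util.Shell reads this system property before the HADOOP_HOME env var.
    System.setProperty("hadoop.home.dir", "D:/hadoop-2.6.0");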

10) Continue running the WordCount program ("Run on Hadoop"); it may now report the following error:

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
    at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
    at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
    at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:131)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
    at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:536)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
    at com.mrchor.HadoopDev.hadoopDev.WordCountApp.main(WordCountApp.java:34)

Looking at the source, NativeIO.java contains the following description of this check. It may also be a permissions issue, in which case you could add the current machine's user to the HDFS authorized user group:

    /**
     * Checks whether the current process has desired access rights on
     * the given path.
     *
     * Longer term this native function can be substituted with JDK7
     * function Files#isReadable, isWritable, isExecutable.
     *
     * @param path input path
     * @param desiredAccess ACCESS_READ, ACCESS_WRITE or ACCESS_EXECUTE
     * @return true if access is allowed
     * @throws IOException I/O exception on error
     */

However, there is a more convenient way to work around this: copy this file (NativeIO.java) from the Hadoop source into your MapReduce project under the same package. At run time, classes in your project take precedence over those in the referenced external jars, so your copy is the one that gets loaded:
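  For illustration only (the post itself only says to copy the file): after copying NativeIO.java into the project under its original package, a commonly used local-debugging tweak, which is my assumption rather than something stated above, is to make the Windows access check succeed unconditionally so the failing native call is never reached:

    // Copied into the project as src/org/apache/hadoop/io/nativeio/NativeIO.java,
    // so at run time it shadows the class inside the hadoop-common jar.
    package org.apache.hadoop.io.nativeio;

    // ...the rest of the copied file is left unchanged; only this method inside the
    // nested Windows class is edited (assumption: acceptable for local debugging only):
    public static boolean access(String path, AccessRight desiredAccess)
        throws IOException {
      return true; // skip the native access0() call that fails on Windows
      // original body: return access0(path, desiredAccess.accessRight());
    }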

11) Run the WordCount program again. This time it executes successfully, and the result is:

If you get the result above, the program ran correctly and the output of the MapReduce job is printed. Refresh the DFS directory and you will see two files in /tmp/mrchor/out: _SUCCESS and part-r-00000:

This shows the program ran correctly. At this point, our Eclipse remote debugging of Hadoop is a success! Everybody clap O(∩_∩)O
