Eclipse Remote Debugging Hadoop


Environment requirements: OS: Windows; Eclipse version: Mars; Hadoop version: 2.6.0

Resource requirements: an extracted Hadoop 2.6.0 directory (download the original archive yourself).

  A few caveats before we start:

  For every step below, launch Eclipse by right-clicking it and choosing "Run as administrator"!

  The MapReduce project you create needs log4j configured (at DEBUG level); otherwise the debugging output will not be printed and it will be hard to track down the cause of errors. Configuring log4j is very simple; a quick online search will turn up the relevant configuration.
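  For reference, a minimal log4j.properties sketch that sends DEBUG output to the console (placing the file on the project classpath, e.g. under src/, and the key names follow the standard log4j 1.x conventions; adjust the pattern to taste):

    log4j.rootLogger=DEBUG, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n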

1) First you need to compile the hadoop-eclipse-plugin yourself with Ant (you can also search online and download a prebuilt one). I prefer not to rely on other people's builds, so I compiled my own; you can refer to my other blog post, "Using Apache Ant to compile hadoop-2.6.0-eclipse-plugin", to learn how to compile it yourself.

2) Put the compiled Hadoop plugin into the plugins directory of your Eclipse installation and restart Eclipse.

3) Open Window-->Preferences-->Hadoop Map/Reduce and set the Hadoop installation directory.

4) Open Window-->Show View, find Map/Reduce Locations under MapReduce Tools, and click OK.

5) You will then see the Map/Reduce Locations view in Eclipse's main window.

6) Create a new Hadoop location, set the host and port of the HDFS master node and of YARN, and click Finish.
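  The host and port you enter here should match your cluster configuration. As an illustration (the host name and port are assumptions, chosen to match the hdfs://master:8020 paths used in the WordCount code later), the DFS Master entry corresponds to fs.defaultFS in the cluster's core-site.xml:

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
    </property>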

  

7) At this point you will see the HDFS directory structure under "DFS Locations" in Eclipse's Project Explorer.

  Note: you may run into permission problems (permission denied) when you open this directory tree. This happens because permissions are not configured in Hadoop's hdfs-site.xml (dfs.permissions.enabled defaults to true, which means the HDFS directory tree cannot be accessed by nodes outside the cluster). Set it to false, restart the HDFS service, and then refresh the DFS directory shown above:

    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>

8) Next we create a Map/Reduce project and write a WordCount program. I uploaded Hadoop's README.txt to the /tmp/mrchor/ directory on HDFS and renamed it to readme; the output path is /tmp/mrchor/out.

package com.mrchor.HadoopDev.hadoopDev;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountApp {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, WordCountApp.class.getSimpleName());
        job.setJarByClass(com.mrchor.HadoopDev.hadoopDev.WordCountApp.class);

        // TODO: specify a mapper
        job.setMapperClass(MyMapper.class);
        // TODO: specify a reducer
        job.setReducerClass(MyReducer.class);

        // TODO: specify output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // TODO: specify input and output directories (not files)
        FileInputFormat.setInputPaths(job, new Path("hdfs://master:8020/tmp/mrchor/readme"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:8020/tmp/mrchor/out"));

        if (!job.waitForCompletion(true)) {
            return;
        }
    }

    public static class MyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        Text k2 = new Text();
        LongWritable v2 = new LongWritable();

        @Override
        protected void map(LongWritable key, Text value,
                Mapper<LongWritable, Text, Text, LongWritable>.Context context)
                throws IOException, InterruptedException {
            // Split each line into words and emit (word, 1) pairs.
            String[] split = value.toString().split(" ");
            for (String word : split) {
                k2.set(word);
                v2.set(1);
                context.write(k2, v2);
            }
        }
    }

    public static class MyReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text k2, Iterable<LongWritable> v2s,
                Reducer<Text, LongWritable, Text, LongWritable>.Context context)
                throws IOException, InterruptedException {
            // Sum the counts for each word; sum is local so it resets for every key.
            long sum = 0;
            for (LongWritable one : v2s) {
                sum += one.get();
            }
            context.write(k2, new LongWritable(sum));
        }
    }
}

9) Right-click the class and choose Run As-->Run on Hadoop:

a) Note: this may report the following error:

java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.

This is because the machine where you run Eclipse does not have the Hadoop environment variable configured; you need to set it (an alternative using a JVM system property is sketched after these steps):

i) Right-click "My Computer" (or "This PC"), select Properties, then go to Advanced system settings-->Advanced-->Environment Variables-->System variables.

Create a new HADOOP_HOME variable and point it at the extracted Hadoop-2.6.0 directory.

ii) Restart Eclipse (run as administrator).
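  If you prefer not to set a machine-wide variable, the exception above also names hadoop.home.dir: Hadoop's Shell utility checks that JVM system property before falling back to the HADOOP_HOME environment variable. A minimal sketch (the local path is an assumption; point it at your own extracted directory) is to set the property at the very top of main():

    // Assumption: Hadoop 2.6.0 was extracted to D:/hadoop-2.6.0 on this Windows machine.
    // org.apache.hadoop.util.Shell reads this system property before the HADOOP_HOME env var.
    System.setProperty("hadoop.home.dir", "D:/hadoop-2.6.0");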

10) Continue running the WordCount program ("Run on Hadoop"); it may now report the following error:

Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
    at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
    at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
    at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:131)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
    at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:536)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
    at com.mrchor.HadoopDev.hadoopDev.WordCountApp.main(WordCountApp.java:34)

Looking at the source, NativeIO.java contains the following description of this check. It may also be a permissions issue, in which case you could add the current machine's user to the HDFS authorized user group:

    /**
     * Checks whether the current process has desired access rights on
     * the given path.
     *
     * Longer term this native function can be substituted with JDK7
     * function Files#isReadable, isWritable, isExecutable.
     *
     * @param path input path
     * @param desiredAccess ACCESS_READ, ACCESS_WRITE or ACCESS_EXECUTE
     * @return true if access is allowed
     * @throws IOException I/O exception on error
     */

However, there is a more convenient way to work around this: copy this file (NativeIO.java) from the Hadoop source into your MapReduce project under the same package. At run time, classes in your project take precedence over those in the referenced external jars, so your copy is the one that gets loaded:
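  For illustration only (the post itself only says to copy the file): after copying NativeIO.java into the project under its original package, a commonly used local-debugging tweak, which is my assumption rather than something stated above, is to make the Windows access check succeed unconditionally so the failing native call is never reached:

    // Copied into the project as src/org/apache/hadoop/io/nativeio/NativeIO.java,
    // so at run time it shadows the class inside the hadoop-common jar.
    package org.apache.hadoop.io.nativeio;

    // ...the rest of the copied file is left unchanged; only this method inside the
    // nested Windows class is edited (assumption: acceptable for local debugging only):
    public static boolean access(String path, AccessRight desiredAccess)
        throws IOException {
      return true; // skip the native access0() call that fails on Windows
      // original body: return access0(path, desiredAccess.accessRight());
    }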

11) Run the WordCount program again. This time it executes successfully, and the result is:

If you get the result above, the program ran correctly and the output of the MapReduce job is printed. Refresh the DFS directory and you will see two files in /tmp/mrchor/out: _SUCCESS and part-r-00000:

This shows the program ran correctly. At this point, our Eclipse remote debugging of Hadoop is a success! Everybody clap O(∩_∩)O
