First install and start Hadoop; see http://www.cnblogs.com/wuxun1997/p/6847950.html for how. This post shows how to set up the IDE for Hadoop development. First make sure Eclipse is installed locally, then add the Eclipse Hadoop plugin and you are done. Here are the steps:
1. Download the Eclipse plugin from http://download.csdn.net/detail/wuxun1997/9841487, drop it into Eclipse's plugins directory, and restart Eclipse; "DFS Locations" appears in Project Explorer.
2. Click Window -> Preferences -> Hadoop Map/Reduce, fill in D:\hadoop-2.7.2, and click OK.
3. Click Window -> Show View -> MapReduce Tools -> Map/Reduce Locations, then click the small elephant icon with the + sign ("New Hadoop location") in the top-right corner. Eclipse fills in default parameters, but the following need to be changed to match the core-site.xml and hdfs-site.xml configured earlier:
General -> Map/Reduce (V2) Master -> Port: change to 9001
General -> DFS Master -> Port: change to 9000
Advanced Parameters -> dfs.datanode.data.dir: change to file:/hadoop/data/dfs/datanode
Advanced Parameters -> dfs.namenode.name.dir: change to file:/hadoop/data/dfs/namenode
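The two ports must agree with what the cluster itself was started with. Assuming Hadoop 2.7.2 was configured as in the linked post (host names and exact values may differ on your machine), the corresponding fragments of core-site.xml and hdfs-site.xml would look roughly like this:

```xml
<!-- core-site.xml: the plugin's "DFS Master" port (9000) comes from fs.defaultFS -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>

<!-- hdfs-site.xml: the directories the plugin's Advanced Parameters must match -->
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/hadoop/data/dfs/namenode</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/hadoop/data/dfs/datanode</value>
</property>
```

If the plugin's ports or directories disagree with these files, DFS Locations will fail to connect or will show an empty tree.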
4. Click Finish, then in DFS Locations click the small triangle icon on the left; an hdfs folder appears, and you can operate on HDFS directly from here. Right-click the file icon and select "Create new directory" to add a directory; right-click the folder icon again and choose Refresh to see the new result. At this point the new directory is also visible at localhost:50070 -> Utilities -> Browse the file system.
5. Create a Hadoop project: File -> New -> Project -> Map/Reduce Project -> Next -> enter your own project name, such as hadoop, and click Finish.
6. The code below is a common word-segmentation example: it counts the names in a Chinese novel and sorts them in descending order of frequency. A word-segmentation jar (IK Analyzer) must be imported; download it from http://download.csdn.net/detail/wuxun1997/9841659. The project structure is as follows:
Hadoop
|--src
|--com.wulinfeng.hadoop.wordsplit
|--WordSplit.java
|--IKAnalyzer.cfg.xml
|--myext.dic
|--mystopword.dic
WordSplit.java
package com.wulinfeng.hadoop.wordsplit;

import java.io.IOException;
import java.io.StringReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.InverseMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.wltea.analyzer.core.IKSegmenter;
import org.wltea.analyzer.core.Lexeme;

public class WordSplit {

    /**
     * Map: tokenize each line with IK Analyzer
     */
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringReader input = new StringReader(value.toString());
            IKSegmenter ikSeg = new IKSegmenter(input, true); // smart segmentation
            for (Lexeme lexeme = ikSeg.next(); lexeme != null; lexeme = ikSeg.next()) {
                this.word.set(lexeme.getLexemeText());
                context.write(this.word, one);
            }
        }
    }

    /**
     * Reduce: sum the counts for each word
     */
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            this.result.set(sum);
            context.write(key, this.result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String inputFile = "/input/people.txt"; // input file
        Path outDir = new Path("/out"); // output directory
        Path tempDir = new Path("/tmp" + System.currentTimeMillis()); // temp directory

        // First job: word segmentation and counting
        System.out.println("start task...");
        Job job = Job.getInstance(conf, "word split");
        job.setJarByClass(WordSplit.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(inputFile));
        FileOutputFormat.setOutputPath(job, tempDir);

        // The first job's output becomes the second job's input; then start the sort job
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        if (job.waitForCompletion(true)) {
            System.out.println("start sort...");
            Job sortJob = Job.getInstance(conf, "word sort");
            sortJob.setJarByClass(WordSplit.class);
            sortJob.setMapperClass(InverseMapper.class);
            sortJob.setInputFormatClass(SequenceFileInputFormat.class);

            // Swap the map key and value so the words can be sorted by frequency, descending
            sortJob.setMapOutputKeyClass(IntWritable.class);
            sortJob.setMapOutputValueClass(Text.class);
            sortJob.setSortComparatorClass(IntWritableDecreasingComparator.class);
            sortJob.setNumReduceTasks(1);

            // Write the final result to the out directory
            sortJob.setOutputKeyClass(IntWritable.class);
            sortJob.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(sortJob, tempDir);

            // If an out directory already exists, delete it first
            FileSystem fileSystem = outDir.getFileSystem(conf);
            if (fileSystem.exists(outDir)) {
                fileSystem.delete(outDir, true);
            }
            FileOutputFormat.setOutputPath(sortJob, outDir);

            if (sortJob.waitForCompletion(true)) {
                System.out.println("finish and quit...");
                // Delete the temp directory
                fileSystem = tempDir.getFileSystem(conf);
                if (fileSystem.exists(tempDir)) {
                    fileSystem.delete(tempDir, true);
                }
                System.exit(0);
            }
        }
    }

    /**
     * Comparator implementing descending order
     */
    private static class IntWritableDecreasingComparator extends IntWritable.Comparator {
        public int compare(WritableComparable a, WritableComparable b) {
            return -super.compare(a, b);
        }

        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            return -super.compare(b1, s1, l1, b2, s2, l2);
        }
    }
}
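The second job's trick is worth isolating: InverseMapper swaps each (word, count) pair to (count, word), and IntWritableDecreasingComparator negates the normal key comparison so the shuffle sorts high counts first. A minimal plain-Java sketch of the same idea, without Hadoop (the class and method names here are illustrative, not part of any API):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DescendingSortSketch {
    /**
     * Invert (word -> count) to (count -> words) and emit "count\tword"
     * lines in descending count order, mirroring what InverseMapper plus
     * the decreasing comparator produce in the sort job.
     */
    public static List<String> sortByCountDesc(Map<String, Integer> counts) {
        // A reverse-ordered TreeMap plays the role of the negated comparator.
        TreeMap<Integer, List<String>> inverted = new TreeMap<>(Comparator.reverseOrder());
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            inverted.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> e : inverted.entrySet()) {
            for (String word : e.getValue()) {
                result.add(e.getKey() + "\t" + word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("Hou Liangping", 5);
        counts.put("Gao Yuliang", 3);
        counts.put("Qi Tongwei", 4);
        System.out.println(sortByCountDesc(counts));
    }
}
```

In the real job the inversion and the sort happen in the shuffle phase across the cluster; this sketch only shows why negating the comparator yields the descending part-r-00000 output.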
IKAnalyzer.cfg.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extended configuration</comment>
    <!-- The user can configure their own extension dictionary here -->
    <entry key="ext_dict">myext.dic</entry>
    <!-- The user can configure their own extension stopword dictionary here -->
    <entry key="ext_stopwords">mystopword.dic</entry>
</properties>
myext.dic
Gao Yuliang, Qi Tongwei, Chen Hai, Chen Yanshi, Hou Liangping, Gao Xiaoqin, Sha Ruijin (character names from the drama; in the actual file they are the Chinese names, one entry per line)
mystopword.dic
you, me, he, is (common pronouns and particles to filter out; in the actual file they are the Chinese words, one entry per line)
Now run the WordSplit class directly in Eclipse: right-click -> Run As -> Run on Hadoop. Because the input file path is hard-coded in the class, create an input directory on the D drive and put a file named people.txt in it; here it is the text of the hit TV drama "In the Name of the People" pulled off the web. For segmentation to work, open people.txt in Notepad++ and set Encoding -> Encode in UTF-8 without BOM. Put the names you do not want split apart into myext.dic, and the verbs and particles you want filtered out into mystopword.dic. When the run finishes, look at the part-r-00000 file under D:\out to see who the protagonist is.
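The "UTF-8 without BOM" requirement matters because a byte-order mark would be read as part of the first line and pollute the first token's count. A small helper sketch for detecting and stripping a UTF-8 BOM before feeding text to the tokenizer (the class and method names are illustrative, not from any library):

```java
import java.nio.charset.StandardCharsets;

public class BomCheck {
    /** Remove a leading UTF-8 BOM (EF BB BF) if present; otherwise return the bytes unchanged. */
    public static byte[] stripUtf8Bom(byte[] bytes) {
        if (bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF) {
            byte[] out = new byte[bytes.length - 3];
            System.arraycopy(bytes, 3, out, 0, out.length);
            return out;
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(new String(stripUtf8Bom(withBom), StandardCharsets.UTF_8)); // prints "hi"
    }
}
```

Saving from Notepad++ as described avoids the problem entirely; the helper is only useful if you cannot control how the input file was produced.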
Eclipse configuration of a Hadoop 2.7.2 development environment