Hadoop: MapReduce Programming - WordCount Word Count - Eclipse/Java Environment


Before moving on to writing Python programs with Hadoop Streaming, this post summarizes how to configure the Eclipse/Java development environment for Hadoop and walks through running a WordCount example.

  One: Download the Eclipse installation package and the Hadoop plugin

1 Go to the official website and download the Linux version of the Eclipse installation package (for convenience, a copy has also been uploaded to CSDN; URL:

2 Download the plugin: hadoop-eclipse-plugin-2.6.0.jar

  Two: Install Eclipse and the Hadoop plugin

1 Extract Eclipse to the path /usr/local/eclipse

2 Copy the plugin into Eclipse's plugins directory: /usr/local/eclipse/plugins/hadoop-eclipse-plugin-2.6.0.jar

3 Start Eclipse:

/usr/local/eclipse/eclipse -clean

  Three: Configure Eclipse's Hadoop environment

1 Select Preferences under the Window menu.

Configure the Hadoop installation path as /usr/local/hadoop.

2 Switch to the Map/Reduce development view: under the Window menu, select Open Perspective -> Other -> Map/Reduce.

3 Establish a connection to the Hadoop cluster: click the Map/Reduce Locations panel in the lower-right corner of Eclipse, right-click inside the panel, and select New Hadoop location.

4 Check the result. One benefit of the plugin is that it lets you browse the file system visually instead of only through shell commands; in practice the two approaches work well together.


  Four: Run the WordCount example

1 Create a project: click the File menu, select New Project, choose Map/Reduce Project, click Next, fill in WordCount as the project name, and click Finish.

2 Create a class: right-click the WordCount project you just created, choose New -> Class, and fill in two fields: org.apache.hadoop.examples for the package and WordCount for the name.

3 Fill in the code:

package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
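Before submitting the job to a cluster, the counting logic it implements can be checked locally with plain JDK collections. The sketch below mirrors what TokenizerMapper and IntSumReducer do together (split on whitespace, sum a count of 1 per token); the class name and input string are illustrative, and no Hadoop dependency is needed:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class LocalWordCount {
    // Mirrors TokenizerMapper + IntSumReducer: tokenize on whitespace,
    // then sum the per-token counts.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count("hello world hello hadoop");
        System.out.println(counts.get("hello"));  // 2
        System.out.println(counts.get("world"));  // 1
    }
}
```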

4 Before running, enter the following commands in a terminal; they make the program use the Hadoop (HDFS) file system instead of the local default, and suppress log warnings:

cp /usr/local/hadoop/etc/hadoop/core-site.xml ~/workspace/WordCount/src
cp /usr/local/hadoop/etc/hadoop/hdfs-site.xml ~/workspace/WordCount/src
cp /usr/local/hadoop/etc/hadoop/log4j.properties ~/workspace/WordCount/src
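Copying core-site.xml into the project works because it carries the fs.defaultFS setting, which tells the client to talk to HDFS rather than the local file system. A typical pseudo-distributed core-site.xml looks roughly like this (the host and port are illustrative and must match your own cluster's configuration):

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```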


5 Set the run parameters, i.e. the input and output. Note that these are HDFS paths, specifically /user/hadoop/input and /user/hadoop/output.

6 View the results in the output directory of the file system.

Reference: http://www.powerxing.com/hadoop-build-project-using-eclipse/ (the screenshots come from that blog; it was too much trouble to retake them).

