Analyzing Hadoop's hadoop-mapreduce-examples-2.7.0.jar

Source: Internet
Author: User

The previous two posts used this jar to test Hadoop, so it is worth analyzing its source code.

Before analyzing the source, it helps to write a WordCount of our own, as follows:

package mytest;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Emits (word, 1) for every whitespace-separated token in a line.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Sums the counts for each word; also used as the combiner.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // args[0] is the input path, args[1] is the output path.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
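
To see what the mapper and reducer above compute without starting a cluster, the same data flow can be imitated in plain Java. This is only a sketch: the class name WordCountSketch and its methods are made up for illustration, it uses no Hadoop classes, and it runs everything in one process, so it says nothing about how Hadoop actually schedules or shuffles data.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java imitation of the WordCount data flow (hypothetical helper, not Hadoop code).
public class WordCountSketch {

    // "Map" step: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<String, Integer>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // "Shuffle" and "reduce" steps: group the pairs by word and sum each group.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Run both steps over a small in-memory input.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(countWords(Arrays.asList("hello hadoop", "hello world")));
    }
}
```

Because summing is associative, the same reduce logic can be applied to partial results first, which is why the real job can reuse IntSumReducer as its combiner.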

Looking up the relevant sources shows that two jars are needed to compile it: hadoop-common-2.7.0.jar and hadoop-mapreduce-client-core-2.7.0.jar.

After exporting a runnable jar from MyEclipse, run:

~/hadoop-2.7.0/bin/hadoop jar My.jar mytest.WordCount /user/hadoop/input /user/hadoop/output3

The test succeeds.

Because the class declares "package mytest", it has to be referred to by its fully qualified name, mytest.WordCount, when it is run.
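
The reason the package prefix matters is that Java resolves classes by their full name, not by the short name written after "class". The sketch below shows this with a JDK class; QualifiedNameSketch and canLoad are names invented for this illustration, and the example only assumes that no class called StringTokenizer exists in the default package on the classpath.

```java
import java.util.StringTokenizer;

// Illustrative only: shows that a class can be loaded by its fully qualified name
// but not by its simple name.
public class QualifiedNameSketch {

    // Returns true if the class loader can resolve the given name.
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(StringTokenizer.class.getSimpleName()); // StringTokenizer
        System.out.println(StringTokenizer.class.getName());       // java.util.StringTokenizer
        System.out.println(canLoad("java.util.StringTokenizer"));  // true
        System.out.println(canLoad("StringTokenizer"));            // false
    }
}
```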


Thinking back, the commands run in the earlier posts did not include anything like a "mytest." prefix, yet they ran successfully. Let's look at the source code to see why. Run:

find ~/ -name '*hadoop-mapreduce-examples*'

The output is:

/home/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar
/home/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.0-sources.jar
/home/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.0-test-sources.jar
/home/hadoop/hadoop-2.7.0/share/doc/hadoop/hadoop-mapreduce-examples


Unzip hadoop-mapreduce-examples-2.7.0-sources.jar and import it into MyEclipse to view the source code.

Searching for the string "grep" shows that it appears in ExampleDriver.java, which looks like the entry point of this jar.

So how does a runnable jar know that this file is its entry point? After extracting the runnable jar, the following line is found in META-INF/MANIFEST.MF:

Main-Class: org.apache.hadoop.examples.ExampleDriver

So a runnable jar can be configured with a default entry point. You can set this default entry point when exporting a jar from MyEclipse.
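
The manifest entry can also be inspected from code with the JDK's java.util.jar classes. The following is a minimal sketch, not how any particular launcher is implemented: MainClassSketch, buildJar and readMainClass are hypothetical names, and the jar is built in memory so the example does not depend on any file on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Illustrative only: writes a Main-Class attribute into a jar manifest and reads it back.
public class MainClassSketch {

    // Build a tiny in-memory jar whose manifest declares the given Main-Class.
    static byte[] buildJar(String mainClass) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the main attributes are not written.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // The constructor already wrote META-INF/MANIFEST.MF; no other entries are added.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Read the Main-Class attribute back, or return null if the jar has none.
    static String readMainClass(byte[] jarBytes) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] jar = buildJar("org.apache.hadoop.examples.ExampleDriver");
        System.out.println("Main-Class: " + readMainClass(jar));
    }
}
```

When a jar declares Main-Class, a launcher can start that class without being told its name, which would explain why the earlier commands needed no class name on the command line.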

Import ExampleDriver.java into your own project, modify it, and test it. Run:

~/hadoop-2.7.0/bin/hadoop jar My.jar wordcount /user/hadoop/input /user/hadoop/output4
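
The idea behind such a driver class is that the first argument is a short program name and the remaining arguments are handed to the program registered under that name. The sketch below imitates that idea in plain Java. It is an assumption-laden illustration, not the ExampleDriver source: DriverSketch, addProgram and run are invented names, and the registered programs only print what they would do.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Illustrative only: dispatches to a registered program by its short name.
public class DriverSketch {

    // Hypothetical registry: short program name -> program body returning an exit code.
    private final Map<String, Function<String[], Integer>> programs = new TreeMap<>();

    void addProgram(String name, Function<String[], Integer> program) {
        programs.put(name, program);
    }

    // Look up args[0] and pass the remaining arguments to that program.
    int run(String[] args) {
        if (args.length == 0 || !programs.containsKey(args[0])) {
            System.out.println("Valid program names are: " + programs.keySet());
            return -1;
        }
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        return programs.get(args[0]).apply(rest);
    }

    public static void main(String[] args) {
        DriverSketch driver = new DriverSketch();
        driver.addProgram("wordcount", a -> {
            System.out.println("would count words from " + a[0] + " into " + a[1]);
            return 0;
        });
        driver.addProgram("grep", a -> {
            System.out.println("would search the input for a pattern");
            return 0;
        });
        int exitCode = driver.run(
            new String[] {"wordcount", "/user/hadoop/input", "/user/hadoop/output4"});
        System.out.println("exit code: " + exitCode);
    }
}
```

With this layout, adding a new example only requires registering one more name, and the jar keeps a single entry point.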


Many of the details are best understood by reading the source code itself; particular parts can be analyzed more carefully later.


Tip: analyzing the logs shows where print output ends up.

System.out.println (syso) calls in map and reduce are written to the task logs.

System.out.println calls in main are printed to the screen.

