In the first two posts we tested Hadoop code that uses this jar, so now it is time to analyze its source code. Before analyzing the source, it helps to first write a WordCount of our own, as follows:
package mytest;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
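To see the map step in isolation, here is a minimal plain-Java sketch (TokenizeDemo is a name introduced here just for illustration) that reproduces the tokenizing logic of TokenizerMapper without any Hadoop dependency, so it can be run and inspected locally:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// A plain-Java sketch of what TokenizerMapper does to one input line:
// split it on whitespace, then emit a (word, 1) pair per token.
public class TokenizeDemo {

    // Returns the tokens the mapper would emit for one line of input.
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            words.add(itr.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        for (String word : tokenize("hello world hello hadoop")) {
            System.out.println(word + "\t1"); // the (key, value) pair the mapper writes
        }
    }
}
```

The framework then groups identical keys, so IntSumReducer receives, e.g., ("hello", [1, 1]) and sums the list.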
Retrieving the relevant classes shows that two jars are needed: hadoop-common-2.7.0.jar and hadoop-mapreduce-client-core-2.7.0.jar.
After exporting it as a runnable jar with MyEclipse, execute:
~/hadoop-2.7.0/bin/hadoop jar My.jar mytest.WordCount /user/hadoop/input /user/hadoop/output3
The test succeeds.
Because the class is declared in package mytest, it must be referred to as mytest.WordCount when executing the jar.
Recalling the earlier runs, however, the command succeeded without any such mytest. prefix. Let's look at the source code to see why. Run:
find ~/ -name "*hadoop-mapreduce-examples*"
The output is:
/home/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar
/home/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.0-sources.jar
/home/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.0-test-sources.jar
/home/hadoop/hadoop-2.7.0/share/doc/hadoop/hadoop-mapreduce-examples
Unzip hadoop-mapreduce-examples-2.7.0-sources.jar and import it into MyEclipse to view the source code.
Searching for the string "grep", we find it appears in ExampleDriver.java, and this file looks like the entry point of the jar.
So how does a runnable jar determine its entry point? Extracting the runnable jar, we find the following line in META-INF/MANIFEST.MF:
Main-Class: org.apache.hadoop.examples.ExampleDriver
So a runnable jar can be configured with a default entry class, and you can set this default entry when exporting the jar from MyEclipse.
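The real ExampleDriver registers each example class with org.apache.hadoop.util.ProgramDriver, which maps a program name to a class. The following simplified, self-contained sketch (MiniDriver is a hypothetical name introduced here) illustrates the same dispatch idea without any Hadoop dependency: the first command-line argument selects a program, and the remaining arguments are passed through to it.

```java
import java.util.Arrays;

// A simplified sketch of an entry class like ExampleDriver: the first
// command-line argument names a program, the remaining arguments are
// handed to it. The real driver delegates this to ProgramDriver.
public class MiniDriver {

    // Returns a description of the dispatch decision (kept as a string
    // here so the logic is easy to test without a cluster).
    static String dispatch(String[] args) {
        if (args.length == 0) {
            return "Usage: <program-name> [args...]";
        }
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        switch (args[0]) {
            case "wordcount":
                // In a real driver this would invoke WordCount.main(rest).
                return "run WordCount with " + Arrays.toString(rest);
            default:
                return "Unknown program: " + args[0];
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(args));
    }
}
```

This is why the examples jar can be run as "hadoop jar ... wordcount in out" with no package-qualified class name: the manifest points at the driver, and the driver resolves "wordcount" itself.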
Import ExampleDriver.java into your own project, revise it, and test it. Then execute:
~/hadoop-2.7.0/bin/hadoop jar My.jar wordcount /user/hadoop/input /user/hadoop/output4
Many details are best understood by reading the source code itself; anything unusual can then be analyzed carefully.
Tip: analyzing the logs reveals the following.
System.out.println calls inside map and reduce are written to the task logs.
System.out.println calls in main are printed to the screen.