High-Availability Hadoop Platform: Setting Sail

1. Overview

In the previous post, "Configure a High-Availability Hadoop Platform," we built the platform, ready to navigate the big-data ocean aboard the great ship of Hadoop. As the saying goes, 工欲善其事，必先利其器: to do good work, one must first sharpen one's tools. Yes, we need a development tool (an IDE) for our development. In this article I will explain how to set up and use the development environment, then write and walk through the WordCount example, as an entry point for newcomers about to set out on the Hadoop ocean. Last time, in "Website Log Statistics: Case Analysis and Implementation," I said I would publish the source code to GitHub. I have since thought it over and decided to make "High-Availability Hadoop Platform" a series: on top of this platform, I will write separate articles detailing each concrete implementation process, the problems encountered along the way, and their solutions. Let's start today's voyage.

2. Sailing

IDE: JBoss Developer Studio 8.0.0.GA (an Eclipse-based IDE from Red Hat)

JDK: 1.7 (or 1.8)

hadoop2x-eclipse-plugin: this plugin is quite useful for local unit testing or for your own research

Plugin: https://github.com/smartdengjie/hadoop2x-eclipse-plugin

Since JBoss Developer Studio 8 basically supports Retina screens, we use JBoss Developer Studio 8 here; JBoss Developer Studio 7 does not support Retina screens very well, but that is not our topic and will not be discussed further.

A screenshot of the IDE:

2.1 Installing Plugins

Let's start by installing the plugin. The interface on first launch appears as shown below:

Then we go to the GitHub address above and clone the entire project. It contains both a compiled jar and the source code, so you can choose between using the prebuilt jar and building the version you need yourself; here I use the compiled version directly. We put the jar into the IDE's plugins directory, as shown below:

Then we restart the IDE. If the interface appears as shown below, the plugin was added successfully; if not, check the IDE's boot log and locate the cause from the exception stack trace.

2.2 Setting up a Hadoop plug-in

The configuration information is as follows (illustrated in the figures below):

  Add a local Hadoop source directory:
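The plugin's location settings boil down to the addresses of the cluster's DFS and MapReduce masters. If you want to double-check those values outside the plugin, a quick programmatic test can list the HDFS root. This is only a minimal sketch under assumptions: HdfsSmokeTest is a hypothetical class name, and hdfs://nna:8020 is a placeholder for your own cluster's fs.defaultFS.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/* Minimal sketch: verify the DFS master settings used in the plugin
 * by listing the HDFS root. "hdfs://nna:8020" is a placeholder; replace
 * it with the fs.defaultFS of your own cluster. */
public class HdfsSmokeTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://nna:8020"); // placeholder address
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath()); // print top-level HDFS entries
        }
        fs.close();
    }
}

If the listing prints the top-level HDFS directories, the same host and port should work in the plugin's DFS master fields.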

At this point, the IDE and plugin setup is complete. Next, we move on to some simple development. The Hadoop source provides many examples to learn from; here I take WordCount as an example:

3. WordCount

First, let's look at the Hadoop source file directory, as shown below:

3.1 Source Code Interpretation

package cn.hdfs.mr.example;

import java.io.IOException;
import java.util.Random;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import cn.hdfs.utils.ConfigUtils;

/**
 * @author Dengjie
 * @date March 13, 2015
 * @description The WordCount example is a classic MapReduce example and can be
 *              called the Hadoop version of Hello World. It splits the words in
 *              a file (the map phase, followed by shuffle and sort), then sums
 *              the counts (the reduce phase) and writes the result to HDFS.
 *              This is the basic flow.
 */
public class WordCount {

    private static Logger log = LoggerFactory.getLogger(WordCount.class);

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        /*
         * Source file:  a b b
         *
         * Map output:   a 1
         *               b 1
         *               b 1
         */
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString()); // read the full line
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken()); // split words on whitespace
                context.write(word, one);  // count each word occurrence as 1
            }
        }
    }

    /*
     * Reduce input:  a 1     Reduce output:  a 1
     *                b 1                     b 2
     *                b 1
     */
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) throws Exception {
        Configuration conf1 = new Configuration();
        Configuration conf2 = new Configuration();
        long random1 = new Random().nextLong(); // random suffix for output directory 1
        long random2 = new Random().nextLong(); // random suffix for output directory 2
        log.info("random1 -> " + random1 + ", random2 -> " + random2);

        Job job1 = new Job(conf1, "word count1");
        job1.setJarByClass(WordCount.class);
        job1.setMapperClass(TokenizerMapper.class);  // class that performs the map step
        job1.setCombinerClass(IntSumReducer.class);  // combiner class
        job1.setReducerClass(IntSumReducer.class);   // reduce class
        job1.setOutputKeyClass(Text.class);          // output key type
        job1.setOutputValueClass(IntWritable.class); // output value type

        Job job2 = new Job(conf2, "word count2");
        job2.setJarByClass(WordCount.class);
        job2.setMapperClass(TokenizerMapper.class);
        job2.setCombinerClass(IntSumReducer.class);
        job2.setReducerClass(IntSumReducer.class);
        job2.setOutputKeyClass(Text.class);
        job2.setOutputValueClass(IntWritable.class);

        // FileInputFormat.addInputPath(job, new
        //         Path(String.format(ConfigUtils.HDFS.WORDCOUNT_IN, "test.txt")));

        // specify the input path
        FileInputFormat.addInputPath(job1, new Path(String.format(ConfigUtils.HDFS.WORDCOUNT_IN, "word")));
        // specify the output path
        FileOutputFormat.setOutputPath(job1, new Path(String.format(ConfigUtils.HDFS.WORDCOUNT_OUT, random1)));
        FileInputFormat.addInputPath(job2, new Path(String.format(ConfigUtils.HDFS.WORDCOUNT_IN, "word")));
        FileOutputFormat.setOutputPath(job2, new Path(String.format(ConfigUtils.HDFS.WORDCOUNT_OUT, random2)));

        // run both MR jobs, then exit the application according to their status
        boolean flag1 = job1.waitForCompletion(true);
        boolean flag2 = job2.waitForCompletion(true);
        if (flag1 && flag2) {
            System.exit(0);
        } else {
            System.exit(1);
        }
    }
}
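One thing worth pointing out: the example reads its input and output locations from cn.hdfs.utils.ConfigUtils, a small helper class of the author's that is not shown in this article. Judging from the calls above, it only needs to expose HDFS path templates that String.format fills in. Below is a minimal sketch under that assumption; the concrete paths are hypothetical, not the author's actual values.

package cn.hdfs.utils;

/* Minimal sketch of the helper used by WordCount. The real class is not
 * shown in the article; these path templates are assumptions that simply
 * match how String.format is applied above ("%s" is replaced by the input
 * file name or the random output-directory suffix). */
public class ConfigUtils {
    public static class HDFS {
        // e.g. String.format(WORDCOUNT_IN, "word") -> "/wordcount/in/word"
        public static final String WORDCOUNT_IN = "/wordcount/in/%s";
        // e.g. String.format(WORDCOUNT_OUT, random1) -> "/wordcount/out/<random1>"
        public static final String WORDCOUNT_OUT = "/wordcount/out/%s";
    }
}

With such a helper in place and the plugin configured as in section 2.2, you can run main directly from the IDE and watch the two randomly suffixed output directories appear under DFS Locations.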
4. Summary

That is all I will share with everyone in this article. If you run into any problems in the course of your research, you can join the group discussion or send me an email, and I will do my best to answer you. May we encourage each other!
