The cluster has one master and two slaves; their IPs are 192.168.1.2, 192.168.1.3, and 192.168.1.4. The Hadoop version is 1.2.1.
First, start Hadoop
Go into the Hadoop bin directory and start the cluster, as shown below.
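On Hadoop 1.x the whole cluster is typically brought up with the start-all.sh script in bin; jps then confirms the daemons are running. A minimal sketch (the install path is an assumption, adjust to your setup):

cd /usr/local/hadoop/bin    # hypothetical install path
./start-all.sh              # starts the NameNode, DataNode, JobTracker and TaskTracker daemons
jps                         # verify the daemons came up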
Second, create the data file and upload it to HDFS
1. Under the /home/hadoop directory, create a folder named file, and inside file create a data file named hadoop_02
cd /home/hadoop
mkdir file
cd file
2. Write the data:
The data format is:
2012-3-1 a
2012-3-2 b
2012-3-3 c
2012-3-4 d
2012-3-5 a
2012-3-6 b
2012-3-7 c
2012-3-3 c
You can paste the data repeatedly so that the data set grows larger; a shell sketch follows below.
(No data to practice Hadoop with? Crawl some with Nutch or a commercial crawler, or generate synthetic data as needed.)
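As a sketch of that paste-and-repeat step, the data file can be created and inflated from the shell (the records match the sample above; the repeat count is arbitrary):

cd /home/hadoop/file
cat > hadoop_02 << 'EOF'
2012-3-1 a
2012-3-2 b
2012-3-3 c
2012-3-4 d
2012-3-5 a
2012-3-6 b
2012-3-7 c
2012-3-3 c
EOF
# double the file a few times to grow the data set
for i in 1 2 3; do cat hadoop_02 hadoop_02 > tmp && mv tmp hadoop_02; done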
3. Upload to HDFS
(1) If HDFS has no input directory, create one:
hadoop fs -mkdir input
(2) View the HDFS file listing:
hadoop fs -ls
(3) Upload hadoop_02 to input:
hadoop fs -put ~/file/hadoop_02 input
(4) View the files under input:
hadoop fs -ls input
4. In Eclipse, view the hadoop_02 file just uploaded to HDFS; its contents are as follows:
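The same check can also be done from the shell (assuming the upload from step 3):

hadoop fs -cat input/hadoop_02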
5. Create a MapReduce project and write the code.
The deduplication code is as follows:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class Dedup {

    // Map copies each input line into the key of the output record and emits it directly
    public static class Map extends Mapper<Object, Text, Text, Text> {
        private static Text line = new Text(); // one line of data

        // Implement the map function
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            line = value;
            context.write(line, new Text(""));
        }
    }

    // Reduce copies the input key into the output key and emits it directly;
    // duplicate lines share a key, so they collapse into a single record
    public static class Reduce extends Reducer<Text, Text, Text, Text> {
        // Implement the reduce function
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            context.write(key, new Text(""));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // This line is crucial: it points the job at the cluster's JobTracker
        conf.set("mapred.job.tracker", "192.168.1.2:9001");

        String[] ioArgs = new String[] { "dedup_in", "dedup_out" };
        String[] otherArgs = new GenericOptionsParser(conf, ioArgs).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: Data Deduplication <in> <out>");
            System.exit(2);
        }

        Job job = new Job(conf, "Data Deduplication");
        job.setJarByClass(Dedup.class);

        // Set the map, combine, and reduce processing classes
        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);

        // Set the output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Set the input and output directories
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
6. Run the code.
Right-click the class and run it on the Hadoop cluster (via the Eclipse Hadoop plugin's Run on Hadoop option).
Set the input and output HDFS paths.
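Note that as written, main() ignores the command-line arguments and hands ioArgs to GenericOptionsParser, so the job always reads the HDFS directory dedup_in and writes dedup_out. A sketch of lining the data up and submitting the job from the shell instead of Eclipse (the jar name is an assumption):

hadoop fs -mv input dedup_in     # move the uploaded data to the directory the job reads
hadoop jar dedup.jar Dedup       # submit the packaged job to the cluster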
Part of the console output is as follows:
Look at the result file in the output directory; the deduplicated results are as follows:
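From the shell, the deduplicated output can be read directly (part-r-00000 is the default reduce output file name):

hadoop fs -cat dedup_out/part-r-00000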
7. Shut down Hadoop:
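The counterpart of start-all.sh in the bin directory brings the cluster down:

cd /usr/local/hadoop/bin    # same hypothetical install path as in the start step
./stop-all.sh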
This completes the example.
Original link: http://www.cnblogs.com/baolibin528/p/4004707.html