After a development environment is set up on a Mac, the first thing to do is to find a "hello world" program to practice on. The hello world program of the Hadoop world is the WordCount program below.
1. Create a project
Step: File -> New -> Other -> Map/Reduce Project
The project name can be anything you like, for example MapReduceSample. Then create a new WordCount.java class with the following code:
package com.lifeware.test;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    /**
     * @param args
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
2. Data Preparation
To run the program, we need an input folder and an output folder. The output folder is generated automatically after the program runs successfully; we only need to prepare an input folder to pass to the program.
2.1. Prepare local files
Create a folder named input in the current project directory, and create two files, file1 and file2, under it with the following content:
File1: Hello World Bye World
File2: Hello Hadoop Goodbye Hadoop
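The two files can also be created from the command line; a minimal sketch, assuming you run it from the project directory:

```shell
# Create the local input folder and the two sample files.
mkdir -p input
printf 'Hello World Bye World\n' > input/file1
printf 'Hello Hadoop Goodbye Hadoop\n' > input/file2
```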
2.2. Upload the folder input to the distributed file system
In a terminal where the Hadoop daemons have been started, cd to the Hadoop installation directory and run the following command:
bin/hadoop fs -put ../test/input
After the input folder is uploaded to the Hadoop file system, an input folder appears in the system. You can run the following command to view it:
bin/hadoop fs -ls
You can also check it in the DFS Locations view of the Eclipse plug-in:
3. Run the project
3.1. In the newly created project MapReduceSample, right-click WordCount.java and choose Run As -> Run Configurations.
3.2. In the pop-up Run Configurations dialog, select Java Application, right-click -> New, and a new configuration named WordCount will be created.
3.3. Configure the running parameters: click Arguments and, in Program arguments, enter the input folder you want to pass to the program and the folder where you want it to save the result, for example:
hdfs://localhost:9000/user/metaboy/input hdfs://localhost:9000/user/metaboy/output
The input here is the folder you just uploaded; enter the folder addresses appropriate for your setup.
4. Run the program
Click Run to run the program. After a while the job completes. Once it has finished, run the following command on the terminal:
bin/hadoop fs -ls
You can also use the hadoop eclipse plug-in to check whether the folder output is generated.
5. View results
Run the following command to view the generated file content:
bin/hadoop fs -cat output/*
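For the two sample files above, the expected result can be worked out before the job even runs. The sketch below (a plain-Java simulation, not the Hadoop job itself; the class name WordCountCheck is made up for illustration) applies the same tokenize-and-count logic to the two lines in memory:

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountCheck {

    // Same tokenize-and-count logic as the job, applied to in-memory lines.
    public static Map<String, Integer> count(String[] lines) {
        // TreeMap keeps keys sorted, mirroring the sorted job output.
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                String word = tokenizer.nextToken();
                Integer c = counts.get(word);
                counts.put(word, c == null ? 1 : c + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] lines = {
            "Hello World Bye World",      // content of file1
            "Hello Hadoop Goodbye Hadoop" // content of file2
        };
        // Print in the same key<TAB>value form as TextOutputFormat.
        for (Map.Entry<String, Integer> e : count(lines).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

So the output folder should contain Bye 1, Goodbye 1, Hadoop 2, Hello 2, World 2, one tab-separated pair per line.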
After running this program, you have basically taken your first step into the Hadoop family!
Original article: "The first Map/Reduce program". Thanks to the original author for sharing.