Today I ran the weather data sample code from Hadoop: The Definitive Guide on my Hadoop cluster, and I'm recording the steps here.
Beforehand, neither Baidu nor Google turned up a concrete, step-by-step description of how to run a map-reduce job on a cluster. After some painful, headless-fly-style groping around, it worked. Good mood...
1 Prepare the weather data (a simplified version of the data in the Definitive Guide: characters 5-9 hold the year, 15-19 the temperature; a small parsing sketch follows the sample data)
aaaaa1990aaaaaa0039a
bbbbb1991bbbbbb0040a
ccccc1992cccccc0040c
ddddd1993dddddd0043d
eeeee1994eeeeee0041e
aaaaa1990aaaaaa0031a
bbbbb1991bbbbbb0020a
ccccc1992cccccc0030c
ddddd1993dddddd0033d
eeeee1994eeeeee0031e
aaaaa1990aaaaaa0041a
bbbbb1991bbbbbb0040a
ccccc1992cccccc0040c
ddddd1993dddddd0043d
eeeee1994eeeeee0041e
aaaaa1990aaaaaa0044a
bbbbb1991bbbbbb0045a
ccccc1992cccccc0041c
ddddd1993dddddd0023d
eeeee1994eeeeee0041e
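To check that these offsets line up with the data, here is a minimal standalone sketch (my own illustration, not part of the job) that parses one sample line the same way the mapper will:

public class LineFormatCheck {
    public static void main(String[] args) {
        String line = "aaaaa1990aaaaaa0039a";                 // first sample line from above
        String year = line.substring(5, 9);                   // characters 5-9: "1990"
        int temp = Integer.parseInt(line.substring(15, 19));  // characters 15-19: 39
        System.out.println(year + " " + temp);                // prints: 1990 39
    }
}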
2 Write the map and reduce functions and the job driver (Job)
Simply put, it looks like this:
package hadoop.test;

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

    static class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final int MISSING = 9999;

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            // the prepared data is the simplified weather format: characters 5-9 are the year
            String year = line.substring(5, 9);
            // characters 15-19 are the temperature
            int airTemperature = Integer.parseInt(line.substring(15, 19));
            if (airTemperature != MISSING) {
                context.write(new Text(year), new IntWritable(airTemperature));
            }
        }
    }

    static class MaxTemperatureReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int maxValue = Integer.MIN_VALUE;
            for (IntWritable value : values) {
                maxValue = Math.max(maxValue, value.get());
            }
            context.write(key, new IntWritable(maxValue));
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }
        try {
            Job job = new Job();
            job.setJarByClass(MaxTemperature.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            job.setMapperClass(MaxTemperatureMapper.class);
            job.setReducerClass(MaxTemperatureReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
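Before submitting to the cluster, the mapper can be sanity-checked locally. A minimal sketch using MRUnit's MapDriver (an assumption on my part: the mrunit jar must be on the classpath; it is not required by the job itself):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;

public class MaxTemperatureMapperTest {
    public static void main(String[] args) throws Exception {
        // feed one sample line to the mapper and check the (year, temperature) pair it emits
        new MapDriver<LongWritable, Text, Text, IntWritable>()
                .withMapper(new MaxTemperature.MaxTemperatureMapper())
                .withInput(new LongWritable(0), new Text("aaaaa1990aaaaaa0039a"))
                .withOutput(new Text("1990"), new IntWritable(39))
                .runTest();
    }
}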
3 Package the code from step 2 into HadoopTest.jar and place it in a directory such as /home/hadoop/documents/
Then export HADOOP_CLASSPATH=/home/hadoop/documents/
(Choose a MainClass when packaging; if you don't, there seems to be an error at execution time. Eclipse's export dialog has a MainClass option.
Otherwise the MainClass name, including its package path, must be given after ***.jar when running the hadoop jar command,
for example: hadoop jar /home/hadoop/documents/HadoopTest.jar hadoop.test.MaxTemperature /user/hadoop/temperature output
)
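If you prefer packaging from the command line instead of Eclipse, something like this should work (a sketch assuming the compiled classes are under bin/ and manifest.txt contains the single line "Main-Class: hadoop.test.MaxTemperature"):
jar cvfm HadoopTest.jar manifest.txt -C bin/ .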
4 Put the data to be analyzed onto HDFS:
hadoop dfs -put /home/hadoop/documents/temperature ./temperature
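To confirm the upload, you can list it first:
hadoop dfs -ls ./temperature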
5 Start execution:
hadoop jar /home/hadoop/documents/HadoopTest.jar /user/hadoop/temperature output
This is not exactly the same as the command in the book, but the book runs it in local mode. I also don't yet know what export HADOOP_CLASSPATH=/home/hadoop/documents/ is for; without it, executing hadoop jar HadoopTest.jar /user/hadoop/temperature output didn't work. Exactly why, I'll keep exploring; that's it for now.
Here HadoopTest.jar is local, the data file to be analyzed (temperature) is on HDFS, the generated output is also on HDFS, and output is a folder.
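You can also list the output folder first; the reducer's result file here is part-r-00000:
hadoop dfs -ls ./output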
hadoop@hadoop1:~$ hadoop dfs -cat ./output/part-r-00000
1990 44
1991 45
1992 41
1993 43
1994 41