Running Test Code on a Hadoop Cluster (the Weather Data Example from Hadoop: The Definitive Guide)


Today I got the Weather Data sample code from Hadoop: The Definitive Guide running on a Hadoop cluster, and I am recording the steps here.

Beforehand I searched Baidu and Google but could not find a step-by-step description of how to run a MapReduce job on a cluster. After some painful, headless-fly-style groping it worked, and I'm in a good mood...

1 Prepare the weather data (a simplified version of the data in the book: characters 5-9 hold the year, characters 15-19 the temperature)

aaaaa1990aaaaaa0039a
bbbbb1991bbbbbb0040a
ccccc1992cccccc0040c
ddddd1993dddddd0043d
eeeee1994eeeeee0041e
aaaaa1990aaaaaa0031a
bbbbb1991bbbbbb0020a
ccccc1992cccccc0030c
ddddd1993dddddd0033d
eeeee1994eeeeee0031e
aaaaa1990aaaaaa0041a
bbbbb1991bbbbbb0040a
ccccc1992cccccc0040c
ddddd1993dddddd0043d
eeeee1994eeeeee0041e
aaaaa1990aaaaaa0044a
bbbbb1991bbbbbb0045a
ccccc1992cccccc0041c
ddddd1993dddddd0023d
eeeee1994eeeeee0041e
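Each record above is fixed-width, so the year and temperature can be pulled out with plain substring calls. A minimal plain-Java sketch of just the parsing (no Hadoop involved; the index positions are taken from the sample data above):

```java
public class ParseRecordDemo {
    // Characters 5-9 of each record are the year.
    static String year(String line) {
        return line.substring(5, 9);
    }

    // Characters 15-19 are the zero-padded temperature.
    static int temperature(String line) {
        return Integer.parseInt(line.substring(15, 19));
    }

    public static void main(String[] args) {
        String line = "aaaaa1990aaaaaa0039a";  // first record of the sample data
        System.out.println(year(line) + " " + temperature(line));  // prints "1990 39"
    }
}
```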

2 Write the map and reduce functions and the driver (Job)

Keeping it simple, the code is as follows:

package hadoop.test;

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

    static class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final int MISSING = 9999;

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            // The substring positions match the simplified weather data prepared above.
            String year = line.substring(5, 9);
            int airTemperature = Integer.parseInt(line.substring(15, 19));

            if (airTemperature != MISSING) {
                context.write(new Text(year), new IntWritable(airTemperature));
            }
        }
    }

    static class MaxTemperatureReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int maxValue = Integer.MIN_VALUE;
            for (IntWritable value : values) {
                maxValue = Math.max(maxValue, value.get());
            }
            context.write(key, new IntWritable(maxValue));
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }

        try {
            Job job = new Job();
            job.setJarByClass(MaxTemperature.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            job.setMapperClass(MaxTemperatureMapper.class);
            job.setReducerClass(MaxTemperatureReducer.class);

            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            System.exit(job.waitForCompletion(true) ? 0 : 1);

        } catch (IOException e) {
            e.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

3 Package the code from step 2 into HadoopTest.jar and put it in a directory such as /home/hadoop/documents/

Then export HADOOP_CLASSPATH=/home/hadoop/documents/

(Choose a Main-Class when packaging; Eclipse's export dialog has a Main-Class option. If you do not set one, there will be an error when executing.

Otherwise, the main class name, including its package path, must be specified after ***.jar when running the hadoop jar command.

For example: hadoop jar /home/hadoop/documents/HadoopTest.jar hadoop.test.MaxTemperature /user/hadoop/temperature output)

4 Upload the data to be analyzed to HDFS

hadoop dfs -put /home/hadoop/documents/temperature ./temperature

5 Start execution

hadoop jar /home/hadoop/documents/HadoopTest.jar /user/hadoop/temperature output

This is not exactly the same as the command in the book, but the book refers to running in local mode. I also do not yet know what export HADOOP_CLASSPATH=/home/hadoop/documents/ is for; running hadoop jar HadoopTest.jar /user/hadoop/temperature output works fine as well. Exactly why, I will keep exploring; that's all for now.

Here HadoopTest.jar is local; the data file to be analyzed, temperature, is on HDFS; the generated output is also on HDFS, and output is a folder.

hadoop@hadoop1:~$ hadoop dfs -cat ./output/part-r-00000
1990 44
1991 45
1992 41
1993 43
1994 41
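As a sanity check, the per-year maxima can be recomputed from the sample data with plain Java, mirroring the reducer's Math.max loop (a local sketch with no Hadoop types; the class name is made up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MaxTemperatureCheck {
    // Recompute the maximum temperature per year from the raw records,
    // the same way the mapper parses and the reducer takes the max.
    static Map<String, Integer> maxByYear(String[] lines) {
        Map<String, Integer> max = new LinkedHashMap<>();
        for (String line : lines) {
            String year = line.substring(5, 9);
            int temp = Integer.parseInt(line.substring(15, 19));
            max.merge(year, temp, Math::max);  // keep the larger value per year
        }
        return max;
    }

    public static void main(String[] args) {
        String[] sample = {
            "aaaaa1990aaaaaa0039a", "bbbbb1991bbbbbb0040a", "ccccc1992cccccc0040c",
            "ddddd1993dddddd0043d", "eeeee1994eeeeee0041e", "aaaaa1990aaaaaa0031a",
            "bbbbb1991bbbbbb0020a", "ccccc1992cccccc0030c", "ddddd1993dddddd0033d",
            "eeeee1994eeeeee0031e", "aaaaa1990aaaaaa0041a", "bbbbb1991bbbbbb0040a",
            "ccccc1992cccccc0040c", "ddddd1993dddddd0043d", "eeeee1994eeeeee0041e",
            "aaaaa1990aaaaaa0044a", "bbbbb1991bbbbbb0045a", "ccccc1992cccccc0041c",
            "ddddd1993dddddd0023d", "eeeee1994eeeeee0041e"
        };
        // Prints one "year<TAB>max" line per year, matching the job output above.
        maxByYear(sample).forEach((y, t) -> System.out.println(y + "\t" + t));
    }
}
```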
