Hadoop MapReduce unit test

Source: Internet
Author: User
Tags: iterable, hadoop, mapreduce
Map and reduce functions are usually unit tested against small datasets. The Mockito framework can be used to mock the OutputCollector object (the old API, Hadoop versions earlier than 0.20.0) or the Context object (the new API, Hadoop 0.20.0 and later).

The following is a simple WordCount example that uses the new API.

Before you start, add the following libraries to the project's classpath:

1. All jar packages under the Hadoop installation directory and its lib directory

2. JUnit 4

3. Mockito


Map function:

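    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();   // the content of this input line
            String[] words = line.split(";"); // parse the words in the line
            for (String w : words) {
                word.set(w);
                context.write(word, one);     // emit (word, 1)
            }
        }
    }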

Reduce function:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            // values holds the set of counts that share the same key
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

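Test code:

    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.verify;

    import java.io.IOException;
    import java.util.Arrays;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.junit.Test;

    // Must live in the same package as the mapper and reducer,
    // because map() and reduce() are protected.
    public class WordCountMapperReducerTest {

        @Test
        @SuppressWarnings("unchecked")
        public void processValidRecord() throws IOException, InterruptedException {
            WordCountMapper mapper = new WordCountMapper();
            Text value = new Text("hello");
            Mapper<LongWritable, Text, Text, IntWritable>.Context context =
                    mock(Mapper.Context.class);
            mapper.map(null, value, context);
            // the mapper should have written {"hello", 1}
            verify(context).write(new Text("hello"), new IntWritable(1));
        }

        @Test
        @SuppressWarnings("unchecked")
        public void processResult() throws IOException, InterruptedException {
            WordCountReducer reducer = new WordCountReducer();
            Text key = new Text("hello");
            // input: {"hello", [1, 1, 2]}
            Iterable<IntWritable> values = Arrays.asList(
                    new IntWritable(1), new IntWritable(1), new IntWritable(2));
            Reducer<Text, IntWritable, Text, IntWritable>.Context context =
                    mock(Reducer.Context.class);
            reducer.reduce(key, values, context);
            // expected output: {"hello", 4}
            verify(context).write(key, new IntWritable(4));
        }
    }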


Specifically, the first test feeds one line of data, "hello", to the map function.

The map function processes the line and outputs {"hello", 1}.

The second test passes {"hello", [1, 1, 2]} to the reduce function, which is the grouped form that map output takes by the time it reaches a reducer. The reduce function sums the values that share the same key and outputs {"hello", 4}.
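The data flow described above can also be traced without Hadoop or Mockito on the classpath. The following is only a minimal sketch under stated assumptions: RecordingContext and WordCountSketch are hypothetical names that do not appear in the original example, and RecordingContext is a hand-written stand-in that records each write call, which is roughly the role the Mockito mock of Context plays in the tests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Hadoop's Context (an assumption, not a Hadoop class):
// it only records every key/value pair that is written to it.
class RecordingContext {
    final List<String> written = new ArrayList<>();

    void write(String key, int value) {
        written.add(key + "=" + value);
    }
}

// Hypothetical plain-Java mirror of the WordCount map and reduce logic.
class WordCountSketch {
    // Mirrors WordCountMapper.map: split the line on ";" and emit (word, 1).
    static void map(String line, RecordingContext context) {
        for (String w : line.split(";")) {
            context.write(w, 1);
        }
    }

    // Mirrors WordCountReducer.reduce: sum all values that share one key.
    static void reduce(String key, Iterable<Integer> values, RecordingContext context) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        context.write(key, sum);
    }

    public static void main(String[] args) {
        RecordingContext mapOut = new RecordingContext();
        map("hello", mapOut);
        System.out.println(mapOut.written);    // prints [hello=1]

        RecordingContext reduceOut = new RecordingContext();
        reduce("hello", Arrays.asList(1, 1, 2), reduceOut);
        System.out.println(reduceOut.written); // prints [hello=4]
    }
}
```

A recording stand-in like this has to be written and maintained by hand, whereas Mockito generates the equivalent from the real Context class, which is presumably why the example relies on mock and verify.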


