In general, we unit test the map and reduce functions we have written against small datasets. The Mockito framework can be used to mock the OutputCollector object (Hadoop versions earlier than 0.20.0) or the Context object (versions 0.20.0 and later).
The following is a simple WordCount example (using the new API).
Before you start, you need to add the following to the classpath:
1. All JAR files under the Hadoop installation directory and its lib directory.
2. JUnit 4
3. Mockito
Map function:
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String[] words = line.split(" "); // parse the line into words
        for (String w : words) {
            word.set(w);
            context.write(word, one); // emit (word, 1)
        }
    }
}
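Stripped of the Hadoop types, the map logic above amounts to splitting a line and emitting a (word, 1) pair for each token. A minimal plain-Java sketch of that behavior (the emitted list stands in for context.write and is an assumption of this sketch, not part of the Hadoop API):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MapSketch {
    // Mirrors WordCountMapper.map: split the line and emit (word, 1) per token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String w : line.split(" ")) {
            emitted.add(new SimpleEntry<>(w, 1)); // stands in for context.write(word, one)
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(map("hello hadoop hello"));
        // → [hello=1, hadoop=1, hello=1]
    }
}
```

Note that the mapper emits one pair per occurrence; it does not aggregate. Aggregation is the reducer's job.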
Reduce function:
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        Iterator<IntWritable> iterator = values.iterator(); // the set of values for the same key
        while (iterator.hasNext()) {
            sum += iterator.next().get();
        }
        context.write(key, new IntWritable(sum));
    }
}
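The reduce side is just a sum over one key's values. The same logic in plain Java, assuming the values arrive as an Iterable of plain ints (a stand-in for Iterable<IntWritable>):

```java
import java.util.Arrays;

public class ReduceSketch {
    // Mirrors WordCountReducer.reduce: add up all values seen for one key.
    static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(reduce(Arrays.asList(1, 1, 2))); // prints 4
    }
}
```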
Test code:
public class WordCountMapperReducerTest {

    @Test
    public void processValidRecord() throws IOException, InterruptedException {
        WordCountMapper mapper = new WordCountMapper();
        Text value = new Text("hello");
        org.apache.hadoop.mapreduce.Mapper.Context context = mock(Context.class);
        mapper.map(null, value, context);
        verify(context).write(new Text("hello"), new IntWritable(1));
    }

    @Test
    public void processResult() throws IOException, InterruptedException {
        WordCountReducer reducer = new WordCountReducer();
        Text key = new Text("hello");
        // {"hello", [1, 1, 2]}
        Iterable values = Arrays.asList(
                new IntWritable(1), new IntWritable(1), new IntWritable(2));
        org.apache.hadoop.mapreduce.Reducer.Context context =
                mock(org.apache.hadoop.mapreduce.Reducer.Context.class);
        reducer.reduce(key, values, context);
        verify(context).write(key, new IntWritable(4)); // {"hello", 4}
    }
}
Specifically, a single record, the line "hello", is fed to the map function.
The map function processes it and outputs {"hello", 1}.
The reduce function takes the map output, sums the values for each key, and writes out the result.
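Putting the two steps together, the whole job can be traced end to end in plain Java: map each line to (word, 1) pairs, group the pairs by key (the part Hadoop's shuffle does between map and reduce), then reduce each group to a sum. The class and method names below are illustrative only, not part of the Hadoop API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountTrace {
    // Map + shuffle: emit (word, 1) per token and group the 1s under each word.
    static Map<String, List<Integer>> mapAndShuffle(String line) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String w : line.split(" ")) {
            grouped.computeIfAbsent(w, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Reduce: collapse each key's value list to its sum.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) {
                sum += v;
            }
            counts.put(e.getKey(), sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(reduce(mapAndShuffle("hello hadoop hello")));
        // → {hadoop=1, hello=2}
    }
}
```

The unit tests above verify exactly these two stages in isolation, which is why mocking the Context is enough: each function's contract is just "given this input, write these pairs".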
Original article: Hadoop MapReduce unit test.