How the map and reduce methods run in the MapReduce framework

Source: Internet
Author: User

The map and reduce methods in a MapReduce program override the map method of the Mapper class and the reduce method of the Reducer class.

By default, the framework runs the map and reduce methods of a MapReduce program as follows:
To see exactly how the map or reduce method is invoked for each <key, value> pair, look at the Mapper class and the Reducer class in the org.apache.hadoop.mapreduce package.

Implementation mechanism: the run method of the Mapper class calls map in a loop, once for each input <key, value> pair. The run method of the Reducer class calls reduce in a loop, once for each key together with all the values for that key.
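The map-side loop can be sketched without any Hadoop dependency. The miniature below is only an illustration under stated assumptions: MiniMapper, TokenMapper and the list-based output are hypothetical stand-ins invented for this sketch, not Hadoop classes. It shows the pattern described above: run drives setup, one map call per pair, then cleanup, and a user class only overrides map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, dependency-free miniature of the Mapper life cycle (not the Hadoop API).
class MiniMapperDemo {

    // Stand-in for Mapper: run() drives setup, map (once per pair), and cleanup.
    static class MiniMapper<KI, VI, KO, VO> {
        protected void setup(List<Map.Entry<KO, VO>> out) { }

        // Default behaviour is the identity function, like the framework's default map.
        @SuppressWarnings("unchecked")
        protected void map(KI key, VI value, List<Map.Entry<KO, VO>> out) {
            out.add(Map.entry((KO) key, (VO) value));
        }

        protected void cleanup(List<Map.Entry<KO, VO>> out) { }

        public List<Map.Entry<KO, VO>> run(List<Map.Entry<KI, VI>> input) {
            List<Map.Entry<KO, VO>> out = new ArrayList<>();
            setup(out);
            for (Map.Entry<KI, VI> pair : input) {
                map(pair.getKey(), pair.getValue(), out);
            }
            cleanup(out);
            return out;
        }
    }

    // A user-defined mapper overrides only map(); the loop stays in run().
    static class TokenMapper extends MiniMapper<Integer, String, String, Integer> {
        @Override
        protected void map(Integer offset, String line,
                           List<Map.Entry<String, Integer>> out) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.add(Map.entry(word, 1));
                }
            }
        }
    }

    static List<Map.Entry<String, Integer>> demo() {
        List<Map.Entry<Integer, String>> input =
            List.of(Map.entry(0, "a b"), Map.entry(4, "a"));
        return new TokenMapper().run(input);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a=1, b=1, a=1]
    }
}
```

The subclass never writes the loop itself, which is why most applications only need to implement map.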

See the Hadoop 0.20.1 source code.

 

Mapper class:

package org.apache.hadoop.mapreduce;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {

  public class Context
      extends MapContext<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
    public Context(Configuration conf, TaskAttemptID taskid,
                   RecordReader<KEYIN, VALUEIN> reader,
                   RecordWriter<KEYOUT, VALUEOUT> writer,
                   OutputCommitter committer,
                   StatusReporter reporter,
                   InputSplit split) throws IOException, InterruptedException {
      super(conf, taskid, reader, writer, committer, reporter, split);
    }
  }

  /**
   * Called once at the beginning of the task.
   */
  protected void setup(Context context
                       ) throws IOException, InterruptedException {
    // NOTHING
  }

  /**
   * Called once for each key/value pair in the input split. Most applications
   * should override this, but the default is the identity function.
   */
  @SuppressWarnings("unchecked")
  protected void map(KEYIN key, VALUEIN value,
                     Context context) throws IOException, InterruptedException {
    context.write((KEYOUT) key, (VALUEOUT) value);
  }

  /**
   * Called once at the end of the task.
   */
  protected void cleanup(Context context
                         ) throws IOException, InterruptedException {
    // NOTHING
  }

  /**
   * Expert users can override this method for more complete control over the
   * execution of the Mapper.
   * @param context
   * @throws IOException
   */
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
      map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
  }

}
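The Javadoc on run says expert users can override it for more control over execution. As a rough illustration, the sketch below overrides run so that only the first few records are mapped. It is a dependency-free miniature: MiniMapper, FirstNMapper and the iterator-based input are hypothetical names made up for this example and are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical miniature (not the Hadoop API) showing why run() is overridable.
class CustomRunDemo {

    static class MiniMapper {
        protected void setup(List<String> out) { }

        protected void map(String record, List<String> out) {
            out.add(record); // identity by default
        }

        protected void cleanup(List<String> out) { }

        // Default driver: same shape as Mapper.run in the listing above.
        public void run(Iterator<String> records, List<String> out) {
            setup(out);
            while (records.hasNext()) {
                map(records.next(), out);
            }
            cleanup(out);
        }
    }

    // Overrides run() to stop after a fixed number of records (a sampling mapper).
    static class FirstNMapper extends MiniMapper {
        private final int limit;

        FirstNMapper(int limit) {
            this.limit = limit;
        }

        @Override
        public void run(Iterator<String> records, List<String> out) {
            setup(out);
            int seen = 0;
            while (seen < limit && records.hasNext()) {
                map(records.next(), out);
                seen++;
            }
            cleanup(out); // still called, so anything opened in setup is released
        }
    }

    static List<String> demo() {
        List<String> out = new ArrayList<>();
        new FirstNMapper(2).run(List.of("r1", "r2", "r3").iterator(), out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [r1, r2]
    }
}
```

An override like this should keep calling setup and cleanup, since user code may rely on them being invoked once per task.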

 

Reducer class:

package org.apache.hadoop.mapreduce;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.RawComparator;
import org.apache.hadoop.mapred.RawKeyValueIterator;

public class Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {

  public class Context
      extends ReduceContext<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
    public Context(Configuration conf, TaskAttemptID taskid,
                   RawKeyValueIterator input,
                   Counter inputCounter,
                   RecordWriter<KEYOUT, VALUEOUT> output,
                   OutputCommitter committer,
                   StatusReporter reporter,
                   RawComparator<KEYIN> comparator,
                   Class<KEYIN> keyClass,
                   Class<VALUEIN> valueClass
                   ) throws IOException, InterruptedException {
      super(conf, taskid, input, inputCounter, output, committer, reporter,
            comparator, keyClass, valueClass);
    }
  }

  /**
   * Called once at the start of the task.
   */
  protected void setup(Context context
                       ) throws IOException, InterruptedException {
    // NOTHING
  }

  /**
   * This method is called once for each key. Most applications will define
   * their reduce class by overriding this method. The default implementation
   * is an identity function.
   */
  @SuppressWarnings("unchecked")
  protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context
                        ) throws IOException, InterruptedException {
    for (VALUEIN value : values) {
      context.write((KEYOUT) key, (VALUEOUT) value);
    }
  }

  /**
   * Called once at the end of the task.
   */
  protected void cleanup(Context context
                         ) throws IOException, InterruptedException {
    // NOTHING
  }

  /**
   * Advanced application writers can use the
   * {@link #run(org.apache.hadoop.mapreduce.Reducer.Context)} method to
   * control how the reduce task works.
   */
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKey()) {
      reduce(context.getCurrentKey(), context.getValues(), context);
    }
    cleanup(context);
  }

}
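Unlike map, reduce is called once per key, with an Iterable over all values for that key. The miniature below sketches that behaviour without Hadoop. MiniReducer and SumReducer are hypothetical names for this illustration, and the sketch assumes the input is already sorted by key and copies each group into a list for simplicity, whereas the real framework streams the values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, dependency-free miniature of the Reducer life cycle (not the Hadoop API).
class MiniReducerDemo {

    static class MiniReducer {
        protected void setup(List<String> out) { }

        // Default behaviour is the identity function: one output per input value.
        protected void reduce(String key, Iterable<Integer> values, List<String> out) {
            for (Integer value : values) {
                out.add(key + "=" + value);
            }
        }

        protected void cleanup(List<String> out) { }

        // Assumes 'sorted' is ordered by key, so equal keys are adjacent.
        public void run(List<Map.Entry<String, Integer>> sorted, List<String> out) {
            setup(out);
            int i = 0;
            while (i < sorted.size()) { // plays the role of context.nextKey()
                String key = sorted.get(i).getKey();
                List<Integer> values = new ArrayList<>();
                while (i < sorted.size() && sorted.get(i).getKey().equals(key)) {
                    values.add(sorted.get(i).getValue());
                    i++;
                }
                reduce(key, values, out); // one call per distinct key
            }
            cleanup(out);
        }
    }

    // A user-defined reducer that sums the values of each key.
    static class SumReducer extends MiniReducer {
        @Override
        protected void reduce(String key, Iterable<Integer> values, List<String> out) {
            int sum = 0;
            for (int value : values) {
                sum += value;
            }
            out.add(key + "=" + sum);
        }
    }

    static List<String> demo() {
        List<Map.Entry<String, Integer>> sorted = List.of(
            Map.entry("a", 1), Map.entry("a", 1), Map.entry("b", 1));
        List<String> out = new ArrayList<>();
        new SumReducer().run(sorted, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a=2, b=1]
    }
}
```

Three input pairs produce only two reduce calls here, one for key a and one for key b.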
