The roles of setup, run, cleanup, and Context in Hadoop execution
1. Introduction
1) setup(): the MapReduce framework calls this method exactly once per task, before any call to map(), so that the variables or resources the task needs are initialized in one place. If that initialization were placed in map() instead, the mapper would repeat it for every input line it parses, which duplicates work and makes the program inefficient.
2) map() or reduce(): the mapper or reducer logic itself, called once for each input record.
3) cleanup(): the MapReduce framework also calls this method exactly once per task, after the map work is finished, to release the related variables or resources. If the release were placed in map(), the mapper would free the resources after processing each line of text and would have to initialize them again before parsing the next line, which again duplicates work and makes the program inefficient.
4) run(): the method that starts the task and drives the other methods.
5) Context: the context in which a MapReduce task runs; it holds the information about the whole task.
The context serves as a bridge between the functions called during map and reduce execution, similar to the Session and Application objects in Java Web development.
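To make the bridge idea concrete, here is a minimal sketch in plain Java. MiniContext, ThresholdMapper, ContextDemo and the key demo.min.length are invented for this illustration and are not Hadoop classes; the real Context exposes the job configuration through getConfiguration() and collects output through write(), and this sketch only imitates those two names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, much-simplified stand-in for Hadoop's Context (not the real class).
class MiniContext {
    private final Map<String, String> configuration = new HashMap<>();
    private final List<String> output = new ArrayList<>();

    Map<String, String> getConfiguration() { return configuration; }

    // Collects one output pair as a tab-separated line.
    void write(String key, int value) { output.add(key + "\t" + value); }

    List<String> getOutput() { return output; }
}

// Hypothetical mapper that shares state with the driver only through the context.
class ThresholdMapper {
    private int minLength; // read once from the context in setup()

    void setup(MiniContext context) {
        // "demo.min.length" is an invented key for this sketch.
        String raw = context.getConfiguration().getOrDefault("demo.min.length", "0");
        minLength = Integer.parseInt(raw);
    }

    void map(String line, MiniContext context) {
        // Emit only lines that are at least minLength characters long.
        if (line.length() >= minLength) {
            context.write(line, line.length());
        }
    }
}

class ContextDemo {
    // The driver puts a setting into the context; the mapper reads it and writes results back.
    static List<String> run(List<String> lines, int minLength) {
        MiniContext context = new MiniContext();
        context.getConfiguration().put("demo.min.length", String.valueOf(minLength));
        ThresholdMapper mapper = new ThresholdMapper();
        mapper.setup(context);
        for (String line : lines) {
            mapper.map(line, context);
        }
        return context.getOutput();
    }
}
```

The driver and the mapper never call each other's fields directly; everything they exchange travels through the one context object, which is the role the text describes.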
Note: It is recommended to put resource initialization in setup() and resource release in cleanup().
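The cost difference behind this recommendation can be sketched by counting initializations. The code below is plain Java, not Hadoop code: ExpensiveResource and InitPlacementDemo are hypothetical names, and the "resource" is only a counter standing in for something costly such as a lookup table or a connection.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a resource that is costly to create.
class ExpensiveResource {
    static int timesOpened = 0;              // counts how often the resource is created

    ExpensiveResource() { timesOpened++; }

    int lengthOf(String line) { return line.length(); }

    void close() { /* a real resource would be released here */ }
}

class InitPlacementDemo {
    // Anti-pattern: the resource is created and released for every input line.
    static int initInsideMap(List<String> lines) {
        ExpensiveResource.timesOpened = 0;
        for (String line : lines) {
            ExpensiveResource r = new ExpensiveResource(); // repeated per record
            r.lengthOf(line);
            r.close();
        }
        return ExpensiveResource.timesOpened;
    }

    // Recommended shape: create once (setup), reuse per line (map), release once (cleanup).
    static int initInsideSetup(List<String> lines) {
        ExpensiveResource.timesOpened = 0;
        ExpensiveResource r = new ExpensiveResource();     // setup
        for (String line : lines) {
            r.lengthOf(line);                              // map
        }
        r.close();                                         // cleanup
        return ExpensiveResource.timesOpened;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "bb", "ccc");
        System.out.println(initInsideMap(lines));   // prints 3
        System.out.println(initInsideSetup(lines)); // prints 1
    }
}
```

With three input lines the first variant creates the resource three times and the second only once; on a real input split with millions of lines the gap grows in proportion.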
2. Execution order
setup() ----> map() or reduce() ----> cleanup()
|                                             |
+------------------- run() -------------------+
Explanation: setup() usually does the preparatory work before the map function is executed, and map() is the main data-processing function.
cleanup() does the tidying-up after all map() calls are done, so it acts much like a finally clause.
Let's take a look at the run method:
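public void run(Context context) throws IOException, InterruptedException {
    // Called once, before any record is processed.
    setup(context);
    // Called once per input key/value pair.
    while (context.nextKeyValue()) {
        map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    // Called once, after the last record.
    cleanup(context);
}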
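The call order in the run method can be checked with a small simulation. This is a sketch in plain Java rather than Hadoop code: MiniTask and LoggingTask are hypothetical names, records are plain strings, and the try/finally is an assumption added here to make the comparison with a finally clause literal (the run method shown above does not contain it).

```java
import java.util.ArrayList;
import java.util.List;

// Simplified template-method skeleton that mirrors the shape of run(); not Hadoop's Mapper.
abstract class MiniTask {
    protected final List<String> log = new ArrayList<>();

    protected void setup() { log.add("setup"); }

    protected abstract void map(String record);

    protected void cleanup() { log.add("cleanup"); }

    // Same order as the run method: setup once, map per record, cleanup once.
    public final List<String> run(List<String> records) {
        setup();
        try {
            for (String record : records) {
                map(record);
            }
        } finally {
            cleanup(); // assumption of this sketch: runs even if map() throws
        }
        return log;
    }
}

// Hypothetical subclass that only records which method was called.
class LoggingTask extends MiniTask {
    @Override
    protected void map(String record) { log.add("map:" + record); }
}
```

Calling new LoggingTask().run(Arrays.asList("a", "b")) yields the log setup, map:a, map:b, cleanup, which is the same sequence as in the execution-order diagram.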