The running process of MapReduce
Basic concepts:
- Job & task: to complete a job, it is divided into a number of tasks, and tasks are divided into MapTasks and ReduceTasks
- JobTracker
- TaskTracker
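The split of a job into map tasks and reduce tasks can be sketched in plain Python. This is only a toy word count, not Hadoop code: the names `map_task`, `reduce_task` and `run_job` are hypothetical, and the grouping step between the two phases (usually called the shuffle) is an assumption added here so the example runs end to end.

```python
from collections import defaultdict

def map_task(split):
    """One map task: emit (word, 1) for every word in its input split."""
    return [(word, 1) for line in split for word in line.split()]

def reduce_task(key, values):
    """One reduce call: sum the counts collected for a single key."""
    return key, sum(values)

def run_job(splits):
    """Run a toy word-count job: one map task per split, then reduce per key."""
    # Map phase: each input split is handled by its own map task.
    intermediate = []
    for split in splits:
        intermediate.extend(map_task(split))
    # Grouping step (assumed here): collect all values that share a key.
    grouped = defaultdict(list)
    for key, value in intermediate:
        grouped[key].append(value)
    # Reduce phase: one reduce call per distinct key.
    return dict(reduce_task(k, v) for k, v in grouped.items())

result = run_job([["a b a"], ["b c"]])
print(result)  # {'a': 2, 'b': 2, 'c': 1}
```

Here the job has two map tasks (one per split) and the reduce work is done once per distinct word.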
Hadoop MapReduce Architecture
The role of JobTracker
- Schedule jobs
- Assign tasks and monitor task execution progress
- Monitor the status of TaskTrackers
The role of TaskTracker
- Perform tasks
- Report task status
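The division of roles listed above can be sketched as a small simulation. This is a toy model under stated assumptions, not the Hadoop API: the classes, the `heartbeat` method and the `simulate` helper are hypothetical names, every task finishes instantly, and the ordering between map and reduce tasks is ignored.

```python
from collections import deque

class JobTracker:
    """Toy scheduler: assigns tasks and tracks task and tracker status."""
    def __init__(self, tasks):
        self.pending = deque(tasks)  # tasks not yet assigned
        self.status = {}             # task name -> "running" or "done"
        self.last_seen = {}          # tracker name -> tick of its last report

    def heartbeat(self, tracker, finished, tick):
        """Record a tracker's report and hand back a new task, or None."""
        self.last_seen[tracker] = tick      # monitor TaskTracker status
        for task in finished:               # monitor task progress
            self.status[task] = "done"
        if self.pending:                    # assign the next task
            task = self.pending.popleft()
            self.status[task] = "running"
            return task
        return None

class TaskTracker:
    """Toy worker: performs tasks and reports their status."""
    def __init__(self, name):
        self.name = name
        self.finished = []

    def step(self, job_tracker, tick):
        task = job_tracker.heartbeat(self.name, self.finished, tick)
        self.finished = []
        if task is not None:
            # "Perform" the task; in this model it completes immediately
            # and is reported on the next heartbeat.
            self.finished.append(task)
        return task

def simulate(tasks, tracker_names):
    """Run heartbeat rounds until every task has been reported as done."""
    jt = JobTracker(tasks)
    trackers = [TaskTracker(name) for name in tracker_names]
    tick = 0
    while jt.pending or any(s != "done" for s in jt.status.values()):
        tick += 1
        for tracker in trackers:
            tracker.step(jt, tick)
    return jt

jt = simulate(["map-0", "map-1", "reduce-0"], ["tt1", "tt2"])
print(jt.status)
```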
MapReduce Job Execution Process
The fault-tolerance mechanism of MapReduce
Repeated execution
An error can be caused by a hardware problem or by a problem with the data. A failed task is first executed again; if it still fails after being repeated 4 times, it is given up.
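The retry rule can be sketched as follows. This is a minimal illustration, assuming a task is just a Python callable; `run_with_retries`, `TaskFailed` and `flaky` are hypothetical names, and the limit of 4 is the number given in the text.

```python
MAX_ATTEMPTS = 4  # the limit stated in the text above

class TaskFailed(Exception):
    """Raised when a task still fails after the last allowed attempt."""

def run_with_retries(task, max_attempts=MAX_ATTEMPTS):
    """Call task(attempt) until it succeeds or max_attempts is reached."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task(attempt)
        except Exception as err:
            last_error = err  # remember the failure and try again
    raise TaskFailed(f"gave up after {max_attempts} attempts") from last_error

def flaky(attempt):
    """Simulated task that fails on its first two attempts."""
    if attempt < 3:
        raise RuntimeError("simulated failure")
    return f"ok on attempt {attempt}"

print(run_with_retries(flaky))  # ok on attempt 3
```

A task that fails on every attempt ends with `TaskFailed`, which corresponds to giving up.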
Speculative execution
While the map side is computing, a particular node may turn out to be much slower than the others. The JobTracker may then suspect that the node has a problem and start the same task on an additional TaskTracker. Whichever of the two nodes finishes first supplies the result, and the other node's computation is discarded.
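The idea can be sketched with two threads standing in for two nodes. This is an illustration under stated assumptions rather than Hadoop's scheduler: `attempt`, `run_speculatively` and the `slow_after` threshold are hypothetical, slowness is simulated with `time.sleep`, and the losing attempt is simply ignored instead of being killed.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def attempt(node, delay, data):
    """Simulate one attempt of a task on a node that needs `delay` seconds."""
    time.sleep(delay)
    return node, sum(data)

def run_speculatively(data, primary, backup, slow_after):
    """Start the task on `primary`; if it has not finished after
    `slow_after` seconds, start a second attempt on `backup` and
    return the result of whichever attempt finishes first."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(attempt, primary[0], primary[1], data)]
        done, _ = wait(futures, timeout=slow_after)
        if not done:
            # The primary looks slow: launch a backup attempt of the same task.
            futures.append(pool.submit(attempt, backup[0], backup[1], data))
            done, _ = wait(futures, return_when=FIRST_COMPLETED)
        # Keep the first finished attempt; the other result is discarded.
        return next(iter(done)).result()

winner = run_speculatively(
    [1, 2, 3],
    primary=("slow-node", 0.5),
    backup=("backup-node", 0.05),
    slow_after=0.1,
)
print(winner)
```

With these delays the backup attempt finishes well before the slow one, so its result is the one kept.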