Architecture of MapReduce
Hadoop MapReduce is an easy-to-use software framework that can be run on a large cluster of thousands of commercial machines, based on the applications it writes out.
And in a reliable fault-tolerant way in parallel processing of the upper terabytes of data sets.
Programs implemented with the MapReduce architecture enable parallelization in a large number of general-configured computers.
The MapReduce system only cares about how the data is segmented, how it is dispatched, and how the computer in the cluster handles the error.
Manages the communication between computers.
The MapReduce framework consists of a separate master jobtracker and slave tasktracker on the cluster nodes.
Master is responsible for scheduling all tasks in a job and distributing these tasks on different slave.
Master monitors the execution of these tasks on the slave node and re-executes the failed task, while Slave is only responsible for performing the tasks assigned by master.
1. MapReduce is a programming mode
2, Map/reduce
Architecture of the architecture of Hadoop