JobTracker and TaskTracker
JobTracker corresponds to NameNode: both are master services.
TaskTracker corresponds to DataNode: both are worker services.
NameNode and DataNode are for data storage (HDFS).
JobTracker and TaskTracker are for MapReduce execution.
MapReduce has several main concepts. As a whole, the execution of a MapReduce job can be divided among three roles:
JobClient, JobTracker, and TaskTracker.
1. JobClient runs on the client side. Through the JobClient class, it packages the application and its configured parameters into a jar file and stores that file in HDFS. It then submits the path to JobTracker, which creates the individual tasks (MapTask and ReduceTask) and distributes them to the TaskTracker services for execution.
2. JobTracker is a master service. Once it is started, it receives jobs, schedules each task of a job to run on a TaskTracker, and monitors the tasks; if it finds a failed task, it runs that task again. Generally, JobTracker should be deployed on a separate machine.
3. TaskTracker is a slave service running on multiple nodes. It actively communicates with JobTracker, receives tasks, and executes each task directly.
Each TaskTracker should run on an HDFS DataNode, so that tasks can process data stored on the same node.
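
The interaction between the three roles above can be sketched as a small simulation. This is not Hadoop code and uses no Hadoop API: the class names only mirror the roles, and every method name (submit_job, heartbeat, report, tick), the slot counts, the retry limit, and the fail_once switch are hypothetical choices made for illustration. Staging the jar in HDFS is not modeled. The sketch assumes reduce tasks are handed out only after all map tasks have finished, which is a simplification.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    kind: str          # "map" or "reduce"
    attempts: int = 0


class JobTracker:
    """Toy master: creates tasks, assigns them on heartbeat, retries failures."""

    def __init__(self, max_attempts=4):  # retry limit is an assumed toy value
        self.pending_maps = deque()
        self.pending_reduces = deque()
        self.running_maps = 0
        self.completed = []
        self.failed = []
        self.max_attempts = max_attempts

    def submit_job(self, job_id, num_maps, num_reduces):
        # Stands in for the client submitting a job; the jar itself is not modeled.
        for i in range(num_maps):
            self.pending_maps.append(Task(f"{job_id}_m_{i}", "map"))
        for i in range(num_reduces):
            self.pending_reduces.append(Task(f"{job_id}_r_{i}", "reduce"))

    def heartbeat(self, tracker_name, free_slots):
        # The TaskTracker initiates contact; the reply carries its assignments.
        assigned = []
        while len(assigned) < free_slots:
            if self.pending_maps:
                task = self.pending_maps.popleft()
                self.running_maps += 1
            elif self.running_maps == 0 and self.pending_reduces:
                task = self.pending_reduces.popleft()
            else:
                break
            assigned.append(task)
        return assigned

    def report(self, task, succeeded):
        task.attempts += 1
        if task.kind == "map":
            self.running_maps -= 1
        if succeeded:
            self.completed.append(task.task_id)
        elif task.attempts < self.max_attempts:
            # A failed task goes back in the queue to be run again.
            queue = self.pending_maps if task.kind == "map" else self.pending_reduces
            queue.append(task)
        else:
            self.failed.append(task.task_id)

    def is_done(self):
        return (not self.pending_maps and not self.pending_reduces
                and self.running_maps == 0)


class TaskTracker:
    """Toy worker: pulls tasks from the JobTracker and reports the outcome."""

    def __init__(self, name, slots, fail_once=()):
        self.name = name
        self.slots = slots
        self.fail_once = set(fail_once)  # task ids that fail the first time here

    def tick(self, job_tracker):
        for task in job_tracker.heartbeat(self.name, self.slots):
            ok = task.task_id not in self.fail_once
            self.fail_once.discard(task.task_id)
            job_tracker.report(task, ok)


def run_job(job_tracker, trackers, max_rounds=100):
    # Each round, every TaskTracker sends one heartbeat.
    for _ in range(max_rounds):
        if job_tracker.is_done():
            return True
        for tracker in trackers:
            tracker.tick(job_tracker)
    return job_tracker.is_done()


jt = JobTracker()
jt.submit_job("job_0001", num_maps=3, num_reduces=1)
trackers = [
    TaskTracker("tt1", slots=2, fail_once={"job_0001_m_1"}),
    TaskTracker("tt2", slots=2),
]
finished = run_job(jt, trackers)
print(finished, jt.completed, jt.failed)
```

In this run, the map task that fails on tt1 is put back in the queue and picked up by tt2 on its next heartbeat, and the reduce task is assigned only once no map task is pending or running. Note that the JobTracker never pushes work: a task moves only when a TaskTracker asks for it.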