Contents
- References
- I. MapReduce Overview
- II. How MapReduce Works
- III. MapReduce Framework Structure
- IV. JobClient
- V. JobTracker
- VI. TaskTracker
Note: I originally planned to analyze HDFS and Map-Reduce in detail in this Hadoop learning summary series. While searching for material, however, I came across this article and also found that caibinbupt has already analyzed the Hadoop source code in detail. Both are recommended reading.
From http://blog.csdn.net/HEYUTAO007/archive/2010/07/10/5725379.aspx
References:
1. caibinbupt's source code analysis: http://caibinbupt.javaeye.com/
2. coderplay's javaeye blog:
   http://coderplay.javaeye.com/blog/295097
   http://coderplay.javaeye.com/blog/318602
3. Javen-Studio coffee house: http://www.cppblog.com/javenstudio/articles/43073.html
I. MapReduce Overview
Map/Reduce is a distributed computing model for large-scale data processing. It was originally designed, implemented, and published by Google engineers. Formally, Map/Reduce is a programming model for processing and generating large data sets: the user defines a map function that processes a key/value pair to produce a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real-world tasks can be expressed in this model.
II. How MapReduce Works
The Map-Reduce framework operates entirely on <key, value> pairs: the input to a job is a set of <key, value> pairs and the output is also a set of <key, value> pairs, though possibly of different types. Because they must support serialization, the key and value classes have to implement the Writable interface; in addition, the key class must implement WritableComparable so that the framework can sort the data set.
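For illustration, a minimal sketch of such a key class (a hypothetical WordKey type, not something used later in this article); value classes only need Writable, while keys additionally supply the ordering used for the sort:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical custom key type: Writable supplies serialization (write/readFields),
// WritableComparable adds the ordering the framework uses to sort intermediate data.
public class WordKey implements WritableComparable<WordKey> {
    private String word = "";

    public void set(String w) { word = w; }

    public void write(DataOutput out) throws IOException {
        out.writeUTF(word);                 // serialize before shuffling across nodes
    }

    public void readFields(DataInput in) throws IOException {
        word = in.readUTF();                // deserialize on the receiving side
    }

    public int compareTo(WordKey other) {
        return word.compareTo(other.word);  // sort order of the intermediate keys
    }

    public int hashCode() {
        return word.hashCode();             // also used by the default hash partitioner
    }
}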
The execution process of a Map-Reduce task and the data input and output types are as follows:
Map: <k1, v1> -> list<k2, v2>
Reduce: <k2, list<v2>> -> <k3, v3>
The following is an example to describe this process in detail.
WordCount is one of Hadoop's example programs; it counts the number of occurrences of each word in a set of text files. Suppose the WordCount program is run on two input files:
Hello World Bye World
Hello Hadoop GoodBye Hadoop
1. Map data input
For text files, Hadoop uses the LineRecordReader class by default to read the input, producing one key/value pair per line: the key is the byte offset of the line within the file, and the value is the content of the line.
The input to map1 (as <key1, value1> pairs) is:
<0, Hello World Bye World>
The input to map2 (as <key1, value1> pairs) is:
<0, Hello Hadoop GoodBye Hadoop>
2. Map output / Combine input
The output of map1 (as <key2, value2> pairs) is:
<Hello, 1>
<World, 1>
<Bye, 1>
<World, 1>
The output of map2 (as <key2, value2> pairs) is:
<Hello, 1>
<Hadoop, 1>
<GoodBye, 1>
<Hadoop, 1>
3. Combine output
The Combiner class merges the values that share the same key; it is in fact implemented as a Reducer.
The output of combine1 (as <key2, value2> pairs) is:
<Hello, 1>
<World, 2>
<Bye, 1>
The output of combine2 (as <key2, value2> pairs) is:
<Hello, 1>
<Hadoop, 2>
<GoodBye, 1>
4. Reduce output
The Reducer class merges the values that share the same key.
The final output of reduce is:
<Hello, 2>
<World, 2>
<Bye, 1>
<Hadoop, 2>
<GoodBye, 1>
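The Mapper and Reducer that produce this result are essentially Hadoop's classic WordCount example. A minimal sketch using the old org.apache.hadoop.mapred API (the API whose internals the rest of this article analyzes); registering the Reduce class as the Combiner as well yields the intermediate combine output shown above:

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WordCount {
    // Map: <offset, line> -> one <word, 1> pair for every word on the line
    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            StringTokenizer tokenizer = new StringTokenizer(value.toString());
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reduce (also usable as the Combiner): <word, list of counts> -> <word, total>
    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }
}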
III. MapReduce Framework Structure
1. Roles
1.1 JobTracker
JobTracker is a master service. It is responsible for scheduling each sub-task of a job to run on a TaskTracker and for monitoring them; if a task fails, it reruns it. JobTracker is usually deployed on a dedicated machine.
1.2 TaskTracker
TaskTracker is a slave service that runs on multiple nodes. It is responsible for actually executing each task, and it normally runs on the same node as an HDFS DataNode.
1.3 JobClient
For each job, the JobClient class packages the application and its configuration parameters into a jar file, stores it in HDFS, and submits the path to the JobTracker. The JobTracker then creates each Task (that is, each MapTask and ReduceTask) and distributes them to the TaskTracker services.
2. Data Structures
2.1 Mapper and Reducer
At minimum, a MapReduce application running on Hadoop consists of a Mapper class, a Reducer class, and the code that creates a JobConf. Some applications also provide a Combiner class, which is in fact a Reducer implementation.
2.2 JobInProgress
After JobClient submits a job, JobTracker creates a JobInProgress object to track and schedule it and adds it to the job queue. Based on the input data set defined in the submitted job jar (the FileSplits), JobInProgress creates a corresponding batch of TaskInProgress objects for monitoring and scheduling the MapTasks, plus a specified number of TaskInProgress objects for monitoring and scheduling the ReduceTasks; by default a single ReduceTask is created.
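As a small hedged sketch of how a user changes that default (old-API JobConf shown; the new-API Job class has an equivalent setNumReduceTasks setter):

JobConf conf = new JobConf(WordCount.class);   // WordCount as in the earlier sketch
conf.setNumReduceTasks(4);                     // JobInProgress will then create 4 TaskInProgress objects for ReduceTasks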
2.3 TaskInProgress
When JobTracker starts a task, it does so through a TaskInProgress, which launches the Task. The Task object (that is, the MapTask or ReduceTask) is serialized and sent to the corresponding TaskTracker service, and the TaskTracker creates a matching TaskInProgress of its own (this TaskInProgress is a different implementation from the one used inside JobTracker, but it likewise monitors and schedules the Task). The TaskInProgress starts the concrete Task through a managed TaskRunner object. The TaskRunner automatically loads the job jar, sets up the environment variables, and launches an independent java child process to execute the Task, i.e. the MapTask or ReduceTask; the two do not necessarily run on the same TaskTracker.
2.4 MapTask and ReduceTask
A complete job automatically executes the Mapper, the Combiner (if one is specified in the JobConf), and the Reducer in sequence. The Mapper and Combiner are invoked by the MapTask, while the Reducer is invoked by the ReduceTask; the Combiner is in fact an implementation of the Reducer interface. The Mapper reads the input data set defined in the job jar as <key1, value1> pairs and, after processing, produces intermediate <key2, value2> pairs. If a Combiner is defined, the MapTask invokes it after the Mapper to merge the values belonging to the same key, which shrinks the output set. Once all MapTasks are complete, the ReduceTask takes over and invokes the Reducer to process the data and produce the final <key3, value3> pairs. This process is described in more detail in the next section.
(A figure in the original article shows the main components of the Map/Reduce framework and their relationships.)
3. Process
A MapReduce job is submitted through JobClient.runJob(job) to the JobTracker on the master node. After receiving the request, the JobTracker adds it to the job queue. The JobTracker waits for jobs submitted by JobClient over RPC, while each TaskTracker keeps sending heartbeat requests to the JobTracker over RPC, asking whether there is a task for it to run; if the JobTracker's job queue is not empty, the heartbeat from a TaskTracker returns with a task assigned by the JobTracker. This is a pull model. After the slave node receives the Task in this way, its TaskTracker starts the Task locally and executes it. (The original article illustrates this flow with a figure.)
The following describes how Map/Reduce processes a task in detail.
IV. JobClient
A MapReduce driver program is usually written roughly as follows (MyMapper, MyCombiner, and MyReducer stand for the user's own classes):
Configuration conf = new Configuration();                 // read the hadoop configuration
Job job = new Job(conf, "job name");                      // instantiate a job
job.setMapperClass(MyMapper.class);                       // the Mapper class
job.setCombinerClass(MyCombiner.class);                   // the Combiner class (optional)
job.setReducerClass(MyReducer.class);                     // the Reducer class
job.setOutputKeyClass(Text.class);                        // type of the output key
job.setOutputValueClass(IntWritable.class);               // type of the output value
FileInputFormat.addInputPath(job, new Path("input HDFS path"));
FileOutputFormat.setOutputPath(job, new Path("output HDFS path"));
// other initialization configuration
job.waitForCompletion(true);   // with the old API this call is JobClient.runJob(jobConf), analyzed below
1. Configure a Job
JobConf is the class used to describe a job. The settings shown above (the Mapper, Combiner, and Reducer classes, the output key/value types, and the input/output paths) are the key pieces of user-supplied information in the MapReduce process.
2. JobClient.runJob(): run the Job and split the input data set
Through the JobClient class, a MapReduce job uses the InputFormat implementation class specified in the user's JobConf to break the input data set into a batch of small pieces, each of which corresponds to one MapTask. By default JobClient uses the FileInputFormat class: its getSplits() method generates the splits, and if isSplitable() indicates that the data files can be split, large files are decomposed into FileSplits. A FileSplit records only the file's path in HDFS, the start offset, and the split size; this information is packaged along with the job files (see job.split below).
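As a hedged illustration of that hook (a hypothetical subclass, not code from the article): returning false from isSplitable() makes getSplits() produce one FileSplit per file instead of breaking large files into block-sized pieces.

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.TextInputFormat;

// Hypothetical example: an input format whose files are never decomposed into splits,
// so each file is handled by exactly one MapTask.
public class WholeFileTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(FileSystem fs, Path file) {
        return false;
    }
}

It would be registered with conf.setInputFormat(WholeFileTextInputFormat.class).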
JobClient then submits the job to the master through the submitJob(job) method; internally, submitJob(job) does the substantive work in submitJobInternal(job). The submitJobInternal(job) method first uploads three files to HDFS, the Hadoop distributed file system: job.jar, job.split, and job.xml.
job.xml: the job configuration, such as the Mapper, Combiner, and Reducer classes and the input/output format types.
job.jar: the jar package containing the classes required to execute the job, such as the Mapper and Reducer implementations.
job.split: information about the input splits, such as the number of splits and the split size (64 MB by default).
The HDFS paths of these three files are determined by the mapred.system.dir property in hadoop-default.xml (the MapReduce system path) plus the job id. The default value of mapred.system.dir is /tmp/hadoop-user_name/mapred/system. After writing these three files, the method calls the JobTracker.submitJob(job) method on the master node through RPC, and the job submission is complete.
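For illustration, this is how that property resolution looks from code; the default shown simply mirrors the value quoted above and may differ between Hadoop versions.

import org.apache.hadoop.conf.Configuration;

public class ShowSystemDir {
    public static void main(String[] args) {
        Configuration conf = new Configuration();   // loads the hadoop default/site configuration
        String systemDir = conf.get("mapred.system.dir",
                "/tmp/hadoop-" + System.getProperty("user.name") + "/mapred/system");
        // job.jar, job.split and job.xml end up under <mapred.system.dir>/<job id>
        System.out.println("Job files are staged under: " + systemDir);
    }
}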
3. Submit a Job
The jobFile submission is implemented through the RPC module (described in detail in a separate chapter). In outline, the JobClient class calls the JobTracker's submitJob() method through a proxy interface generated by RPC, and JobTracker must implement the JobSubmissionProtocol interface.
After the JobTracker creates the job, it returns a JobStatus object to the JobClient. This object records the job's status information, such as the execution time and the progress of the Map and Reduce tasks. Based on this JobStatus object, the JobClient creates a NetworkedJob instance of RunningJob, which periodically pulls execution statistics from the JobTracker to monitor the job and print progress to the user's console.
(A figure in the original article shows the classes and methods involved in the job creation process.)
V. JobTracker
As mentioned above, jobs are scheduled uniformly by the JobTracker, which distributes the concrete Tasks to the TaskTracker nodes. Below we examine the parsing and execution process in detail, starting from how the JobTracker receives the job from the JobClient.
1. JobTracker initializes the Job
1.1 JobTracker.submitJob() receives the request
When the JobTracker receives a new job request (that is, when its submitJob() function is called), it creates a JobInProgress object to manage and schedule the job's tasks. While being created, the JobInProgress initializes a series of task-related parameters and uses FileSystem to download the files uploaded by the JobClient into a temporary directory on the local file system: the *.jar package, the xml file recording the configuration, and the file recording the split information.
1.2 JobTracker.JobInitThread: notifying the initialization thread
Task initialization is handled by the EagerTaskInitializationListener listener class in JobTracker. JobTracker uses jobAdded(job) to place the job into EagerTaskInitializationListener's dedicated queue of jobs awaiting initialization, the list member variable jobInitQueue. The resortInitQueue method sorts this queue by job priority, and notifyAll() is then called to wake a thread that performs the job initialization, the JobInitThread. When the JobInitThread receives the signal, it takes the job at the head of the queue (the one with the highest priority), calls TaskTrackerManager's initJob, and finally calls JobInProgress.initTasks() to perform the real initialization.
1.3 JobInProgress.initTasks() initializes the TaskInProgress objects
There are two kinds of Tasks, MapTask and ReduceTask, and both are managed through TaskInProgress objects.
First, JobInProgress creates the monitoring objects for the Map tasks. In initTasks() it calls JobClient's readSplitFile() to obtain the list of RawSplits describing the already-decomposed input data, and then creates a matching number of Map execution-management objects, i.e. TaskInProgress instances. During this process, the hosts of the DataNodes holding all the HDFS blocks of a RawSplit are obtained through FileSplit's getLocations() function when the RawSplit is created; that function in turn calls DistributedFileSystem's getFileCacheHints(). (If the data is stored on a local file system, LocalFileSystem is used and there is of course only one location, "localhost".)
After creating these TaskInProgress objects, initTasks() uses the createCache() method to build nonRunningMapCache, a cache of the not-yet-executed map TaskInProgress objects. When a slave-side TaskTracker later sends a heartbeat to the master, a task can be taken directly from this cache and assigned for execution.
Second, JobInProgress creates the monitoring objects for the Reduce tasks. This is simpler: according to the JobConf, by default only one Reduce task is created. Reduce tasks are monitored and scheduled by the same TaskInProgress class, although its constructor is invoked differently; the TaskInProgress later creates the concrete MapTask or ReduceTask. Similarly, initTasks() builds the nonRunningReduceCache member through the createCache() method.
Once JobInProgress has created the TaskInProgress objects, it finally constructs a JobStatus recording that the job is executing, and then calls JobHistory.JobInfo.logStarted() to record the job execution log. At this point the JobTracker's job initialization process is complete.
2. JobTracker schedules the Job
Hadoop's default scheduler is JobQueueTaskScheduler, which implements a FIFO policy. It has two member variables: jobQueueJobInProgressListener and eagerTaskInitializationListener. JobQueueJobInProgressListener is another listener class of JobTracker; it contains a mapping used to manage and dispatch all JobInProgress objects, and jobAdded(job) also registers the job with this JobQueueJobInProgressListener.
The most important method of JobQueueTaskScheduler is assignTasks, which implements the job scheduling. Concretely: after JobTracker receives a heartbeat() call from a TaskTracker, it first checks whether the previous heartbeat response has been fully handled, i.e. whether any tasks still need to be started or restarted. If everything is normal, the heartbeat is processed: the scheduler checks how many map and reduce tasks the TaskTracker can still run, whether the number of tasks waiting to be assigned exceeds that capacity, and whether the cluster's remaining capacity exceeds the average load. If these checks pass, a MapTask or ReduceTask is assigned to the TaskTracker. Map tasks are produced by JobInProgress's obtainNewMapTask() method, which in fact calls JobInProgress's findNewMapTask() to look up nonRunningMapCache.
As mentioned in the task-initialization discussion, createCache() attaches the pending TaskInProgress objects to the network topology. findNewMapTask() searches from the nearest level outward: first the same node, then nodes in the same rack, then nodes in the same data center, up to the maxLevel-th level. In this way, when the JobTracker hands out a task, it can quickly find the TaskTracker closest to the task's data.
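The following is an illustrative sketch of that nearest-first search, using simplified hypothetical types (the real findNewMapTask() walks Hadoop's Node topology objects and returns a TaskInProgress):

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Illustrative only: pending map tasks are cached per topology location; the search
// walks outward -- node, then rack, then data center -- and takes the first hit.
class LocalityScheduler {
    static class TopologyNode {
        final String name;
        final TopologyNode parent;                       // null at the top of the topology
        TopologyNode(String name, TopologyNode parent) { this.name = name; this.parent = parent; }
    }

    static String pickNearestMap(TopologyNode tracker, int maxLevel,
                                 Map<TopologyNode, Deque<String>> pendingMapsByLocation) {
        TopologyNode key = tracker;
        for (int level = 0; level < maxLevel && key != null; level++, key = key.parent) {
            Deque<String> pending = pendingMapsByLocation.get(key);
            if (pending != null && !pending.isEmpty()) {
                return pending.poll();                   // closest data-local task wins
            }
        }
        return null;                                     // caller falls back to any remaining map
    }
}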
Finally, a Task object is generated, wrapped in a LaunchTaskAction, and sent back to the TaskTracker, which then executes the task.
Reduce tasks are generated in a similar way: JobInProgress.obtainNewReduceTask() calls JobInProgress's findNewReduceTask(), which looks up nonRunningReduceCache.
VI. TaskTracker
1. TaskTracker loads a Task into a child process
Tasks are actually launched by the TaskTracker. The TaskTracker communicates with the JobTracker periodically (every 10 seconds by default; see the HEARTBEAT_INTERVAL constant defined in the MRConstants class), reporting the execution status of its own tasks and receiving instructions from the JobTracker. If it finds that it has a new task to execute, it starts it at that point, i.e. while the TaskTracker is calling the JobTracker's heartbeat() method; the underlying call goes through the IPC layer's proxy interface. The steps are described below.
1.1 TaskTracker.run() connects to the JobTracker
The TaskTracker first initializes a series of parameters and services and then tries to connect to the JobTracker (i.e. to obtain the InterTrackerProtocol interface). If the connection is lost, it keeps retrying the connection in a loop and reinitializes all members and parameters.
1.2 TaskTracker.offerService() main loop
If the connection to the JobTracker service succeeds, the TaskTracker calls offerService() and enters its main execution loop. In each iteration it calls transmitHeartBeat() to exchange a heartbeat with the JobTracker and obtain a HeartbeatResponse, and then calls the HeartbeatResponse's getActions() function to get all the commands sent down by the JobTracker, as an array of TaskTrackerAction objects. It then iterates over this array: if an element is a launch-new-task command, i.e. a LaunchTaskAction, it calls addToTaskQueue to add it to the queue of tasks to execute; otherwise (for example a KillJobAction or KillTaskAction) it is added to the tasksToCleanup queue and handed to a taskCleanupThread for processing.
1.3 TaskTracker.transmitHeartBeat() obtains the JobTracker's commands
In transmitHeartBeat(), the TaskTracker creates a new TaskTrackerStatus object recording the execution status of its current tasks, and checks the number of tasks currently executing and the space available on the local disk. If it can accept a new task, it sets the askForNewTask parameter of heartbeat() to true and then calls the JobTracker's heartbeat() method through the IPC interface to send the status across; the return value of heartbeat() is the TaskTrackerAction array.
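To summarize the pull model, here is an illustrative sketch with hypothetical interfaces (not TaskTracker's actual code): the worker reports its status on every heartbeat and receives the master's commands in the response.

import java.util.List;

public class HeartbeatLoopSketch {
    interface TrackerProtocol {                       // stands in for InterTrackerProtocol over RPC
        List<String> heartbeat(String trackerStatus, boolean askForNewTask);
    }

    static final long HEARTBEAT_INTERVAL_MS = 10_000; // article: 10 seconds by default

    static void offerService(TrackerProtocol jobTracker, String status, boolean hasFreeSlots)
            throws InterruptedException {
        while (true) {
            // transmitHeartBeat(): send our status, ask for work if we have free slots
            List<String> actions = jobTracker.heartbeat(status, hasFreeSlots);
            for (String action : actions) {
                if (action.startsWith("LAUNCH_TASK")) {
                    // addToTaskQueue(...): hand the new task to the TaskLauncher thread
                } else {
                    // kill/cleanup actions go to the tasksToCleanup queue
                }
            }
            Thread.sleep(HEARTBEAT_INTERVAL_MS);
        }
    }
}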
1.4 TaskTracker.addToTaskQueue hands the task to the TaskLauncher
TaskLauncher is a thread class for handling new tasks; it contains tasksToLaunch, the queue of tasks waiting to run. TaskTracker.addToTaskQueue calls TaskTracker's registerTask, which creates a TaskInProgress object to schedule and monitor the task and adds it to the runningTasks queue. At the same time, this TaskInProgress is added to tasksToLaunch and notifyAll() wakes a thread, which takes a task to run from the tasksToLaunch queue and calls TaskTracker's startNewTask to run it.
1.5 TaskTracker.startNewTask() starts a new task
This calls localizeJob(), which initializes the Task and starts its execution.
1.6 TaskTracker.localizeJob() initializes the job directory, etc.
The main work of this function is to initialize the working directory workDir, copy the job jar package from HDFS to the local file system and call RunJar.unJar() to unpack it into the working directory, then create a RunningJob and add it to the runningJobs monitoring queue via the addTaskToJob() function. addTaskToJob adds the task to the task list of the runningJob it belongs to; if that runningJob does not exist yet, a new one is created and added to runningJobs. Finally, launchTaskForJob() is called to execute the Task.
1.7 TaskTracker.launchTaskForJob() executes the task
The Task is actually started by calling the launchTask() function of TaskTracker$TaskInProgress.
1.8 TaskTracker$TaskInProgress.launchTask() executes the task
Before executing the task, it calls localizeTask() to update the jobConf file and write it to the local directory. It then creates the TaskRunner object through the Task's createRunner() method and calls its start() method, which finally launches the Task's independent java child process to execute the Task.
1.9 Task.createRunner() creates the Runner object
Task has two concrete implementations, MapTask and ReduceTask, used for Map and Reduce tasks respectively. MapTask creates a MapTaskRunner to start the Task child process, while ReduceTask creates a ReduceTaskRunner.
1.10 TaskRunner.start() launches the child process
TaskRunner is responsible for running a task in a separate process, and it does this work in its run() function. The main steps are initializing the series of environment variables needed to start the java child process, including setting the working directory workDir and the CLASSPATH environment variable, and then loading the job jar package. JvmManager manages all running Task child processes on the TaskTracker; each process is managed by a JvmRunner, which itself runs in a separate thread. In JvmManager's launchJvm method, the appropriate JvmRunner is created depending on whether the task is a map or a reduce and is tracked in the process container of the corresponding JvmManagerForType. JvmManagerForType's reapJvm() allocates a new JVM process: if the JvmManagerForType's slots are full, it looks for an idle process; if that process belongs to the same job it is reused directly, otherwise it is killed and replaced with a new process. If the slots are not full, a new child process is started via the spawnNewJvm method, which drives the JvmRunner thread's run method; run() creates the new process and runs it, concretely by calling runChild.
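To make the mechanism concrete, here is a generic sketch of launching an isolated java child process with plain JDK APIs (a hypothetical helper; TaskRunner's real command line also passes the task attempt id, the umbilical address, and many more JVM options):

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ChildLauncher {
    static Process launch(File workDir, String classpath, String mainClass, String... args)
            throws IOException {
        List<String> cmd = new ArrayList<>();
        cmd.add(System.getProperty("java.home") + File.separator + "bin" + File.separator + "java");
        cmd.add("-classpath");
        cmd.add(classpath);                     // the job jar and its dependencies
        cmd.add(mainClass);
        Collections.addAll(cmd, args);
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.directory(workDir);                  // the task's localized working directory
        pb.redirectErrorStream(true);           // merge stdout/stderr for simple logging
        return pb.start();                      // the child runs the MapTask/ReduceTask in its own JVM
    }
}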
2. The child process executes the MapTask
The actual execution carrier is Child, which contains a main function. When the process starts, the relevant parameters are passed in and parsed; the child then obtains the Task from the parent process via getTask(jvmId), constructs the corresponding Task instance, and starts it with the Task's run() method.
2.1 run
This method is fairly simple. After the TaskReporter is set up, it runs runJobCleanupTask, runJobSetupTask, runTaskCleanupTask, or the Mapper, as appropriate. Because MapReduce now has two sets of APIs and MapTask must support both, the Mapper execution is split into runNewMapper and runOldMapper; here we analyze runOldMapper.
2.2 runOldMapper
The first part of runOldMapper constructs the InputSplit that this Mapper will process and then creates the Mapper's RecordReader, which provides the map input. Next, the Mapper's output is set up through a MapOutputCollector, of which there are two cases: if there is no Reducer, DirectMapOutputCollector is used; otherwise, MapOutputBuffer is used.
Once the Mapper's input and output have been constructed, the Mapper can be driven by instantiating the MapRunnable configured for the job. The system currently provides two MapRunnables: MapRunner and MultithreadedMapRunner. MapRunner is the simple single-threaded executor; it uses reflection to instantiate the user-defined implementation of the Mapper interface and holds it as a member.
2.3 run method of MapRunner
It first creates the key and value objects and then, for each <key, value> pair in the InputSplit, calls the map method of the user's Mapper implementation. Each processed input pair may produce new key/value pairs, which are collected through the OutputCollector and either spilled to files or kept in memory for further processing such as sorting and combining.
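A simplified sketch of that loop, following the shapes of the old org.apache.hadoop.mapred interfaces (progress reporting, counters, and error handling omitted):

import java.io.IOException;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

class SimpleMapRunner<K1, V1, K2, V2> {
    void run(RecordReader<K1, V1> in, Mapper<K1, V1, K2, V2> mapper,
             OutputCollector<K2, V2> output, Reporter reporter) throws IOException {
        try {
            K1 key = in.createKey();
            V1 value = in.createValue();
            while (in.next(key, value)) {            // one <key, value> per input record; objects are reused
                mapper.map(key, value, output, reporter);
            }
        } finally {
            mapper.close();                          // flush any per-task state held by the Mapper
        }
    }
}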
2.4 OutputCollector
The role of the OutputCollector is to collect the new key/value pairs produced by each call to map, spilling them to files or holding them in memory for further processing such as sorting and combining.
MapOutputCollector has two subclasses: MapOutputBuffer and DirectMapOutputCollector. DirectMapOutputCollector is used when no Reduce stage is needed. If the job has reduce tasks, the system uses MapOutputBuffer for the output: it caches the map's results in an in-memory buffer managed by several arrays, and at the appropriate times the buffered data is flushed to disk.
Data is written to disk at three points:
(1) When a single key/value pair is too large to fit in the in-memory buffer: the spillSingleRecord method.
(2) When the in-memory buffer is full: the SpillThread.
(3) When the Mapper's results have all been collected and the buffer needs a final flush: the flush method.
2.5 The SpillThread: spills the data in the buffer to disk
(1) When a spill is required, the sortAndSpill function is called; it sorts by partition and then by key, using quicksort by default.
(2) If there is no combiner, the records are written out directly; otherwise the CombinerRunner's combine is called first and the combined output is then written.
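As an illustrative, much-simplified sketch of the collect-then-spill idea (plain Java objects and a size threshold; the real MapOutputBuffer works on raw serialized bytes and sorts by partition before key, as described above):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.AbstractMap.SimpleEntry;

class SpillingCollector<K extends Comparable<K>, V> {
    interface SpillSink<K, V> { void writeSortedRun(List<Map.Entry<K, V>> run); }

    private final List<Map.Entry<K, V>> buffer = new ArrayList<>();
    private final int threshold;
    private final SpillSink<K, V> sink;

    SpillingCollector(int threshold, SpillSink<K, V> sink) {
        this.threshold = threshold;
        this.sink = sink;
    }

    void collect(K key, V value) {
        buffer.add(new SimpleEntry<>(key, value));
        if (buffer.size() >= threshold) {
            spill();                                  // buffer "full": sort and flush a run
        }
    }

    void flush() {                                    // called once the Mapper has finished
        if (!buffer.isEmpty()) spill();
    }

    private void spill() {
        buffer.sort(Map.Entry.comparingByKey());      // the real code sorts by partition, then key
        sink.writeSortedRun(new ArrayList<>(buffer)); // a sorted run that would go to disk
        buffer.clear();
    }
}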
3. The child process executes the ReduceTask
ReduceTask.run starts much like MapTask's run: initialize(), runJobCleanupTask(), runJobSetupTask(), and runTaskCleanupTask(). After that, the real work proceeds in three steps: Copy, Sort, and Reduce.
3.1 Copy
This step collects the map output files from the servers that executed each map task. The copying is done by the ReduceTask.ReduceCopier class.
3.1.1 Class diagram (see the figure in the original article)
3.1.2 Process: starts with ReduceCopier.fetchOutputs
(1) Requesting task information. This uses the GetMapEventsThread thread, whose run method repeatedly calls the getMapCompletionEvents method. That method uses RPC to call the getMapCompletionEvents operation of the TaskUmbilicalProtocol, passing the job's jobID to ask its parent TaskTracker about the completion status of the job's Map tasks (the TaskTracker in turn has to ask the JobTracker and then relay the answer...). It returns an array, TaskCompletionEvent events[]; each TaskCompletionEvent contains information such as the task id and the IP address.
(2) Once the information about the servers that executed the relevant Map tasks is available, a MapOutputCopier thread is started for the actual copying; each one is responsible, in its own thread, for copying the output of one Map task server. MapOutputCopier's run calls copyOutput in a loop, copyOutput calls getMapOutput, and the copy is performed remotely over HTTP.
(3) getMapOutput copies the content remotely (it may of course also be local...) into a MapOutput object, which may be kept serialized in memory or on disk; the choice is adjusted automatically according to memory usage.
(4) Meanwhile, an in-memory merge thread, InMemFSMergeThread, and a file merge thread, LocalFSMerger, run concurrently. They merge and sort the downloaded files (which may still be in memory; for simplicity they are all referred to as files...) ahead of time, in order to save time later, reduce the number of input files, and lighten the load of the subsequent sort. InMemFSMergeThread's run loop calls doInMemMerge, which performs the merge using the Merger utility class; if a combine is needed, combinerRunner.combine is used.
3.2 Sort
The sort phase is effectively a continuation of the sorting done on the map side. It runs after all files have been copied and uses the Merger utility class to merge all of them. The result is a new file that combines the output of all the required Map tasks; the Map output files collected from the other servers are then deleted.
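Conceptually this is a k-way merge of already-sorted runs. An illustrative sketch in plain Java (not Hadoop's Merger class, which additionally handles serialized key/value segments and on-disk spills):

import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

class SegmentMerge {
    static List<String> merge(List<Deque<String>> sortedSegments) {
        // heap keyed on each segment's current smallest (head) element
        PriorityQueue<Deque<String>> heap =
                new PriorityQueue<Deque<String>>(Comparator.comparing((Deque<String> d) -> d.peekFirst()));
        for (Deque<String> seg : sortedSegments) {
            if (!seg.isEmpty()) heap.add(seg);
        }
        List<String> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Deque<String> seg = heap.poll();     // segment with the smallest current key
            merged.add(seg.pollFirst());
            if (!seg.isEmpty()) heap.add(seg);   // re-insert it under its new head key
        }
        return merged;                           // one globally sorted stream
    }
}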
3.3 Reduce
This is the last stage of the Reduce task. It prepares the keyClass ("mapred.output.key.class" or "mapred.mapoutput.key.class"), the valueClass ("mapred.mapoutput.value.class" or "mapred.output.value.class"), and the Comparator ("mapred.output.value.groupfn.class" or "mapred.output.key.comparator.class"), and finally calls the runOldReducer method. (Again there are two sets of APIs; we analyze runOldReducer.)
3.3.1 runOldReducer
(1) Output. It prepares an OutputCollector to collect the output. Unlike MapTask's, this OutputCollector is simpler: it just opens a RecordWriter and writes once per collect. The biggest difference is that the file system passed into the RecordWriter is basically a distributed file system, i.e. HDFS.
(2) Input. Using the KeyClass, ValueClass, and KeyComparator prepared above (possibly user-defined classes), ReduceTask constructs the key type required by the Reducer, as well as the Iterator over the values (here a key usually corresponds to a group of values).
(3) With input and output in place, the user's Reducer is called repeatedly in a loop. When this finishes, the Reduce stage is complete.