Parsing Hadoop's Next-Generation MapReduce Framework: YARN

Source: Internet
Author: User
Tags: shuffle

Background

YARN is a distributed resource management system that improves resource utilization (memory, I/O, network, disk, and so on) in distributed cluster environments. It was created to address the shortcomings of the original MapReduce framework. The original MapReduce committers could have kept patching the existing code, but as the codebase grew, and given how the original framework was designed, such modifications became increasingly difficult. The committers therefore decided to redesign MapReduce from the architecture up, so that the next-generation MapReduce framework (MRv2/YARN) would have better extensibility, availability, reliability, and backward compatibility, higher resource utilization, and support for computational frameworks beyond MapReduce.

Deficiencies of the original MapReduce framework

1. JobTracker is a centralized processing point for cluster transactions and a single point of failure.
2. JobTracker has too many duties: it maintains both the status of jobs and the status of each job's tasks, consuming excessive resources.
3. On the TaskTracker side, using map/reduce tasks as the unit of resources is too coarse; it ignores CPU, memory, and other resources, so when two tasks that each consume a lot of memory are scheduled together, an OOM easily occurs.
4. Forcing resources into map slots and reduce slots means reduce slots sit idle when only map tasks are running, and map slots sit idle when only reduce tasks are running, leading to poor resource utilization.
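Point 4 can be made concrete with a small arithmetic sketch (not Hadoop code; the function names are illustrative) comparing fixed map/reduce slots with YARN-style generic containers:

```python
# Conceptual sketch: why fixed map/reduce slots waste resources compared
# with YARN's generic containers. Names and numbers are illustrative.

def slots_usable(map_slots, reduce_slots, map_tasks, reduce_tasks):
    """MRv1: a map task can only occupy a map slot, and vice versa."""
    return min(map_slots, map_tasks) + min(reduce_slots, reduce_tasks)

def containers_usable(total_containers, map_tasks, reduce_tasks):
    """YARN: any task may occupy any container of sufficient size."""
    return min(total_containers, map_tasks + reduce_tasks)

# A job currently in its map-only phase: reduce slots sit idle under MRv1.
print(slots_usable(map_slots=4, reduce_slots=4, map_tasks=8, reduce_tasks=0))  # 4
print(containers_usable(total_containers=8, map_tasks=8, reduce_tasks=0))     # 8
```

With 8 runnable map tasks, the slot model keeps half the cluster idle while the container model uses all of it.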

Yarn Architecture

The most basic idea of YARN/MRv2 is to split the original JobTracker's two main functions, resource management and job scheduling/monitoring, into two separate daemons: a global ResourceManager (RM) and a per-application ApplicationMaster (AM), where an application is either a single MapReduce job or a DAG of jobs. The ResourceManager and the NodeManagers (NM) form the basic data-computation framework. The ResourceManager coordinates the cluster's resources; any client or running ApplicationMaster that wants to run a job or task must request resources from the RM. The ApplicationMaster is a framework-specific library: the MapReduce framework ships its own AM implementation, and users can implement their own. At run time, the AM starts and monitors tasks through the NMs.

ResourceManager

The ResourceManager, as the resource coordinator, has two main components: the Scheduler and the ApplicationsManager (ASM).

The Scheduler is responsible for allocating to each application the minimum amount of resources it needs to run. The Scheduler schedules purely on resource usage; it does not monitor or track application status and, accordingly, does not handle failed tasks. The RM manages cluster resources through the concept of a resource Container: a Container is an abstraction of resources that, in principle, bundles a certain amount of memory, I/O, and network, although the current implementation covers only one resource, memory.
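A minimal sketch of the Container abstraction described above, modeling only memory as in the implementation discussed here (the class names are hypothetical, not Hadoop's actual API):

```python
# Illustrative model of memory-only resource containers: a node grants a
# container only while it still has enough free memory. Not Hadoop code.

class Resource:
    def __init__(self, memory_mb):
        self.memory_mb = memory_mb

class Node:
    def __init__(self, name, capacity):
        self.name = name
        self.available = capacity.memory_mb

    def try_allocate(self, request):
        """Grant a container if the node still has enough free memory."""
        if request.memory_mb <= self.available:
            self.available -= request.memory_mb
            return ("container", self.name, request.memory_mb)
        return None  # request exceeds remaining capacity

node = Node("nm-01", Resource(8192))
print(node.try_allocate(Resource(2048)))  # ('container', 'nm-01', 2048)
print(node.try_allocate(Resource(8192)))  # None: only 6144 MB remain
```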

The ApplicationsManager is responsible for accepting jobs submitted by clients and for negotiating the first container in which the ApplicationMaster runs; when the ApplicationMaster fails, it restarts it. The following describes some of the functions the RM performs in detail.

1. Resource scheduling: the Scheduler builds a global resource allocation plan from all running applications, then allocates resources subject to application-specific restrictions and some constraints of the global environment.

2. Resource monitoring: the Scheduler periodically receives resource-usage reports from the NMs, and an ApplicationMaster can obtain from the Scheduler the status of the completed containers that belong to it.

3. Application submission:

1) The client obtains an ApplicationId from the ASM.

2) The client uploads the application definition and the required jar files to the directory in HDFS specified by yarn.app.mapreduce.am.staging-dir in yarn-site.xml.

3) The client constructs a resource request object and the application submission context and sends them to the ASM.

4) The ASM receives the application submission context.

5) The ASM negotiates a container for the ApplicationMaster with the Scheduler, based on the application's information, and then launches the ApplicationMaster.

6) The ASM sends launch-container information to that container's NM to start the container, that is, to start the ApplicationMaster, and then provides the client with state information about the running AM.
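The staging directory mentioned in step 2) is set with a property of the following form (a minimal configuration fragment; the value shown is stock Hadoop's default and should be treated as illustrative):

```xml
<!-- Where clients stage the application definition and jars in HDFS -->
<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <value>/tmp/hadoop-yarn/staging</value>
</property>
```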

4. AM lifecycle: the ASM is responsible for managing the lifecycle of every AM in the system. The ASM starts the AM; once running, the AM periodically sends heartbeats to the ASM (every 1s by default) so the ASM knows it is alive, and the ASM restarts the AM on failure. If no heartbeat arrives within a certain time (10 minutes by default), the ASM considers the AM failed.
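The heartbeat-based liveness rule above can be sketched as follows (a conceptual simulation, not Hadoop internals; the class name is illustrative):

```python
# Sketch of AM liveness tracking: the ASM records the last heartbeat from
# each AM and declares an AM failed once no heartbeat has arrived within
# the expiry window (10 minutes by default, per the text above).

HEARTBEAT_EXPIRY_SECS = 10 * 60

class LivenessMonitor:
    def __init__(self):
        self.last_seen = {}  # am_id -> timestamp of last heartbeat

    def heartbeat(self, am_id, now):
        self.last_seen[am_id] = now

    def expired(self, now):
        """Return AMs whose last heartbeat is older than the expiry window."""
        return [am for am, t in self.last_seen.items()
                if now - t > HEARTBEAT_EXPIRY_SECS]

m = LivenessMonitor()
m.heartbeat("am-001", now=0)
m.heartbeat("am-002", now=500)
print(m.expired(now=650))  # ['am-001']: silent for 650s > 600s
```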

High availability for the ResourceManager is not yet well implemented. However, Cloudera's CDH 4.4 and later achieve a simple form of high availability: reusing the HA code from the hadoop-common project and a design similar to HDFS NameNode high availability, they introduce active and standby states for the RM. There is no role corresponding to the JournalNode; RM state is maintained only through ZooKeeper. This design is merely the simplest way to avoid restarting the RM by hand, and is still some distance from being usable in production.

NodeManager

The NM is primarily responsible for starting the containers the RM assigns to AMs and the containers the AMs themselves request, and for monitoring how those containers run. When starting a container, the NM sets up the necessary environment variables and downloads the jar packages, files, and so on that the container needs from HDFS to the local disk, so-called resource localization. When all preparation is done, the script representing the container starts the program. Once it is running, the NM periodically monitors the container's resource usage, and if it exceeds the amount of resources the container declared, the NM kills the process the container represents.
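The enforcement rule at the end of that paragraph amounts to a simple check, sketched here conceptually (the function and the container ids are illustrative, not Hadoop internals):

```python
# Sketch of the NM's container-monitoring rule: any container whose
# measured memory exceeds what it declared must be killed.

def containers_to_kill(declared_mb, measured_mb):
    """Return ids of containers exceeding their declared memory limit."""
    to_kill = []
    for cid, limit in declared_mb.items():
        if measured_mb.get(cid, 0) > limit:
            to_kill.append(cid)
    return to_kill

declared = {"container_01": 1024, "container_02": 2048}
measured = {"container_01": 1500, "container_02": 1900}
print(containers_to_kill(declared, measured))  # ['container_01']
```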

In addition, the NM provides a simple service for managing the local directories of the machine it runs on. Applications can keep accessing those local directories even when the machine no longer has any of their containers running. For example, the MapReduce application uses this service to store map outputs and serve them to the corresponding reduce tasks during the shuffle.

You can also extend the NM with your own services: YARN provides the yarn.nodemanager.aux-services configuration, which lets users plug in custom services. The shuffle functionality of MapReduce, for example, is implemented this way.
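The MapReduce shuffle is registered as such an auxiliary service roughly like this (a yarn-site.xml fragment; in Hadoop 2.2+ releases the service name is mapreduce_shuffle, while early 2.x alphas used mapreduce.shuffle):

```xml
<!-- Register MapReduce's shuffle as an NM auxiliary service -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
```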

The NM generates the following directory structure locally (under yarn.nodemanager.local-dirs) for each running application:

    usercache/${user}/appcache/${applicationId}/

The directory structure under each container directory is as follows:

    container_${containerId}/
        default_container_executor.sh
        launch_container.sh
        tmp/

At the start of a container, the NM executes default_container_executor.sh, which in turn executes launch_container.sh. launch_container.sh sets some environment variables and finally launches the command that starts the program. For MapReduce, starting the AM executes org.apache.hadoop.mapreduce.v2.app.MRAppMaster; starting a map/reduce task executes org.apache.hadoop.mapred.YarnChild.

ApplicationMaster

The ApplicationMaster is a framework-specific library. The MapReduce computational model has its own ApplicationMaster implementation; any other computational model that wants to run on YARN must implement an ApplicationMaster for that model in order to request resources from the RM and run its tasks. The Spark framework running on YARN, for instance, has a corresponding ApplicationMaster implementation. In the final analysis, YARN is a resource management framework, not a computational framework; a concrete computational framework needs its own implementation to run applications on YARN. Since YARN appeared together with MRv2, the following is a brief overview of how MRv2 runs on YARN.

MRV2 Running Process:

1. The MR JobClient submits a job to the ResourceManager's ApplicationsManager (ASM).

2. The ASM asks the Scheduler for a container for the MR AM to run in, then launches it.

3. The MR AM starts up and registers with the ASM.

4. The MR JobClient obtains information about the MR AM from the ASM and from then on communicates directly with the MR AM.

5. The MR AM computes the input splits and constructs resource requests for all the maps.

6. The MR AM performs the necessary preparation for the MR OutputCommitter.

7. The MR AM submits resource requests to the RM Scheduler, obtains a set of containers for the map/reduce tasks to run in, and, together with the NMs, performs the necessary work for each container, including resource localization.

8. The MR AM monitors the running tasks until they complete; when a task fails, it requests a new container to run the failed task.

9. As each map/reduce task completes, the MR AM runs the cleanup code of the MR OutputCommitter, that is, some finishing touches.

10. When all maps and reduces are complete, the MR AM runs the OutputCommitter's necessary job commit or abort APIs.

11. The MR AM exits.
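Steps 7 and 8 above can be sketched as a retry loop (a conceptual simulation, not Hadoop code; run_task stands in for running a task in a container, and the attempt limit of 4 mirrors MapReduce's usual default but is an assumption here):

```python
# Sketch of the AM's task loop: keep requesting containers until every
# task succeeds, re-requesting a fresh container when a task fails.

def run_job(tasks, run_task, max_attempts=4):
    attempts = {t: 0 for t in tasks}
    pending = list(tasks)
    while pending:
        task = pending.pop(0)
        attempts[task] += 1
        if run_task(task):               # task succeeded in its container
            continue
        if attempts[task] >= max_attempts:
            return False                 # job fails: task exhausted attempts
        pending.append(task)             # ask the RM for a new container
    return True

# First attempt of map-1 fails; the AM retries it in a new container.
outcomes = {"map-0": [True], "map-1": [False, True]}
print(run_job(["map-0", "map-1"], lambda t: outcomes[t].pop(0)))  # True
```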

Write the application on yarn

Writing an application on YARN is unlike writing the MapReduce applications we know. It is important to keep in mind that YARN is a resource management framework, not a computational framework; computational frameworks run on top of it. All we can do is request containers from the RM and start them through the NMs. Just as in MRv2, the JobClient requests the container in which the MR AM runs, sets the environment variables and the start command, and then has the NM start the MR AM; from that point on, the map/reduce tasks are entirely the MR AM's responsibility, and they too are started by the MR AM requesting containers from the RM and launching them through the NMs. So to run a program for a non-built-in computational framework on YARN, we implement our own client and our own ApplicationMaster. In addition, our custom AM needs to be placed on the classpath of every NM, since the AM may run on the same machine as any NM.
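What a custom client must assemble for its AM can be sketched as plain data (the dict layout, class name my.app.Master, and paths are all hypothetical; the real API uses Java objects, but the pieces are the same: a resource request plus a launch context with localized resources, environment, and a start command):

```python
# Illustrative sketch of an AM submission: resource request + launch context.

def build_am_submission(app_id, jar_hdfs_path, memory_mb):
    return {
        "application_id": app_id,
        "resource": {"memory_mb": memory_mb},     # what to ask the RM for
        "launch_context": {
            "resources": [jar_hdfs_path],         # localized by the NM
            "environment": {"CLASSPATH": "./*"},
            "command": "java -Xmx%dm my.app.Master" % (memory_mb // 2),
        },
    }

ctx = build_am_submission("app_0001", "hdfs:///user/x/am.jar", 1024)
print(ctx["launch_context"]["command"])  # java -Xmx512m my.app.Master
```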
