# Apache Hadoop YARN: Yet Another Resource Negotiator (paper interpretation)


These are a beginner's study notes on cloud platform management, drawing on many experts' blog posts; if anything here infringes, please get in touch and it will be removed immediately.

Abstract

1) tight coupling of a specific programming model with the resource management infrastructure, forcing developers to abuse the MapReduce programming model, and 2) centralized handling of jobs' control flow, which resulted in endless scalability concerns for the scheduler.
Personal understanding: Before YARN, Hadoop's resource management/scheduling and its programming model were tightly coupled, so programmers were locked into the MapReduce way of thinking and other programming models were hard to add. Resource management and job control were both concentrated in the JobTracker, which scaled poorly.
The YARN architecture is dedicated to decoupling the programming model from resource management, delegating job scheduling and control to per-application components.

1 Introduction
We present the next generation of Hadoop compute platform known as YARN, which departs from its familiar, monolithic architecture. By separating resource management functions from the programming model, YARN delegates many scheduling-related functions to per-job components.

Personal understanding: YARN transforms the resource scheduling architecture from the central (monolithic) scheduler of Hadoop 1.0 into a two-level scheduler. The RM, a lightweight central scheduler at the top level, is responsible only for resource allocation and management. Below it sit the per-application schedulers: each programming model provides its own ApplicationMaster (AM), which requests resources from the RM and controls the execution of its own jobs.

2 History and rationale

Yahoo!'s engineering practice exposed the shortcomings of Hadoop 1.0 and led to ten requirements for improvement.

| # | Requirement | Personal understanding |
|---|-------------|-------------------------|
| 1 | Scalability | The single-NameNode, single-JobTracker design severely constrains the scalability and reliability of Hadoop 1.0. First, the NameNode and JobTracker are obvious single points of failure (SPOF) for the whole system. In addition, a single NameNode's memory capacity is limited, which limits a Hadoop cluster to roughly 2000 nodes, a file system of 10-50 PB, and about 150 million files (the actual figures depend on the NameNode's memory size). |
| 2 | Multi-tenancy | |
| 3 | Serviceability | Upgrade dependencies need to be decoupled. |
| 4 | Locality awareness | Resource placement needs to be more reasonable (locality-aware). |
| 5 | High cluster utilization | The delay in cluster resource allocation is too high; cluster resource utilization needs to improve. |
| 6 | Reliability/availability | Reliability. |
| 7 | Secure and auditable operation | Security. |
| 8 | Support for programming model diversity | Support multiple programming models. |
| 9 | Flexible resource model | Resource allocation needs to be more flexible: "the number of map and reduce slots is fixed by the cluster operator, so fallow map capacity can't be used to spawn reduce tasks and vice versa." |
| 10 | Backward compatibility | Version compatibility. |

3 Architecture

The RM runs as a daemon on a dedicated machine, and acts as the central authority arbitrating resources among various competing applications in the cluster.
Depending on the application demand, scheduling priorities, and resource availability, the RM dynamically allocates leases, called containers, to applications to run on particular nodes. The container is a logical bundle of resources (e.g., 2 GB RAM, 1 CPU) bound to a particular node [R4,R9]. In order to enforce and track such assignments, the RM interacts with a special system daemon running on each node called the NodeManager (NM).


Jobs are submitted to the RM via a public submission protocol and go through an admission control phase during which security credentials are validated and various operational and administrative checks are performed [R7].
The ApplicationMaster is the "head" of a job, managing all lifecycle aspects including dynamically increasing and decreasing resource consumption, managing the flow of execution (e.g., running reducers against the output of maps), handling faults and computation skew, and performing other local optimizations.
Typically, an AM will need to harness the resources (CPUs, RAM, disks, etc.) available on multiple nodes to complete a job. To obtain containers, the AM issues resource requests to the RM.
Overall, a YARN deployment provides a basic, yet robust infrastructure for lifecycle management and monitoring of containers, while application-specific semantics are managed by each framework [R3,R8].

As the global resource manager, the RM is responsible for resource management and allocation across the whole system, allocating container resources on specific nodes according to each application's requests. The NM is the per-node resource and task manager: it reports its resource usage to the RM via heartbeat messages, and it receives and processes container start/stop requests from AMs. A container is YARN's logical unit of resources, encapsulating a node's resources such as memory and CPU. The AM controls the entire lifecycle of a job, from requesting resources for it to managing its whole execution. The RM grants containers to an AM as leases, and a token-based security mechanism ensures that the AM really has been granted a container on a given node. Different programming models can implement different AMs, which is how multiple programming models are supported.
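To make the "logical bundle of resources" idea concrete, here is a minimal sketch (not from the paper) of how such a bundle is described with YARN's Java client records. The 2 GB / 1 vCore figures mirror the paper's example; the class name and priority value are assumptions for illustration.

```java
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class ContainerSpecSketch {
    // Build a request for one container: a logical bundle of resources that the
    // RM will bind to some node and hand back as a lease.
    public static ContainerRequest buildRequest() {
        // 2 GB of memory (in MB) and 1 virtual core, mirroring the paper's example.
        Resource capability = Resource.newInstance(2048, 1);

        // Priority ranks this request among the AM's other outstanding requests;
        // the value 0 is an arbitrary choice for illustration.
        Priority priority = Priority.newInstance(0);

        // nodes/racks are null, i.e. "anywhere"; a locality-aware AM [R4] would
        // list preferred hosts and racks here instead.
        return new ContainerRequest(capability, null, null, priority);
    }
}
```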

Resource Manager (RM)
The ResourceManager exposes two public interfaces towards: 1) clients submitting applications, and 2) ApplicationMaster(s) dynamically negotiating access to resources, and one internal interface towards NodeManagers for cluster monitoring and resource access management.

Two external interfaces: one through which clients submit applications, and one through which AMs dynamically negotiate and are allocated resources.
One internal interface: communicates with the NMs for monitoring and manages the use of resources on each node.

As discussed, it is not responsible for coordinating application execution or task fault-tolerance, but neither is it charged with 1) providing status or metrics for running applications (now part of the ApplicationMaster), nor 2) serving framework-specific reports of completed jobs (now delegated to a per-framework daemon). This is consistent with the view that the ResourceManager should only handle live resource scheduling, and helps central components in YARN scale beyond the Hadoop 1.0 JobTracker.

In short, unlike the JobTracker, the scheduler in the RM is no longer involved in monitoring and tracking application execution, nor is it responsible for restarting tasks that fail because of application errors or hardware faults; those duties are handed to the application-specific AM.

Application Master (AM)

An application is a static set of processes, a logical description of work, or even a long-running service. The ApplicationMaster is the process that coordinates the application's execution in the cluster, but it itself runs in the cluster just like any other container.

The AM periodically heartbeats to the RM to affirm its liveness and to update the record of its demand.

In response to subsequent heartbeats, the AM will receive a container lease on bundles of resources bound to a particular node in the cluster. Based on the containers it receives from the RM, the AM may update its execution plan to accommodate perceived abundance or scarcity. In contrast to some resource models, the allocations to an application are late binding: the process spawned is not bound to the request, but to the lease. The conditions that caused the AM to issue the request may not remain true when it receives its resources, but the semantics of the container are fungible and framework-specific [R3,R8,R10].

The AM periodically sends a heartbeat to the RM to signal that it is alive and to update its resource requirements. After receiving containers assigned by the RM, the AM adjusts its execution plan according to the amount of resources granted. The resources assigned to an application are late binding. Subject to resource availability and its scheduling policy, the RM tries to satisfy the requests presented by each application; when a resource is granted to an AM, the RM generates a lease for it, which the AM picks up on a subsequent heartbeat. When the AM later presents the container lease to an NM, a token-based security mechanism guarantees the lease's authenticity.
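A hedged sketch of what this heartbeat/lease cycle can look like for an AM built on the synchronous AMRMClient helper; the host name, container count, sleep interval, and the empty "use the container" step are placeholders rather than anything prescribed by the paper.

```java
import java.util.List;

import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AmHeartbeatSketch {
    public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> rm = AMRMClient.createAMRMClient();
        rm.init(new YarnConfiguration());
        rm.start();

        // Register with the RM; host, port, and tracking URL are placeholders.
        rm.registerApplicationMaster("am-host.example.com", 0, "");

        // Declare demand: four 2 GB / 1 vCore containers, anywhere in the cluster.
        Resource capability = Resource.newInstance(2048, 1);
        for (int i = 0; i < 4; i++) {
            rm.addContainerRequest(
                    new ContainerRequest(capability, null, null, Priority.newInstance(0)));
        }

        int granted = 0;
        while (granted < 4) {
            // Each allocate() call doubles as a heartbeat: it affirms liveness,
            // carries any updated demand, and returns newly granted leases.
            List<Container> leases = rm.allocate(granted / 4.0f).getAllocatedContainers();
            for (Container lease : leases) {
                granted++;
                // Late binding: only now does the AM decide what work to run in
                // this lease, by presenting it to the NM on lease.getNodeId()
                // (see the NodeManager sketch below).
            }
            Thread.sleep(1000); // heartbeat interval, arbitrary for this sketch
        }
        // Releasing containers and unregistering at job completion is sketched
        // further below, after the submission walkthrough.
    }
}
```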

Node Manager (NM)
The NodeManager is the "worker" daemon in YARN. It authenticates container leases, manages containers' dependencies, monitors their execution, and provides a set of services to containers.

The NM would also kill containers as directed by the RM or the AM.

NM also periodically monitors the health of the physical node.

The NM is the worker node in YARN: it is responsible for validating container leases, managing container dependencies, and monitoring application execution. It also periodically reports its available resources to the RM through heartbeat messages.
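Continuing the earlier sketch: once the AM holds a lease, it presents it to the NM on the corresponding node together with a Container Launch Context. This is a hedged illustration using the NMClient helper; the command and the empty local resources and environment are placeholders, and a real AM would also supply proper security tokens.

```java
import java.util.Collections;

import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class NmLaunchSketch {
    // Launch a process inside a granted container lease on its node.
    static void launch(NMClient nm, Container lease) throws Exception {
        ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
                Collections.emptyMap(),                // local resources (job files), omitted here
                Collections.emptyMap(),                // environment variables
                Collections.singletonList("sleep 60"), // command to run, placeholder
                null,                                  // auxiliary service data
                null,                                  // security tokens; a real AM supplies these
                null);                                 // application ACLs
        // The NM authenticates the lease's container token before starting anything [R7].
        nm.startContainer(lease, ctx);
    }

    public static void main(String[] args) {
        NMClient nm = NMClient.createNMClient();
        nm.init(new YarnConfiguration());
        nm.start();
        // launch(nm, lease) would then be called for each lease granted to the AM.
    }
}
```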

4 YARN in the real-world

While one of the initial goals of YARN was to improve scalability, Yahoo! reported that they are not running clusters any bigger than 4000 nodes, which used to be the largest cluster size before YARN.
This has simply removed the need to scale further for the moment, and has even allowed the operational team to delay the re-provisioning of over 7000 nodes that have been decommissioned.

Essentially, moving to YARN, the CPU utilization almost doubled to the equivalent of 6 continuously pegged cores per box, with peaks up to 10 fully utilized cores. In aggregate, this indicates that YARN was capable of keeping about 2.8 * 2500 = 7000 more cores completely busy running user code. This is consistent with the increase in the number of jobs and tasks running on the cluster discussed above.

This is well summarized in the following quote: "Upgrading to YARN was equivalent to adding 1000 machines [to this 2500-machine cluster]."

5 Experiments
Beating the sort record

7 Conclusion

Thanks to the decoupling of resource management and programming framework, YARN provides: 1) greater scalability, 2) higher efficiency, and 3) the ability for a large number of different frameworks to efficiently share a cluster. These claims are substantiated both experimentally (via benchmarks) and by Yahoo!'s production experience described above.

Advantages:

    • High scalability
    • High cluster resource utilization
    • Many different computing frameworks can share cluster resources

Disadvantages:

    • Each framework cannot see the real-time resource usage of the whole cluster; it passively accepts resources and waits for the top-level scheduler to push information to it.
    • It adopts pessimistic locking with small concurrency granularity, so responses are somewhat slow, and it lacks an effective mechanism for frameworks to compete for resources.

The client application submission process (a client-side code sketch follows the step list):

• Step 1: The client first notifies the RM that it wants to submit an application.
• Step 2: The RM responds with an ApplicationID and information about the cluster's current resource capacity (for the client to use as a reference when it later makes resource requests).
• Step 3: The client replies with an "Application Submission Context" and a "Container Launch Context (CLC)". The application submission context contains the Application ID, user, queue, and the other information needed to start the AM; the CLC contains the resource requirements, job files, security tokens, and the other information needed to launch the AM on a node.
• Step 4: When the RM receives the application submission context, it tries to schedule an available container for the AM (this container is called "container 0"; it is the AM, and it will later request more containers). If no container is available, the request waits. If one is available, the RM selects and contacts a node and launches the AM there. The AM's RPC port and a tracking URL for monitoring the application's status are also established.
• Step 5: The RM sends the AM information about the cluster's maximum and minimum capabilities. At this point the AM must decide how to use the available resources. This shows that YARN lets an application adapt to the current cluster environment.
• Step 6: Based on the information about currently available cluster resources returned by the RM in step 5, the AM requests a number of containers.
• Step 7: The RM then responds to this request according to its scheduling policy and allocates containers to the AM.
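A hedged client-side sketch of steps 1 through 4 using the YarnClient API; the application name, queue, AM command, and resource sizes are illustrative assumptions, not values taken from the paper.

```java
import java.util.Collections;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        YarnClient client = YarnClient.createYarnClient();
        client.init(new YarnConfiguration());
        client.start();

        // Steps 1-2: ask the RM for a new application; the response carries an
        // ApplicationId and the cluster's maximum resource capability.
        YarnClientApplication app = client.createApplication();
        Resource maxCapability = app.getNewApplicationResponse().getMaximumResourceCapability();

        // Step 3: fill in the Application Submission Context and the CLC for
        // the AM ("container 0"). The command, queue, and sizes are placeholders.
        ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
        ctx.setApplicationName("yarn-paper-sketch");
        ctx.setQueue("default");
        ctx.setResource(Resource.newInstance(1024, 1)); // resources for container 0
        ctx.setAMContainerSpec(ContainerLaunchContext.newInstance(
                Collections.emptyMap(), Collections.emptyMap(),
                Collections.singletonList("java -Xmx512m MyApplicationMaster"),
                null, null, null));

        // Step 4: submit; the RM runs admission control and, when a container
        // is available, launches the AM on some node.
        ApplicationId appId = client.submitApplication(ctx);
        System.out.println("Submitted as " + appId
                + " (cluster max capability: " + maxCapability + ")");
    }
}
```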

Once the job is running, the AM sends heartbeat/progress information to the RM. In these heartbeat messages the AM can request more containers and can also release containers. When the job is finished, the AM sends a finish message to the RM and exits.
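And a matching sketch of the finishing steps: releasing a container that is no longer needed and sending the finish message. The rm client and the container id are assumed to come from the heartbeat loop sketched earlier; the status message is a placeholder.

```java
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class AmFinishSketch {
    static void finish(AMRMClient<ContainerRequest> rm, ContainerId unneeded) throws Exception {
        // Giving back a container is just more state carried on the next heartbeat.
        rm.releaseAssignedContainer(unneeded);
        rm.allocate(1.0f);

        // The finish message: a final status plus an optional diagnostic and URL.
        rm.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "job complete", "");
        rm.stop();
    }
}
```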

Reference documents:
• Apache Hadoop YARN: Yet Another Resource Negotiator, http://www.cnblogs.com/zwCHAN/p/4240539.html
• Spark notes 4: Apache Hadoop YARN: Yet Another Resource Negotiator, https://www.zybuluo.com/xtccc/note/248181
• YARN Architecture, http://geek.csdn.net/news/detail/74234
• An inventory of ten major cluster scheduling systems
• Hadoop Technology Insider: In-depth Analysis of YARN Architecture Design and Implementation, Dong Xicheng
