I. Overview
Apache Hadoop YARN (Yet Another Resource Negotiator) is the new Hadoop resource manager. It is a general-purpose resource management system that provides unified resource management and scheduling for upper-layer applications. Its introduction brings major benefits in cluster utilization, unified resource management, and data sharing.
YARN was originally designed to address the obvious deficiencies of the MapReduce 1 implementation and to improve scalability (supporting clusters of 10,000 nodes and 200,000 cores), reliability, and cluster utilization. YARN splits the two main functions of the JobTracker, resource management and job scheduling/monitoring, into two independent services: a global ResourceManager (RM) and a per-application ApplicationMaster (AM), where an application is either a single MapReduce job in the traditional sense or a directed acyclic graph (DAG) of jobs.
In a sense, YARN can be regarded as a cloud operating system responsible for cluster resource management. Various applications can be developed on top of this operating system, such as batch processing with MapReduce, streaming jobs with Storm, and real-time services. These applications can all use the computing power and rich data storage models of the Hadoop cluster and share data within the same cluster. In addition, new frameworks can run on YARN's resource manager by providing their own ApplicationMaster implementations.
Composition of YARN
The core idea of YARN is to separate the functions of the JobTracker and the TaskTracker. It consists of the following components: A. a global resource manager, the ResourceManager; B. a per-node agent of the ResourceManager, the NodeManager; C. an ApplicationMaster for each application; D. multiple Containers owned by each ApplicationMaster and running on the NodeManagers.
Below is a brief introduction to them:
1. ResourceManager (RM)
The RM is a global resource manager responsible for managing and allocating resources for the entire system. It consists of two components: the scheduler and the applications manager (ASM). The scheduler allocates the system's resources to the running applications, subject to constraints such as capacity and queues (for example, each queue is allocated a certain amount of resources and executes at most a certain number of jobs). Note that the scheduler is a "pure scheduler": it no longer performs any work related to specific applications, such as monitoring or tracking application execution status, and it is not responsible for restarting tasks that fail because of application errors or hardware faults; those duties are handed to the application's own ApplicationMaster. The scheduler allocates resources purely according to each application's resource requirements, and the allocation unit is an abstraction called the "resource container" (Container). A Container is a dynamic resource allocation unit that bundles memory, CPU, disk, network, and other resources together to limit the amount of resources each task can use. In addition, the scheduler is a pluggable component: you can design a new scheduler as needed, and YARN ships with several ready-to-use schedulers, such as the Fair Scheduler and the Capacity Scheduler.
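As an illustration of the pluggable scheduler, the sketch below shows the configuration key that selects the scheduler implementation. In practice this is set in yarn-site.xml on the ResourceManager; it is set programmatically here purely for demonstration, using the class names shipped with Hadoop 2.x.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SchedulerConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new YarnConfiguration();

        // Select the Capacity Scheduler (the default in recent Hadoop releases).
        conf.set("yarn.resourcemanager.scheduler.class",
                 "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler");

        // Or select the Fair Scheduler instead:
        // conf.set("yarn.resourcemanager.scheduler.class",
        //          "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler");

        System.out.println(conf.get("yarn.resourcemanager.scheduler.class"));
    }
}
```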
The applications manager is responsible for managing all applications in the entire system. This includes application submission, negotiating resources with the scheduler to start the ApplicationMaster, monitoring the ApplicationMaster's running status, and restarting it on failure.
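A rough sketch of what application submission looks like from the client side, using the YarnClient API (the application name, command, and resource sizes are placeholders, and error handling is omitted):

```java
import org.apache.hadoop.yarn.api.records.*;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

import java.util.Collections;

public class SubmitToYarnSketch {
    public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        // Ask the ResourceManager (applications manager) for a new application id.
        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
        appContext.setApplicationName("demo-app");

        // Describe the container that will run the ApplicationMaster.
        ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
        amContainer.setCommands(Collections.singletonList("sleep 60")); // placeholder AM command
        appContext.setAMContainerSpec(amContainer);
        appContext.setResource(Resource.newInstance(512, 1)); // 512 MB, 1 vcore for the AM

        // Submission hands the application to the applications manager, which negotiates
        // the first container with the scheduler and launches the AM in it.
        ApplicationId appId = yarnClient.submitApplication(appContext);
        System.out.println("Submitted application " + appId);
    }
}
```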
2. ApplicationMaster (AM)
Every application submitted by a user contains an AM. Its main functions are: negotiating with the RM scheduler to obtain resources (represented as Containers); further assigning the obtained resources to its internal tasks (secondary resource allocation); communicating with the NM to start and stop tasks; and monitoring the running status of all tasks, reapplying for resources to restart a task when it fails. YARN currently ships with two AM implementations. One is DistributedShell, a sample program that demonstrates how to write an AM; it can apply for a number of Containers to run a shell command or shell script in parallel. The other is MRAppMaster, the AM that runs MapReduce applications.
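A minimal sketch of the AM side of this interaction, using the AMRMClient library. A real AM would also handle failures, track task state, and typically use the asynchronous client; the resource sizes and request count below are placeholders.

```java
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.*;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AmAllocationSketch {
    public static void main(String[] args) throws Exception {
        // Step 1: register this ApplicationMaster with the ResourceManager.
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(new YarnConfiguration());
        rmClient.start();
        rmClient.registerApplicationMaster("", 0, ""); // host/port/tracking URL omitted

        // Step 2: ask the scheduler for containers (1 GB memory, 1 vcore each).
        Priority priority = Priority.newInstance(0);
        Resource capability = Resource.newInstance(1024, 1);
        for (int i = 0; i < 3; i++) {
            rmClient.addContainerRequest(new ContainerRequest(capability, null, null, priority));
        }

        // Step 3: heartbeat the RM until the requested containers are allocated.
        int allocated = 0;
        while (allocated < 3) {
            AllocateResponse response = rmClient.allocate(0.1f);
            for (Container container : response.getAllocatedContainers()) {
                allocated++;
                // Secondary allocation: hand this container to one of the AM's own tasks,
                // then ask the NodeManager to launch it (see the NodeManager sketch below).
                System.out.println("Got container " + container.getId()
                        + " on " + container.getNodeId());
            }
            Thread.sleep(1000);
        }

        rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
    }
}
```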
Note: the RM only monitors the AM and restarts it when it fails. The RM is not responsible for fault tolerance of the tasks inside the AM; that is handled by the AM itself.
3. NodeManager (NM)
The NM is the resource and task manager on each node. On the one hand, it regularly reports the node's resource usage and the running status of each Container to the RM; on the other hand, it receives and processes Container start/stop requests from AMs.
4. Container
The Container is the resource abstraction in YARN. It encapsulates multi-dimensional resources on a node, such as memory, CPU, disk, and network. When an AM requests resources from the RM, the resources the RM returns to the AM are expressed as Containers. YARN assigns a Container to each task, and a task can only use the resources described in its Container.
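The start/stop requests that the NM handles are issued by the AM through the NMClient API; a small sketch (the command is a placeholder, and the Container is assumed to be one the RM allocated to this AM, as in the sketch above):

```java
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

import java.util.Collections;

public class NmLaunchSketch {
    // 'container' is assumed to be a container the RM allocated to this AM.
    static void launchTask(NMClient nmClient, Container container) throws Exception {
        ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
        // Placeholder command; a real AM would launch its task binary here.
        ctx.setCommands(Collections.singletonList("echo hello"));
        nmClient.startContainer(container, ctx);       // the NM launches the task
        // Later, the AM can stop the task through the same NM interface:
        // nmClient.stopContainer(container.getId(), container.getNodeId());
    }

    public static void main(String[] args) {
        NMClient nmClient = NMClient.createNMClient();
        nmClient.init(new YarnConfiguration());
        nmClient.start();
        // launchTask(nmClient, allocatedContainer); // container obtained via AMRMClient
    }
}
```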
Note: 1. A Container is different from a slot in MRv1: it is a dynamic resource division unit, generated on demand according to application requirements. 2. Currently, YARN only supports two resource types, CPU and memory, and uses the lightweight isolation mechanism cgroups for resource isolation (the relevant NodeManager settings are sketched below).
YARN's resource management and execution framework follows a master/slave design: the slave, the NodeManager (NM), runs on and monitors each node and reports resource availability to the cluster master, the ResourceManager (RM). The ResourceManager ultimately allocates resources to all applications in the system.
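The cgroups-based isolation mentioned in the note above is enabled through NodeManager settings. A hedged sketch of the relevant Hadoop 2.x property names follows, shown via the Configuration API purely for illustration; in practice these entries go in yarn-site.xml on each NodeManager.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class CgroupsConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new YarnConfiguration();
        // Use the Linux container executor, which supports cgroups-based isolation.
        conf.set("yarn.nodemanager.container-executor.class",
                 "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor");
        // Delegate CPU limits to cgroups (property names as in Hadoop 2.x).
        conf.set("yarn.nodemanager.linux-container-executor.resources-handler.class",
                 "org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler");
        conf.set("yarn.nodemanager.linux-container-executor.cgroups.hierarchy", "/hadoop-yarn");
        System.out.println(conf.get("yarn.nodemanager.container-executor.class"));
    }
}
```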
The execution of a specific application is controlled by its ApplicationMaster. The ApplicationMaster is responsible for dividing an application into multiple tasks and negotiating the required resources with the ResourceManager; once resources are allocated, the ApplicationMaster works with the NodeManagers to schedule, execute, and monitor the individual application tasks.
It should be noted that YARN's service components communicate with each other through an event-driven, asynchronous concurrency mechanism, which simplifies the system design.
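To illustrate the idea, here is a toy sketch of the pattern (not YARN's own dispatcher): components register handlers for event types and post events to a queue, so a sender never blocks waiting for the receiver to finish.

```java
import java.util.Map;
import java.util.concurrent.*;
import java.util.function.Consumer;

public class ToyDispatcher {
    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
    private final Map<Class<?>, Consumer<Object>> handlers = new ConcurrentHashMap<>();

    // A component registers a handler for one event type.
    public <T> void register(Class<T> eventType, Consumer<T> handler) {
        handlers.put(eventType, e -> handler.accept(eventType.cast(e)));
    }

    // Senders enqueue events and return immediately (asynchronous delivery).
    public void dispatch(Object event) {
        queue.add(event);
    }

    // A single background thread drains the queue and invokes handlers.
    public void start() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Object event = queue.take();
                    Consumer<Object> handler = handlers.get(event.getClass());
                    if (handler != null) handler.accept(event);
                }
            } catch (InterruptedException ignored) { }
        });
        t.setDaemon(true);
        t.start();
    }

    static class ContainerAllocatedEvent {
        final String containerId;
        ContainerAllocatedEvent(String containerId) { this.containerId = containerId; }
    }

    public static void main(String[] args) throws Exception {
        ToyDispatcher dispatcher = new ToyDispatcher();
        dispatcher.register(ContainerAllocatedEvent.class,
                e -> System.out.println("handling " + e.containerId));
        dispatcher.start();
        dispatcher.dispatch(new ContainerAllocatedEvent("container_001"));
        Thread.sleep(200); // give the handler thread time to run
    }
}
```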
Analysis of the YARN Architecture
Centralized architecture
A defining feature of a centralized (monolithic) scheduler is that resource scheduling and application management are all completed in a single process. A typical open source example is MRv1's JobTracker. The disadvantages of this design are obvious: scalability is poor. First, cluster size is limited; second, it is difficult to integrate new scheduling policies into the existing code. For example, a central scheduler that previously only supported MapReduce jobs would find it very difficult to embed the scheduling policies of streaming jobs when it now needs to support them.
Two-level scheduling architecture
To address the shortcomings of centralized schedulers, a two-level scheduler is an obvious solution. It can be seen as a divide-and-conquer mechanism, or a decentralization of policy: the two-level scheduler still retains a simplified centralized resource scheduler, but the scheduling policies related to specific jobs are carried out by each application's own scheduler. Typical examples of this design are Mesos and YARN. The Mesos scheduler consists of two parts: the resource scheduler and the framework (application) schedulers. The resource scheduler allocates cluster resources to the frameworks (applications), and each framework (application) scheduler is responsible for further allocating those resources to its internal tasks; users can easily plug a framework or system into Mesos. A two-level scheduler has the following characteristics: a framework scheduler does not know the resource usage of the whole cluster and only passively accepts resources. The resource scheduler merely pushes available resources to each framework, and the framework chooses whether to use or reject them; once a framework accepts new resources, it further allocates them to its internal tasks, thus implementing two-level scheduling. However, this kind of scheduler also has shortcomings, mainly in two aspects: 1. each framework cannot see the real-time resource usage of the entire cluster; 2. it uses pessimistic locking, so the concurrency granularity is small.
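A toy sketch of the offer-based interaction described above (an illustration of the model only, not Mesos's real API): the first level pushes resource offers to frameworks one at a time, and each framework's own scheduler decides whether to accept an offer and which of its tasks to run on it.

```java
import java.util.*;

public class TwoLevelSchedulingSketch {

    static class Offer {
        final String node; final int memMb; final int cores;
        Offer(String node, int memMb, int cores) { this.node = node; this.memMb = memMb; this.cores = cores; }
    }

    interface Framework {
        // Returns the tasks (as simple labels) the framework decides to run, or empty to decline.
        List<String> onOffer(Offer offer);
    }

    // First level: the central resource scheduler only pushes available resources around.
    static void offerResources(List<Offer> free, List<Framework> frameworks) {
        for (Offer offer : free) {
            for (Framework fw : frameworks) {
                List<String> tasks = fw.onOffer(offer);   // pessimistic: one framework at a time
                if (!tasks.isEmpty()) {
                    System.out.println(tasks + " launched on " + offer.node);
                    break; // offer consumed; stop offering it to other frameworks
                }
            }
        }
    }

    public static void main(String[] args) {
        // Second level: each framework's own scheduler decides what to do with an offer.
        Framework batchJob = offer ->
                offer.memMb >= 1024 ? Arrays.asList("map-task") : Collections.<String>emptyList();
        Framework streamingJob = offer ->
                offer.cores >= 4 ? Arrays.asList("bolt") : Collections.<String>emptyList();

        offerResources(
                Arrays.asList(new Offer("node1", 2048, 2), new Offer("node2", 512, 8)),
                Arrays.asList(batchJob, streamingJob));
    }
}
```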
For a detailed introduction, refer to this paper: http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf
Summary
With the continuous development and improvement of YARN, various types of applications, including short jobs similar to MapReduce and long-running jobs similar to web services, can be deployed and run directly on YARN. However, the interfaces YARN currently provides are low-level interfaces, which makes it very difficult for users to write and debug applications: for example, application logs scattered across the nodes cannot be aggregated, application lifecycle management is difficult, and third-party tools for running existing systems on YARN are lacking. Only when these problems are well solved can YARN become mature.