Part (I) of this overview gave a brief introduction to Yarn. Today I will spend some time on a few specific modules to present Yarn's overall picture and help you understand it better.
1) ResourceManager
Yarn's overall architecture is also master/slave: the ResourceManager (RM) is the master and the NodeManagers (NM) are its slaves. The RM plays a very important role in Yarn, because it is responsible for the unified management and allocation of all the resources in the cluster. It receives the resource reports sent by each NM and allocates those resources to each application according to certain policies. The main internal structure of the ResourceManager is as follows:
(1). User Interaction Module
(2). NM Management Module
(3). AM Management Module
(4). Application Management Module
(5). Security Management Module
(6). Resource Allocation Module
(7). State Machine Management Module. The ResourceManager uses finite state machines to maintain the life cycle of stateful objects, such as the RMApp application state machine and the Rmconta container state machine.
1. ResourceManager event handling
The event-driven mechanism is widely used in Yarn. A single central event dispatcher forwards incoming events to the handlers responsible for them, which can greatly improve efficiency.
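The idea of one central dispatcher can be sketched in a few lines of Python. This is only a toy model of the pattern, not Yarn's code: the `Dispatcher` class, the event type names `NODE_ADDED` and `APP_SUBMITTED`, and the handlers are all made up for illustration, and the sketch is single-threaded while a real dispatcher would typically deliver events asynchronously.

```python
from collections import defaultdict, deque


class Dispatcher:
    """Toy central dispatcher: one queue, handlers registered per event type."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._queue = deque()

    def register(self, event_type, handler):
        # Several handlers may subscribe to the same event type.
        self._handlers[event_type].append(handler)

    def dispatch(self, event_type, payload):
        # Producers only enqueue; they never call handlers directly.
        self._queue.append((event_type, payload))

    def drain(self):
        # Deliver queued events in arrival order and count them.
        handled = 0
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._handlers[event_type]:
                handler(payload)
            handled += 1
        return handled


# Hypothetical modules subscribing to hypothetical event types.
log = []
dispatcher = Dispatcher()
dispatcher.register("NODE_ADDED", lambda node: log.append(("scheduler", node)))
dispatcher.register("APP_SUBMITTED", lambda app: log.append(("app_manager", app)))

dispatcher.dispatch("NODE_ADDED", "nm-1")
dispatcher.dispatch("APP_SUBMITTED", "app-1")
handled = dispatcher.drain()
```

Because producers only talk to the dispatcher, modules stay decoupled from each other: adding a new module means registering one more handler, not changing the code that emits events.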
2) Resource scheduling model
In MRv1, resource scheduling defaults to a simple FIFO approach. As workloads became more diverse this approach proved inadequate, and multi-user resource schedulers appeared. There are 2 main design ideas.
(1). Virtualize multiple Hadoop clusters inside the same physical cluster, each with a full set of Hadoop services. The typical representative is the HOD (Hadoop on Demand) scheduler.
(2). Implement the resource scheduler as multi-user queues. Each queue has its own assigned resources and tasks, and each queue serves a similar group of users, but all queues share the resources of a single Hadoop cluster.
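The second idea, multi-user queues over one shared cluster, can be illustrated with a minimal sketch. It assumes memory is the only resource, gives each queue a fixed share, and runs tasks FIFO inside a queue; the `QueueScheduler` class, the queue names, and the numbers are hypothetical and do not correspond to any real Hadoop scheduler's API.

```python
from collections import deque


class QueueScheduler:
    """Toy multi-queue scheduler: each queue owns a fixed share of the cluster."""

    def __init__(self, total_memory_mb, shares):
        # shares maps queue name -> fraction of the cluster reserved for it.
        self.capacity = {q: int(total_memory_mb * f) for q, f in shares.items()}
        self.used = {q: 0 for q in shares}
        self.pending = {q: deque() for q in shares}

    def submit(self, queue, task, memory_mb):
        self.pending[queue].append((task, memory_mb))

    def schedule(self):
        # Start tasks in FIFO order within each queue until its share is used up.
        started = []
        for queue, tasks in self.pending.items():
            while tasks and self.used[queue] + tasks[0][1] <= self.capacity[queue]:
                task, memory_mb = tasks.popleft()
                self.used[queue] += memory_mb
                started.append((queue, task))
        return started


# One 8 GB cluster shared by two hypothetical user groups.
scheduler = QueueScheduler(8192, {"analytics": 0.75, "adhoc": 0.25})
scheduler.submit("analytics", "etl-1", 4096)
scheduler.submit("analytics", "etl-2", 4096)
scheduler.submit("adhoc", "query-1", 1024)
started = scheduler.schedule()
```

Here `etl-2` stays pending because it would push the `analytics` queue past its share, while `query-1` still starts: one group's backlog does not starve the other, which is the point of the queue-based design.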
3) NodeManager
The NM is the agent on each individual node in Yarn. It has the following functions:
1. Communicating with the ResourceManager
2. Managing the life cycle of containers
3. Monitoring the resource usage of containers
4. Managing node health status
5. Managing log services
Like the RM, the NM is event-driven throughout. There are many event types; once an event has been processed by its event handler, it changes the state machine of the object concerned and so moves that object through its life cycle.
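How an event moves an object through its life cycle can be shown with a small table-driven state machine. This is a sketch under assumptions: the `StateMachine` class and every state and event name below are invented for illustration and are not the states Yarn actually defines.

```python
class StateMachine:
    """Toy finite state machine driven by a transition table."""

    def __init__(self, initial, transitions):
        # transitions maps (current state, event type) -> next state.
        self.state = initial
        self._transitions = transitions

    def handle(self, event_type):
        key = (self.state, event_type)
        if key not in self._transitions:
            raise ValueError(f"invalid event {event_type!r} in state {self.state!r}")
        self.state = self._transitions[key]
        return self.state


# Hypothetical container life cycle; names are illustrative only.
CONTAINER_TRANSITIONS = {
    ("NEW", "INIT"): "LOCALIZING",
    ("LOCALIZING", "RESOURCES_READY"): "RUNNING",
    ("RUNNING", "EXITED_OK"): "DONE",
    ("RUNNING", "KILL"): "KILLED",
}

container = StateMachine("NEW", CONTAINER_TRANSITIONS)
history = [container.handle(e) for e in ("INIT", "RESOURCES_READY", "EXITED_OK")]
```

Keeping the transitions in one table makes the legal life cycle explicit: an event that arrives in the wrong state is rejected instead of silently corrupting the object.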
4) Running a variety of computational frameworks on Yarn
Yarn takes the place of MRv1 as a resource framework. To run a computational framework on it, there are 2 things you have to write so that the framework is compatible: the client and the ApplicationMaster.
1. MRv1 is implemented mainly through the JobTracker and TaskTrackers, so its deployment on Yarn can look like the figure below.
2. Storm: how does this real-time processing framework run on Yarn? Storm is very similar to MRv1. It consists of Nimbus and Supervisors, with ZooKeeper added in between as the coordination service. The simulated diagram is as follows.
Of course, the above are only design ideas. The final design is achieved by implementing your own client and a custom ApplicationMaster.
5) Yarn's future
In recent years a number of general-purpose computational frameworks similar to Yarn have emerged, such as Apache Mesos, which can also support MapReduce and Storm, although some of its features still differ from Yarn's. But the framework is not yet particularly stable at present.
As a general-purpose computational framework, Yarn also has some drawbacks. Because the computational frameworks running on Yarn are isolated from each other in terms of resources, a framework is not aware of how resources are being used across the cluster as a whole, which is bad for overall resource scheduling. When its nodes are very busy, a framework does not know whether it should wait for its current tasks to finish before running the following ones, or request new resources from the system. If the system is also very busy at that moment, making a request is clearly unsuitable and only leads to a longer wait; if system resources are largely idle, making a request is the correct strategy.
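The trade-off described above can be written down as a small decision function. It is purely illustrative: the function name, the returned labels, and the 0.8 threshold are assumptions, and the `cluster_utilization` argument is exactly the cluster-wide information that, according to the text, a framework on Yarn does not have.

```python
def next_action(node_busy, cluster_utilization, busy_threshold=0.8):
    """Decide what a framework should do when it has more tasks to run.

    cluster_utilization is the fraction of cluster resources in use (0.0 to 1.0).
    busy_threshold is a hypothetical cut-off chosen for this sketch.
    """
    if not node_busy:
        # The framework's own nodes have room, so no decision is needed.
        return "run_on_current_nodes"
    if cluster_utilization >= busy_threshold:
        # A request would only queue up behind everyone else's.
        return "wait_for_running_tasks"
    # The cluster has spare capacity, so asking for more is worthwhile.
    return "request_new_resources"


busy_cluster = next_action(node_busy=True, cluster_utilization=0.95)
idle_cluster = next_action(node_busy=True, cluster_utilization=0.20)
```

Without a value for `cluster_utilization`, the framework has to guess between the last two branches, which is the drawback the paragraph points out.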
Yarn Architecture Basic Overview (II)