Analysis of YARN ResourceManager Scheduler

Last Update:2018-08-12 Source: Internet

Author: User


YARN is the resource management framework in the new version of Hadoop. This article analyzes the ResourceManager scheduler, discusses the design emphasis of the three schedulers, and closes with some configuration suggestions and parameter explanations.

This article is based on CDH4.2.1. The scheduler is still changing rapidly; for example, features such as CPU resource allocation will be added in the future.

To make the source code easy to look up, code locations are given in the form [ClassName:line number].

Terminology:

ResourceManager: hereinafter RM. The control module of YARN, responsible for the unified planning of resource usage.

NodeManager: hereinafter NM. The resource node module of YARN, responsible for starting and managing containers.

ApplicationMaster: hereinafter AM. Each application in YARN starts an AM, which requests resources from the RM, asks NMs to start containers, and tells the containers what to do.

Container: a resource container. All applications in YARN run inside containers. The AM also runs in a container, but the AM's container is requested by the RM.

1. Scheduler in RM

ResourceManager is the central module of the YARN resource control framework, responsible for the unified management and allocation of all resources in the cluster. It receives reports from NMs, launches AMs, and hands resources to AMs. For the overall RM framework, see: RM Overall architecture

The earliest Hadoop versions had only FifoScheduler (the first-in, first-out scheduler). Once Hadoop clusters were used at large scale, how to consolidate and allocate resources became an urgent problem. In response, Yahoo! and Facebook developed CapacityScheduler (the capacity scheduler) and FairScheduler (the fair scheduler). In the new version, both schedulers were rewritten while keeping their core algorithms. (In fact, all of the YARN code was rewritten...)

2. Scheduler interface

First, look at how the scheduler works and the interfaces it exposes: [ResourceScheduler:36]

A complete scheduler maintains the relationships among queues, applications, NMs, and containers in memory. A scheduler is also an event handler: it learns what is happening outside through the RM's asynchronous event mechanism, and it sends events of its own when it needs to interact with the outside. The scheduler handles six scheduling events in total.

Event	When sent	Processing logic
NODE_ADDED	An NM is added	Increase the total resource pool size and update the in-memory state.
NODE_REMOVED	An NM is removed	Remove the NM, reduce the total resource pool size, reclaim the in-memory state of the containers on this NM, and send a kill event for each of those containers.
NODE_UPDATE	An NM heartbeats to the RM	Based on the NM's current state, the scheduler assigns a container on this NM to an AM and records the container information for the AM to fetch later. This is where the scheduler actually allocates containers; it is described in detail later.
APP_ADDED	A new application is submitted	If the application is accepted, send the APP_ACCEPTED event; otherwise send the APP_REJECTED event.
APP_REMOVED	An application is removed	The application may have finished normally or been killed. Clear all of this application's containers from memory and send a kill event for each container.
CONTAINER_EXPIRED	A container expired without being used	Update the in-memory state associated with this container.
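
The table above can be read as a single dispatch over the event type. The following is only a minimal sketch of that idea, not the CDH4.2.1 code: the class `MiniScheduler`, its fields, the memory-only resource model, and the `killEvents` list (a stand-in for outgoing kill events) are all assumptions made for illustration. Only the six event names come from the table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified scheduler state; not the real YARN classes. */
class MiniScheduler {
    enum EventType {
        NODE_ADDED, NODE_REMOVED, NODE_UPDATE,
        APP_ADDED, APP_REMOVED, CONTAINER_EXPIRED
    }

    int clusterMemoryMb = 0;                                    // total resource pool
    final Map<String, Integer> nodeMemoryMb = new HashMap<>();  // node id -> capacity
    final Set<String> apps = new HashSet<>();                   // known applications
    // container id -> {node id, application id}
    final Map<String, String[]> containers = new HashMap<>();
    final List<String> killEvents = new ArrayList<>();          // stand-in for kill events sent

    /** Dispatches one scheduling event; memoryMb is only used by NODE_ADDED. */
    void handle(EventType type, String id, int memoryMb) {
        switch (type) {
            case NODE_ADDED:
                nodeMemoryMb.put(id, memoryMb);
                clusterMemoryMb += memoryMb;
                break;
            case NODE_REMOVED:
                if (nodeMemoryMb.containsKey(id)) {
                    clusterMemoryMb -= nodeMemoryMb.remove(id);
                }
                killContainers(0, id);   // every container on this node
                break;
            case NODE_UPDATE:
                // The real scheduler allocates containers here; omitted in this sketch.
                break;
            case APP_ADDED:
                apps.add(id);            // the real scheduler then reports accepted or rejected
                break;
            case APP_REMOVED:
                apps.remove(id);
                killContainers(1, id);   // every container of this application
                break;
            case CONTAINER_EXPIRED:
                containers.remove(id);   // only the in-memory state changes
                break;
        }
    }

    /** Forgets every container whose node (field 0) or app (field 1) matches, recording a kill for each. */
    private void killContainers(int field, String value) {
        Iterator<Map.Entry<String, String[]>> it = containers.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String[]> e = it.next();
            if (e.getValue()[field].equals(value)) {
                killEvents.add(e.getKey());
                it.remove();
            }
        }
    }
}
```

Note that only NODE_ADDED and NODE_REMOVED change the size of the pool; the other events only touch bookkeeping.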

In addition to these six events, one function is called on every heartbeat between the AM and the RM. [YarnScheduler:105]

Allocation allocate(ApplicationAttemptId appAttemptId, List<ResourceRequest> ask, List<ContainerId> release);

The AM tells the scheduler its resource requests and the list of containers it has finished using, and gets back the containers that were already assigned to the application during NODE_UPDATE.

As you can see, accepting resource requests and allocating resources are separate steps in the scheduler: the process is asynchronous.
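
To make the asynchrony concrete, here is a small sketch under stated assumptions. The class `AsyncAllocationSketch`, its string-based ids, and the notion of "free slots" per node are hypothetical simplifications, and the release list is left out. The point is only that the AM heartbeat records requests and collects earlier results, while the actual assignment happens on the NM heartbeat.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the asynchronous ask/collect cycle; not the real YARN API. */
class AsyncAllocationSketch {
    // application id -> number of containers still wanted
    private final Map<String, Integer> pendingAsks = new LinkedHashMap<>();
    // application id -> containers allocated but not yet fetched by the AM
    private final Map<String, List<String>> newlyAllocated = new HashMap<>();
    private int nextContainerId = 1;

    /** AM heartbeat: record new asks and hand back what earlier NM heartbeats allocated. */
    List<String> allocate(String appId, int askedContainers) {
        pendingAsks.merge(appId, askedContainers, Integer::sum);
        List<String> ready = newlyAllocated.remove(appId);
        return ready == null ? new ArrayList<>() : ready;
    }

    /** NM heartbeat (NODE_UPDATE): assign up to freeSlots containers on this node to waiting apps. */
    void nodeUpdate(String nodeId, int freeSlots) {
        for (Map.Entry<String, Integer> e : pendingAsks.entrySet()) {
            while (freeSlots > 0 && e.getValue() > 0) {
                String container = "container_" + (nextContainerId++) + "@" + nodeId;
                newlyAllocated.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(container);
                e.setValue(e.getValue() - 1);
                freeSlots--;
            }
        }
    }
}
```

In this sketch, an AM that asks for two containers gets an empty answer on that first call; the containers only show up in the result of a later `allocate` call, after some NM has reported in.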

3. Resource allocation model

FifoScheduler, CapacityScheduler, and FairScheduler all share the same core resource allocation model.

The scheduler maintains information about a set of queues. Users can submit applications to one or more queues. On each NM heartbeat, the scheduler selects a queue according to certain rules, then selects an application in that queue and tries to allocate resources to it. If the allocation fails because of some parameter limit, the scheduler moves on to the next application. A selected application may itself have many resource requests. The scheduler satisfies node-local requests first, then requests on the same rack, and finally requests for any machine.
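
The loop described above might be sketched as follows. This is an illustration under assumptions, not the scheduler's actual code: the class and method names, the `"*"` marker for "any machine", and the returned `appId:level` string are all hypothetical, and each request level is counted independently here to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the shared allocation loop; not the real scheduler code. */
class AllocationLoopSketch {
    static final String ANY = "*";   // assumed marker for "any machine"

    /** One application: outstanding container counts keyed by host name, rack name, or ANY. */
    static class App {
        final String id;
        final Map<String, Integer> asks = new LinkedHashMap<>();
        App(String id) { this.id = id; }
        App ask(String location, int count) { asks.put(location, count); return this; }
    }

    /**
     * Called for one NM heartbeat. Queues and applications are visited in the order given.
     * Returns "appId:level" for the single container assigned, or null if nothing fits.
     */
    static String assignOne(List<List<App>> queues, String host, String rack) {
        // Locality preference: same node, then same rack, then anywhere.
        String[] locations = {host, rack, ANY};
        String[] levels = {"NODE_LOCAL", "RACK_LOCAL", "OFF_SWITCH"};
        for (List<App> queue : queues) {
            for (App app : queue) {
                for (int i = 0; i < locations.length; i++) {
                    int wanted = app.asks.getOrDefault(locations[i], 0);
                    if (wanted > 0) {
                        app.asks.put(locations[i], wanted - 1);
                        return app.id + ":" + levels[i];
                    }
                }
                // This application cannot use the node; continue with the next one.
            }
        }
        return null;
    }
}
```

An application that only wants a container on `host9` is skipped when `host1` heartbeats, and the next application in the queue is tried instead.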

In short, the three schedulers differ in how they select a queue and how they select an application within a queue.

Of course, compared with the simple FifoScheduler, CapacityScheduler and FairScheduler have more interesting and exciting features.
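
One way to picture this is to reduce the selection policy to a sort order over the candidates. The sketch below is a deliberately simplified assumption and does not reproduce any scheduler's real algorithm: `SelectionPolicySketch`, its `App` fields, and the "least used first" rule are hypothetical stand-ins chosen only to show that the surrounding loop can stay the same while the ordering changes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration: the selection policy reduced to a sort order. */
class SelectionPolicySketch {
    static class App {
        final String id;
        final long submitTime;    // when the application was submitted
        final int usedMemoryMb;   // resources it currently holds
        App(String id, long submitTime, int usedMemoryMb) {
            this.id = id;
            this.submitTime = submitTime;
            this.usedMemoryMb = usedMemoryMb;
        }
    }

    /** FIFO-style policy: earliest submission first. */
    static final Comparator<App> FIFO = Comparator.comparingLong(a -> a.submitTime);

    /** Fairness-style policy (simplified assumption): smallest current usage first. */
    static final Comparator<App> LEAST_USED_FIRST = Comparator.comparingInt(a -> a.usedMemoryMb);

    /** Returns the application ids in the order the given policy would try them. */
    static List<String> order(List<App> apps, Comparator<App> policy) {
        List<App> sorted = new ArrayList<>(apps);
        sorted.sort(policy);
        List<String> ids = new ArrayList<>();
        for (App a : sorted) {
            ids.add(a.id);
        }
        return ids;
    }
}
```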

4. Scheduler comparison

Let's compare the three schedulers.

Scheduler	FifoScheduler	CapacityScheduler	FairScheduler
Design purpose	The simplest scheduler, easy to understand and get started with	Multi-user scenarios; maximizes cluster throughput and utilization

This article is an English version of an article originally published in Chinese on aliyun.com.
