YARN Schedulers Explained

Source: Internet
Author: User

Ideally, every request an application makes for YARN resources would be satisfied immediately. In practice, resources are limited, and on a busy cluster an application often has to wait before its requests are fulfilled. In YARN, the scheduler is the component responsible for allocating resources to applications. Scheduling is a hard problem in its own right, and there is no single policy that works well for every scenario. For this reason, YARN provides several schedulers and configurable policies to choose from.

1. Choosing a scheduler

YARN offers three schedulers to choose from: the FIFO Scheduler, the Capacity Scheduler, and the Fair Scheduler.

The FIFO Scheduler places applications in a queue in the order they are submitted (a first-in, first-out queue). When allocating resources, it first satisfies the application at the head of the queue, then moves on to the next one, and so on.

The FIFO Scheduler is the simplest scheduler and the easiest to understand, and it requires no configuration, but it is not suitable for shared clusters. A large application can consume all of the cluster's resources and block every other application. On a shared cluster it is better to use the Capacity Scheduler or the Fair Scheduler, both of which allow large and small jobs to obtain a certain amount of system resources when they are submitted.

The "YARN scheduler comparison diagram" below shows the differences between these schedulers; as the diagram illustrates, small jobs can be blocked behind large jobs under the FIFO Scheduler.

With the Capacity Scheduler, a dedicated queue can be reserved for running small jobs. However, setting aside a queue for small jobs reserves part of the cluster's resources in advance, which means large jobs finish later than they would under the FIFO Scheduler.

With the Fair Scheduler, we do not need to reserve any resources in advance; the scheduler dynamically balances resources among all running jobs. As the illustration shows, when the first (large) job is submitted it is the only job running, so it receives all of the cluster's resources. When a second (small) job is submitted, the Fair Scheduler gives half of the resources to the small job, so the two jobs share the cluster fairly.

Note that in the Fair Scheduler diagram there is a delay between the second job being submitted and it receiving resources, because it must wait for the first job to release some of the containers it occupies. Once the small job finishes, it releases its resources, and the large job again receives all of the cluster's resources. The net effect is that the Fair Scheduler achieves high cluster utilization while still ensuring that small jobs complete in a timely manner.

YARN scheduler comparison diagram:

2. Configuring the Capacity Scheduler

2.1 Introduction to capacity scheduling

The Capacity Scheduler allows multiple organizations to share a cluster, with each organization receiving a portion of the cluster's computing capacity. Each organization is assigned a dedicated queue with a share of the cluster's resources, so by setting up multiple queues the cluster can serve multiple organizations. Queues can be subdivided further, so that members within an organization share the queue's resources. Within a single queue, resources are scheduled using a first-in, first-out (FIFO) policy.

As the diagram above showed, a single job may not use all of its queue's resources. However, if multiple jobs run in a queue and the queue's own resources become insufficient, the Capacity Scheduler may still allocate additional resources to the queue beyond its configured capacity. This is the concept of the "elastic queue" (queue elasticity).

In normal operation the Capacity Scheduler does not forcibly release containers, so when a queue runs short of resources it can only obtain containers as other queues release them. We can, however, set a maximum resource usage for a queue so that it does not occupy too many idle resources and leave other queues unable to use them; this is the trade-off that elastic queues require.

2.2 Capacity Scheduler configuration

Let's say we have queues at the following levels:

Root
├──prod
└──dev
    ├──eng
    └──science

The following is a simple configuration file for the Capacity Scheduler, named capacity-scheduler.xml. It defines two child queues under the root queue, prod and dev, with 40% and 60% of the capacity respectively. Note that a queue is configured through properties of the form yarn.scheduler.capacity.<queue-path>.<sub-property>, where <queue-path> is the queue's path in the hierarchy (such as root.prod) and <sub-property> is usually capacity or maximum-capacity.
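The configuration file itself did not survive in this copy of the article. A reconstruction consistent with the description (prod at 40%, dev at 60% and capped at 75%, eng and science each taking 50% of dev) would look roughly like this:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- child queues of root -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>prod,dev</value>
  </property>
  <!-- child queues of dev -->
  <property>
    <name>yarn.scheduler.capacity.root.dev.queues</name>
    <value>eng,science</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prod.capacity</name>
    <value>40</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.capacity</name>
    <value>60</value>
  </property>
  <!-- cap dev so prod always has at least 25% available -->
  <property>
    <name>yarn.scheduler.capacity.root.dev.maximum-capacity</name>
    <value>75</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.eng.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.science.capacity</name>
    <value>50</value>
  </property>
</configuration>
```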

As you can see, the dev queue is further divided into two queues of equal capacity, eng and science. dev's maximum-capacity attribute is set to 75%, so even if the prod queue is completely idle, dev cannot consume all of the cluster's resources; prod always has at least 25% of the cluster available on demand. Note that eng and science do not set maximum-capacity, so jobs in either queue may use all of the dev queue's resources (up to 75% of the cluster). Similarly, because prod does not set maximum-capacity, it may occupy all of the cluster's resources.

Besides configuring queues and their capacities, the Capacity Scheduler can also limit the maximum resources a single user or application may allocate, how many applications may run concurrently, and the queue's ACLs.

2.3 Specifying a queue

Which queue to use depends on the application. In MapReduce, for example, we can specify the queue with the mapreduce.job.queuename property. If the queue does not exist, we receive an error at job submission time. If we do not define any queues, all applications are placed in a default queue.
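As a sketch, the property can be set in the job's configuration like any other Hadoop property (it can also be passed on the command line with -D mapreduce.job.queuename=prod); the queue name prod here assumes the configuration from section 2.2:

```xml
<!-- in the MapReduce job configuration -->
<property>
  <name>mapreduce.job.queuename</name>
  <value>prod</value>
</property>
```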

Note: for the Capacity Scheduler, the queue name must be the last component of the queue path; full queue paths are not recognized. In the configuration above, for example, prod and eng are valid queue names, but root.dev.eng and dev.eng are not.

3. Configuring the Fair Scheduler

3.1 Fair scheduling

The goal of the Fair Scheduler is to give all applications a fair share of resources (the definition of "fair" can be tuned with parameters). The "YARN scheduler comparison diagram" above showed fair scheduling between two applications in the same queue; fair scheduling also works across queues. For example, suppose users A and B each have their own queue. When A starts a job and B has none, A receives the entire cluster. When B then starts a job, A's job keeps running, but after a while each of the two jobs receives half of the cluster's resources. If B then starts a second job while the first is still running, it shares B's queue resources with B's first job, so B's two jobs each get a quarter of the cluster while A's job still holds half. The end result is that resources are shared fairly between the two users. The process is shown in the following illustration:
3.2 Enabling the Fair Scheduler

The scheduler is selected with the yarn.resourcemanager.scheduler.class parameter in the yarn-site.xml configuration file; the Capacity Scheduler is the default. To use the Fair Scheduler, set this parameter to the fully qualified class name of FairScheduler: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.

3.3 Queue configuration
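In yarn-site.xml this looks like the following fragment:

```xml
<!-- yarn-site.xml: switch the ResourceManager to the Fair Scheduler -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
```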

The Fair Scheduler's configuration file is fair-scheduler.xml on the classpath; its location can be changed with the yarn.scheduler.fair.allocation.file property. Without this configuration file, the Fair Scheduler uses the allocation strategy described in section 3.1: it creates a queue for each user when that user's first application is submitted, names the queue after the user, and assigns the user's applications to that queue.

In the configuration file we can define each queue, and, as with the Capacity Scheduler, queues can be nested hierarchically. For example, a fair-scheduler.xml corresponding to the earlier capacity-scheduler.xml:
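The original file is missing from this copy of the article; a reconstruction consistent with the weights, scheduling policies, and placement rules discussed in the rest of this section would look like this:

```xml
<?xml version="1.0"?>
<allocations>
  <!-- default policy for queues that do not override it -->
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>

  <queue name="prod">
    <weight>40</weight>
    <!-- jobs inside prod run in FIFO order -->
    <schedulingPolicy>fifo</schedulingPolicy>
  </queue>

  <queue name="dev">
    <weight>60</weight>
    <!-- eng and science split dev's share evenly -->
    <queue name="eng"/>
    <queue name="science"/>
  </queue>

  <queuePlacementPolicy>
    <rule name="specified" create="false"/>
    <rule name="primaryGroup" create="false"/>
    <rule name="default" queue="dev.eng"/>
  </queuePlacementPolicy>
</allocations>
```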

The queue hierarchy is expressed with nested <queue> elements. All queues are children of the root queue, even when they are not explicitly nested under a root <queue> element. In this configuration we divide the dev queue into two queues, eng and science.

Queues in the Fair Scheduler have a weight attribute (this is the definition of fairness), which the scheduler uses as the basis for fair allocation. In this example, allocating cluster resources to prod and dev in a 40:60 ratio is considered fair. The eng and science queues define no weight, so they split their parent's resources evenly. Note that the weights are not percentages; replacing 40 and 60 with 2 and 3 would have exactly the same effect. Also note that the queues created automatically per user when there is no configuration file still have a weight, with a value of 1.

Each queue can still use a different scheduling policy internally. The default policy for queues is set with the top-level <defaultQueueSchedulingPolicy> element; if it is not set, fair scheduling is the default.

Despite its name, the Fair Scheduler also supports FIFO scheduling at the queue level. A queue's policy can be overridden with its own <schedulingPolicy> element; in this example the prod queue uses FIFO scheduling, so jobs submitted to prod execute in FIFO order. Note that scheduling between prod and dev is still fair, as is scheduling between eng and science.

Although not shown in the configuration above, each queue can also be given minimum and maximum resource limits and a maximum number of runnable applications.

3.4 Queue placement

The Fair Scheduler uses a rule-based system to determine which queue an application is placed in. In the example above, the <queuePlacementPolicy> element defines a list of rules that are tried in order until one matches. For example, the first rule, specified, places the application in the queue it names; if the application names no queue, or the named queue does not exist, the rule does not match and the next rule is tried. The primaryGroup rule tries to place the application in a queue named after the user's primary Unix group; if no such queue exists, it is not created and the next rule is tried instead. If none of the preceding rules match, the default rule applies and the application is placed in the dev.eng queue.

Of course, we can also omit the queuePlacementPolicy element entirely, in which case the scheduler uses the following default rules:

<queuePlacementPolicy>
  <rule name="specified"/>
  <rule name="user"/>
</queuePlacementPolicy>

These rules can be summed up in one sentence: unless a queue is explicitly specified, applications are placed in a queue named after the user, creating it if necessary.

There is also a simpler placement policy that puts every application into the same (default) queue, so that applications share the cluster equally rather than having it divided per user. It is defined as follows:

<queuePlacementPolicy>
  <rule name="default"/>
</queuePlacementPolicy>

The same effect can be achieved without a configuration file by setting yarn.scheduler.fair.user-as-default-queue=false, which places applications in the default queue instead of per-user queues. In addition, setting yarn.scheduler.fair.allow-undeclared-pools=false prevents users from creating queues on the fly.

3.5 Preemption
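Both properties go in yarn-site.xml; a minimal sketch:

```xml
<!-- yarn-site.xml: disable per-user default queues and ad-hoc queue creation -->
<property>
  <name>yarn.scheduler.fair.user-as-default-queue</name>
  <value>false</value>
</property>
<property>
  <name>yarn.scheduler.fair.allow-undeclared-pools</name>
  <value>false</value>
</property>
```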

When a job is submitted to an empty queue on a busy cluster, it does not start immediately; it blocks until running jobs release resources. To make the start time of submitted jobs more predictable (a wait timeout can be configured), the Fair Scheduler supports preemption.

Preemption allows the scheduler to kill containers belonging to queues that are using more than their fair share, so that those resources can be given to the queues entitled to them. Note that preemption reduces overall cluster efficiency, because the work of the killed containers must be redone.

Preemption is enabled globally by setting yarn.scheduler.fair.preemption=true. Two timeout parameters then control when preemption occurs (neither is set by default, and at least one must be configured for containers to be preempted):

- minimum share preemption timeout
- fair share preemption timeout

If a queue does not receive its minimum guaranteed share within the period given by the minimum share preemption timeout, the scheduler preempts containers. This timeout can be set for all queues with the top-level <defaultMinSharePreemptionTimeout> element in the configuration file, or for an individual queue with a <minSharePreemptionTimeout> element inside its <queue> element.

Similarly, if a queue remains below half of its fair share (the fraction is configurable) for the period given by the fair share preemption timeout, the scheduler preempts containers. This timeout can be set for all queues with the top-level <defaultFairSharePreemptionTimeout> element, or per queue with a <fairSharePreemptionTimeout> element. The fraction mentioned above is set with <defaultFairSharePreemptionThreshold> (for all queues) or <fairSharePreemptionThreshold> (per queue); the default is 0.5.
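Put together, a preemption section of the allocation file might look like the fragment below (the timeout values, in seconds, are illustrative only; remember that yarn.scheduler.fair.preemption=true must also be set in yarn-site.xml):

```xml
<allocations>
  <!-- defaults applied to all queues -->
  <defaultMinSharePreemptionTimeout>60</defaultMinSharePreemptionTimeout>
  <defaultFairSharePreemptionTimeout>120</defaultFairSharePreemptionTimeout>
  <defaultFairSharePreemptionThreshold>0.5</defaultFairSharePreemptionThreshold>

  <!-- per-queue overrides -->
  <queue name="prod">
    <minSharePreemptionTimeout>30</minSharePreemptionTimeout>
    <fairSharePreemptionTimeout>60</fairSharePreemptionTimeout>
    <fairSharePreemptionThreshold>0.8</fairSharePreemptionThreshold>
  </queue>
</allocations>
```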

