Hadoop YARN memory and vCPU resource configuration

Source: Internet
Author: User

Hadoop YARN supports scheduling of two resource types, memory and CPU. Only memory is scheduled by default; to schedule CPU as well, you need to do some configuration yourself. This article describes how YARN schedules and isolates these two resources.

In YARN, resource management is handled jointly by the ResourceManager and the NodeManager: the scheduler inside the ResourceManager allocates resources, while the NodeManager supplies and isolates them. Once the ResourceManager has assigned resources on a NodeManager to a task (this is called "resource scheduling"), the NodeManager must provide those resources to the task as requested, and ideally guarantee the task exclusive use of them so that it has a reliable basis to run on. This is called "resource isolation".

For a detailed introduction to the Hadoop YARN resource scheduler, see my article: YARN/MRv2 ResourceManager in-depth analysis: the resource scheduler.

Before going into the details of scheduling and isolation, consider how memory and CPU differ as resources. The amount of memory decides whether a task lives or dies: if there is not enough memory, the task may fail. CPU is different: it only determines how fast a task runs and does not affect whether the task survives.

Scheduling and isolation of memory resources in YARN

Based on the above considerations, YARN lets the user configure the physical memory available on each node. Note the word "available": the memory on a node is shared by several services, for example part for YARN, part for HDFS, part for HBase, and so on. YARN only configures the share for its own use. The configuration parameters are as follows:

(1) yarn.nodemanager.resource.memory-mb

Indicates the total amount of physical memory that YARN can use on the node. The default is 8192 (MB). Note that if the node has less than 8 GB of memory, you need to lower this value, because YARN does not detect the node's total physical memory on its own.

(2) yarn.nodemanager.vmem-pmem-ratio

The maximum amount of virtual memory a task may use for every 1 MB of physical memory it uses. The default is 2.1.

(3) yarn.nodemanager.pmem-check-enabled

Whether to start a thread that checks how much physical memory each task is using; a task that exceeds its assigned amount is killed directly. The default is true.

(4) yarn.nodemanager.vmem-check-enabled

Whether to start a thread that checks how much virtual memory each task is using; a task that exceeds its assigned amount is killed directly. The default is true.

(5) yarn.scheduler.minimum-allocation-mb

The minimum amount of physical memory that a single task can request. The default is 1024 (MB). If a task requests less physical memory than this value, the request is raised to this value.

(6) yarn.scheduler.maximum-allocation-mb

The maximum amount of physical memory that a single task can request, which by default is 8192 (MB).
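As a rough illustration of how the minimum, maximum, and ratio parameters above interact, here is a minimal Python sketch. The function names are hypothetical and are not part of YARN. Rejecting a request above the maximum is an assumption made for illustration, since the text above only says what the maximum is.

```python
# Defaults quoted in the parameter list above (sizes in MB).
MIN_ALLOCATION_MB = 1024   # yarn.scheduler.minimum-allocation-mb
MAX_ALLOCATION_MB = 8192   # yarn.scheduler.maximum-allocation-mb
VMEM_PMEM_RATIO = 2.1      # yarn.nodemanager.vmem-pmem-ratio


def normalize_memory_request(requested_mb: int) -> int:
    """Hypothetical helper: raise small requests to the minimum allocation.

    Assumption: a request above the maximum is rejected with an error.
    """
    if requested_mb > MAX_ALLOCATION_MB:
        raise ValueError(
            f"{requested_mb} MB exceeds the maximum of {MAX_ALLOCATION_MB} MB"
        )
    return max(requested_mb, MIN_ALLOCATION_MB)


def virtual_memory_limit_mb(physical_mb: int) -> float:
    """Virtual memory ceiling implied by the vmem-pmem ratio."""
    return physical_mb * VMEM_PMEM_RATIO


if __name__ == "__main__":
    granted = normalize_memory_request(512)   # below the minimum, so raised
    print(granted, virtual_memory_limit_mb(granted))
```

With the defaults, a 512 MB request is raised to 1024 MB, and the virtual memory ceiling for that task is 2.1 times the granted physical memory.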

By default, YARN uses thread monitoring to determine whether a task is using more memory than allowed, and kills the task as soon as it finds an overrun. Cgroups memory control is inflexible: a task may never exceed its memory limit, and if it does, it is killed outright or an OOM is reported. A Java process, however, doubles its memory for an instant when it is created and then drops back to its normal level. Thread monitoring handles this case more flexibly: when it sees the memory of the process tree momentarily double past the configured limit, it can treat this as normal and leave the task alone. For this reason, YARN does not provide a cgroups-based memory isolation mechanism.
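The tolerance for a momentary doubling can be sketched as a two-part check. This is a simplified illustration of the idea described above, not a copy of YARN's code. The function name and the notion of "aged" usage (memory held by processes that were already present at an earlier check) are assumptions introduced for the example.

```python
def is_over_limit(current_usage: int, aged_usage: int, limit: int) -> bool:
    """Decide whether a task's process tree should be killed.

    current_usage: memory used by every process in the tree right now
    aged_usage:    memory used only by processes already seen in an
                   earlier check (assumption: freshly created processes
                   are excluded from this figure)
    limit:         the configured memory limit, in the same unit
    """
    # Even a momentary spike is not tolerated beyond twice the limit.
    if current_usage > 2 * limit:
        return True
    # Otherwise only sustained usage counts against the task.
    return aged_usage > limit


if __name__ == "__main__":
    limit_mb = 1024
    # A new child process briefly doubles the tree's footprint.
    print(is_over_limit(current_usage=2000, aged_usage=900, limit=limit_mb))
    # The tree stays above the limit across checks.
    print(is_over_limit(current_usage=1500, aged_usage=1500, limit=limit_mb))
```

A hard cgroups limit has no equivalent of the first, tolerant case, which is the inflexibility the paragraph above refers to.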

Scheduling and isolation of CPU resources in YARN

In YARN, the way CPU resources are organized is still being explored. The current version (2.2.0) has only a preliminary, very coarse-grained implementation; a finer-grained division of CPU has been proposed and is being refined and implemented.

Currently, CPU is divided into virtual CPUs (CPU virtual cores). The virtual CPU is a concept introduced by YARN itself. The original intent is that the CPUs on different nodes may differ in performance, so each CPU has different computing power. For example, if one physical CPU has twice the computing power of another, you can compensate for the difference by configuring more virtual CPUs for the first physical CPU. When a user submits a job, they can specify the number of virtual CPUs that each task requires. In YARN, the CPU-related configuration parameters are as follows:

(1) yarn.nodemanager.resource.cpu-vcores

Indicates the number of virtual CPUs that YARN can use on the node. The default is 8, and it is currently recommended to set this value to the number of physical CPU cores. If the node has fewer than 8 CPU cores, you need to lower this value, because YARN does not detect the node's total number of physical CPUs on its own.

(2) yarn.scheduler.minimum-allocation-vcores

The minimum number of virtual CPUs that a single task can request. The default is 1. If a task requests fewer virtual CPUs than this value, the request is raised to this value.

(3) yarn.scheduler.maximum-allocation-vcores

The maximum number of virtual CPUs that a single task can request. The default is 32.
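To make the virtual CPU idea concrete, the following Python sketch sizes a node's vcores from its relative CPU speed and estimates how many tasks fit on a node. Both helpers and the `relative_speed` factor are hypothetical. The task count also assumes a scheduler that has been configured to account for both memory and CPU.

```python
def node_vcores(physical_cores: int, relative_speed: float = 1.0) -> int:
    """Hypothetical sizing rule for yarn.nodemanager.resource.cpu-vcores.

    relative_speed is an assumed factor describing how fast this node's
    cores are compared with a baseline node (2.0 means twice as fast).
    """
    return max(1, int(physical_cores * relative_speed))


def max_tasks(node_memory_mb: int, vcores: int,
              task_memory_mb: int, task_vcores: int) -> int:
    """Tasks that fit on a node when both memory and vcores are limits."""
    return min(node_memory_mb // task_memory_mb, vcores // task_vcores)


if __name__ == "__main__":
    baseline = node_vcores(physical_cores=8)                   # ordinary node
    fast = node_vcores(physical_cores=8, relative_speed=2.0)   # faster cores
    print(baseline, fast)
    print(max_tasks(8192, baseline, task_memory_mb=2048, task_vcores=1))
```

Here the faster node advertises twice as many vcores as the baseline node, so a task asking for one vcore receives roughly the same computing power on either node.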

By default, YARN does not schedule CPU resources; you need to configure the resource scheduler to support it. For details, see my two articles:

(1) Hadoop YARN configuration parameters analysis (4): Fair Scheduler related parameters

(2) Hadoop YARN configuration parameters analysis (5): Capacity Scheduler related parameters

By default, the NodeManager does not isolate CPU resources at all; you can enable cgroups to get CPU isolation.

Because of the particular nature of CPU resources, the current method of CPU allocation is still coarse-grained. For example, many tasks are IO-intensive and consume very little CPU. Assigning a whole CPU to such a task is a serious waste; it could instead share a CPU with several other tasks. In other words, a finer-grained way of expressing CPU resources is needed.

Amazon EC2's partitioning of CPU resources serves as a reference here: its smallest CPU unit is the EC2 Compute Unit (ECU), and one ECU represents processing power equivalent to a 1.0-1.2 GHz Opteron or Xeon processor. Following this, a smallest CPU unit for YARN has been proposed, the YARN Compute Unit (YCU). This number is currently an integer with a default of 720, set by the parameter yarn.nodemanager.resource.cpu-ycus-per-core, and represents the computing power of one CPU core (the feature does not exist in version 2.2.0 and may be added in 2.3.0). When submitting a job, the user can then specify the required YCUs directly. For example, a value of 360 means half a CPU core, which in practice means using one CPU core for only half of the compute time. Note that at the operating system level, CPU resources are allocated in time slices, so you can say that a process uses 1/3 of the CPU time slices, or 1/5 of them. For discussion of CPU resource partitioning and scheduling, see the following links:

https://issues.apache.org/jira/browse/YARN-1089

https://issues.apache.org/jira/browse/YARN-1024

Hadoop new features, improvements, optimizations, and bug analysis series 5: YARN-3
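The YCU arithmetic described above can be sketched in a few lines of Python. The helper name is hypothetical, and the value 720 is the proposed default mentioned in the text, not a setting available in version 2.2.0.

```python
from fractions import Fraction

# Proposed default for yarn.nodemanager.resource.cpu-ycus-per-core.
YCUS_PER_CORE = 720


def core_fraction(requested_ycus: int,
                  ycus_per_core: int = YCUS_PER_CORE) -> Fraction:
    """Share of one CPU core's time slices that a YCU request stands for."""
    return Fraction(requested_ycus, ycus_per_core)


if __name__ == "__main__":
    for ycus in (720, 360, 240, 144):
        print(ycus, "YCU ->", core_fraction(ycus), "of a core")
```

A request of 360 YCU corresponds to 1/2 of a core, while 240 and 144 YCU correspond to the 1/3 and 1/5 time-slice shares mentioned above.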

Summary

At present, YARN's memory resource scheduling borrows the approach used in Hadoop 1.0 and is fairly reasonable. CPU resource scheduling, however, is still being improved and is only a rough, preliminary implementation. I believe that in the near future YARN's CPU resource scheduling will become more complete.

Original article. If you repost it, please credit the source: reposted from Dong's blog.
