Hadoop Jira Links: https://issues.apache.org/jira/browse/YARN-3
Type (new feature, improvement, optimization, or bug): new feature
Fix version: 2.0.3-alpha and later
Component (Common, HDFS, YARN, or MapReduce): YARN
Modules involved: NodeManager
English title: "Add support for CPU isolation/monitoring of containers"
Background information
YARN is a resource management system consisting mainly of two components: the ResourceManager and the NodeManager. The ResourceManager is responsible for managing and allocating resources across the whole cluster, while the NodeManager is responsible for resource management and task startup on an individual node. Both components must play their roles fully for cluster resources to be used effectively; neither is dispensable. The ResourceManager allocates resources to an application's ApplicationMaster, for example <node1, 1 CPU, 2 GB> to AppMaster1; AppMaster1 then communicates with the NodeManager on node1 to start a task that occupies 1 CPU and 2 GB of memory. To guarantee that the task gets "exactly and only" these resources, the NodeManager must provide a reasonable isolation mechanism: a resource container that guarantees the task these resources, while also preventing it from consuming extra resources and interfering with other tasks.
By contrast, MRv1 uses the JVM for resource isolation, but the JVM can only limit memory; other resources, including CPU and network, cannot be isolated this way. In terms of resource isolation, YARN is therefore much more advanced than MRv1.
Solution
Providing a resource isolation mechanism is the responsibility of the YARN NodeManager, and YARN adopts a different isolation mechanism for each resource type. YARN-3, the subject of this article, introduces YARN's resource isolation mechanisms comprehensively. In summary, YARN currently provides isolation for two resources, CPU and memory: CPU isolation uses cgroups, a lightweight kernel mechanism, while memory isolation uses a thread-monitoring scheme.
Since YARN aims to be a general-purpose resource management platform, not limited to Java programs such as Java-written MapReduce applications, adopting MRv1's JVM-based resource isolation scheme is not feasible.
Cgroups can cap an application's memory usage, but it kills the process outright as soon as memory exceeds the threshold. For applications whose memory usage briefly spikes and then falls back, such a rigid policy is problematic. For this reason, YARN keeps MRv1's thread-monitoring scheme for memory: a thread monitors the process tree of each running task, and if memory merely spikes transiently, the task is considered normal and is not killed. This scheme is friendlier to such applications.
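The forgiving monitoring policy described above can be sketched as follows. This is a minimal illustration of the idea, not YARN's actual ContainersMonitor implementation; the function name, the `tolerated_spikes` knob, and the polling interface are all hypothetical.

```python
import time

def monitor_task_memory(get_usage_mb, limit_mb, interval_s=3.0, tolerated_spikes=2):
    """Poll a task's process-tree memory and decide whether to kill it.

    A transient spike (fewer than `tolerated_spikes` consecutive readings over
    the limit) is treated as normal; only sustained overuse is fatal.
    Returns True if the task should be killed, False if it exited on its own.
    """
    consecutive = 0
    while True:
        usage = get_usage_mb()
        if usage is None:            # process tree has exited
            return False
        if usage > limit_mb:
            consecutive += 1
            if consecutive >= tolerated_spikes:
                return True          # sustained overuse: kill the task
        else:
            consecutive = 0          # spike subsided: forgive it
        time.sleep(interval_s)
```

Contrast this with the cgroups memory controller, which would have killed the task at the first reading over the limit.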
Because the amount of CPU a task receives affects only its execution speed, not its survival, YARN uses cgroups for CPU isolation. Note that cgroups enforces only a lower bound on CPU usage; it is a fair-sharing method. For example, on a node with 8 cores (pcore:vcore = 1:1), a single running task (pcore = 1) can use up to 800% CPU; with 2 such tasks running, each can use up to 400% CPU, and so on.
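The fair-sharing arithmetic in the example above can be made explicit. The helper below is purely illustrative (its name is mine, not YARN's): it computes the per-task CPU ceiling when all equally weighted tasks on the node are busy; since cpu.shares enforces only this lower bound, a task may exceed it whenever the others are idle.

```python
def fair_share_percent(physical_cores, running_tasks):
    """CPU utilization (in %, where one core = 100%) each of `running_tasks`
    equally weighted tasks can use when all of them are busy on a node with
    `physical_cores` cores. cgroups' cpu.shares guarantees this as a lower
    bound; an otherwise-idle node lets one task use everything."""
    return physical_cores * 100 // running_tasks
```

With 8 cores this reproduces the article's numbers: 800% for one task, 400% each for two.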
Currently, YARN's resource isolation still has much room for improvement, for example finer-grained isolation, such as binding a task to a specific CPU (already done, using the taskset command), and isolation of more resource types, such as network and disk I/O (this depends on the evolution of cgroups, which is not yet mature in these areas).
How to configure
Note: the relevant configuration parameters were introduced by https://issues.apache.org/jira/browse/YARN-2. I have described this part in detail in my blog article "YARN/MRv2 ResourceManager In-Depth Analysis: The Resource Scheduler".
YARN currently supports the management and allocation of two resource types: memory and CPU. When a NodeManager starts, it registers with the ResourceManager, and the registration information contains the total amounts of CPU and memory the node can allocate. Both can be set through configuration options (in the yarn-site.xml file), as follows:
(1) yarn.nodemanager.resource.memory-mb
The total amount of physical memory the node can allocate; the default is 8*1024 MB.
(2) yarn.nodemanager.vmem-pmem-ratio
The amount of virtual memory allowed per unit of physical memory; the default is 2.1, meaning each 1 MB of physical memory may use up to 2.1 MB of virtual memory in total.
(3) yarn.nodemanager.resource.cpu-core
The total number of CPUs the node can allocate; the default is 8.
(4) yarn.nodemanager.vcores-pcores-ratio
To divide CPU resources at a finer granularity, YARN splits each physical CPU into several virtual CPUs; the default value is 2. When submitting an application, a user can specify the number of virtual CPUs each task requires. In MRAppMaster, each map task and reduce task requires 1 virtual CPU by default; users can change this via mapreduce.map.cpu.vcores and mapreduce.reduce.cpu.vcores, respectively. (For memory, map and reduce tasks require 1024 MB each by default; users can change this via mapreduce.map.memory.mb and mapreduce.reduce.memory.mb, respectively.)
(In the latest version, the two parameters yarn.nodemanager.resource.cpu-core and yarn.nodemanager.vcores-pcores-ratio were abandoned, and a new parameter, yarn.nodemanager.resource.cpu-vcores, was introduced to indicate the number of virtual CPUs; see YARN-782.)
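Putting the parameters above into a yarn-site.xml fragment might look like the sketch below. The values are examples only, not recommendations, and the exact parameter set depends on your release: older releases use yarn.nodemanager.resource.cpu-core plus yarn.nodemanager.vcores-pcores-ratio, newer ones use yarn.nodemanager.resource.cpu-vcores (see YARN-782).

```xml
<configuration>
  <!-- Total physical memory (MB) the node can allocate; default 8192 -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>
  <!-- Virtual memory allowed per unit of physical memory; default 2.1 -->
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <!-- Newer releases: total number of virtual cores (YARN-782) -->
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>16</value>
  </property>
</configuration>
```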
To enable cgroups and memory thread monitoring, follow the documentation "Hadoop MapReduce Next Generation – Cluster Setup", and be sure to read "Using YARN with Cgroups" first.
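For orientation, a hedged sketch of the cgroups-enabling portion of yarn-site.xml is shown below; the parameter and class names are those used in the Hadoop 2.x line, but you should verify names and defaults against the linked documentation for your exact version.

```xml
<configuration>
  <!-- Use the Linux container executor, which supports cgroups -->
  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>
  <!-- Handler that places containers into cgroups for CPU isolation -->
  <property>
    <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
  </property>
  <!-- cgroup hierarchy under which container cgroups are created -->
  <property>
    <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
    <value>/hadoop-yarn</value>
  </property>
</configuration>
```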
Extended reading:
(1) "Hook up cgroups CPU settings to the number of virtual cores allocated": https://issues.apache.org/jira/browse/YARN-600
(2) "CgroupsLCEResourcesHandler tries to write to cgroup.procs": https://issues.apache.org/jira/browse/YARN-799
(3) "Support Cgroup ceiling enforcement on CPU": https://issues.apache.org/jira/browse/YARN-810
Original article; when reprinting, please note: reprinted from Dong's Blog
This article link address: http://dongxicheng.org/mapreduce-nextgen/hadoop-jira-yarn-3/
Author: Dong. About the author: http://dongxicheng.org/about/
A collection of articles on this blog: http://dongxicheng.org/recommend/