YARN memory configuration involves many settings; this article only gives recommendations and reference values, and the final configuration should be tuned to the actual business workload.
First, be clear that in YARN the resources of the whole cluster are determined by three factors: memory, disk, and CPU (number of cores), and the three must be kept in balance. In real production environments the disks are usually large enough that they are rarely a limiting factor, so disk is treated here only as a secondary reference factor.
When calculating a node's memory, you need to account for the memory required by the operating system, by the NodeManager (NM), and by any other services running on the node (HBase is used as the example below).
So: YARN available memory = total node memory − memory reserved for the operating system − memory reserved for HBase.
Reference values for the operating system and HBase reservations are listed below:
| Total memory per node | Reserved for the operating system | Reserved for HBase |
| --- | --- | --- |
| 4 GB | 1 GB | 1 GB |
| 8 GB | 2 GB | 1 GB |
| 16 GB | 2 GB | 2 GB |
| 24 GB | 4 GB | 4 GB |
| 48 GB | 6 GB | 8 GB |
| 64 GB | 8 GB | 8 GB |
| 72 GB | 8 GB | 8 GB |
| 96 GB | 12 GB | 16 GB |
| 128 GB | 24 GB | 24 GB |
| 256 GB | 32 GB | 32 GB |
| 512 GB | 64 GB | 64 GB |
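As a sketch, the reservation table and the availability formula can be expressed in Python. The dictionary values come from the table above; the fall-back lookup for node sizes not listed in the table is an assumption of this sketch, not part of the original recommendation:

```python
# Reference reservations in GB, keyed by total node memory (GB),
# taken from the table above: (reserved for OS, reserved for HBase).
RESERVATIONS = {
    4: (1, 1), 8: (2, 1), 16: (2, 2), 24: (4, 4), 48: (6, 8),
    64: (8, 8), 72: (8, 8), 96: (12, 16), 128: (24, 24),
    256: (32, 32), 512: (64, 64),
}

def yarn_available_gb(total_gb, hbase_installed):
    """YARN available memory = total - OS reservation - HBase reservation."""
    # For sizes not in the table, fall back to the closest row at or
    # below the node's size (an assumption made by this sketch).
    key = max(k for k in RESERVATIONS if k <= total_gb)
    os_gb, hbase_gb = RESERVATIONS[key]
    return total_gb - os_gb - (hbase_gb if hbase_installed else 0)
```

For the 64 GB node without HBase used in the worked example later in this article, this yields 64 − 8 = 56 GB of YARN-available memory.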
Next, the maximum number of containers per node can be estimated with the following formula:

containers = min(2 × CORES, 1.8 × DISKS, YARN available memory / minimum container memory)
The minimum container memory depends on the YARN available memory; the recommended values are:

| Available memory per node | Recommended minimum container memory |
| --- | --- |
| Less than 4 GB | 256 MB |
| Between 4 GB and 8 GB | 512 MB |
| Between 8 GB and 24 GB | 1024 MB |
| Above 24 GB | 2048 MB |
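The lookup above can be written as a small helper. This is a sketch; the handling of the exact boundary values (4, 8 and 24 GB) is an assumption, since the table does not say which side the boundaries fall on:

```python
def min_container_mb(available_gb):
    """Recommended minimum container memory (MB) for a node's
    YARN-available memory, per the table above."""
    if available_gb < 4:
        return 256
    if available_gb < 8:
        return 512
    if available_gb < 24:
        return 1024
    return 2048
```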
Using the reference values and the formula above, we can calculate the number of containers per node; the memory available to each container is then given by:

memory per container = max(minimum container memory, YARN available memory / number of containers)
With the calculations above, the recommended YARN and MapReduce memory configuration is as follows:
| Configuration file | Configuration item | Value |
| --- | --- | --- |
| yarn-site.xml | yarn.nodemanager.resource.memory-mb | = number of containers × memory per container |
| yarn-site.xml | yarn.scheduler.minimum-allocation-mb | = memory per container |
| yarn-site.xml | yarn.scheduler.maximum-allocation-mb | = number of containers × memory per container |
| mapred-site.xml | mapreduce.map.memory.mb | = memory per container |
| mapred-site.xml | mapreduce.reduce.memory.mb | = 2 × memory per container |
| mapred-site.xml | mapreduce.map.java.opts | = 0.8 × memory per container |
| mapred-site.xml | mapreduce.reduce.java.opts | = 0.8 × 2 × memory per container |
| yarn-site.xml | yarn.app.mapreduce.am.resource.mb | = 2 × memory per container |
| yarn-site.xml | yarn.app.mapreduce.am.command-opts | = 0.8 × 2 × memory per container |
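The table above can be turned into a sketch that derives all nine values from the container count and per-container memory. The property names are real YARN/MapReduce settings; the helper itself is illustrative:

```python
def recommended_settings(containers, container_mb):
    """Derive the recommended configuration values from the table above."""
    # Heap size (-Xmx) is 0.8 x the container's memory allocation.
    heap = lambda mb: f"-Xmx{int(0.8 * mb)}m"
    return {
        "yarn.nodemanager.resource.memory-mb": containers * container_mb,
        "yarn.scheduler.minimum-allocation-mb": container_mb,
        "yarn.scheduler.maximum-allocation-mb": containers * container_mb,
        "mapreduce.map.memory.mb": container_mb,
        "mapreduce.reduce.memory.mb": 2 * container_mb,
        "mapreduce.map.java.opts": heap(container_mb),
        "mapreduce.reduce.java.opts": heap(2 * container_mb),
        "yarn.app.mapreduce.am.resource.mb": 2 * container_mb,
        "yarn.app.mapreduce.am.command-opts": heap(2 * container_mb),
    }
```

Note that the sample yarn-util.py output later in this article uses 1 × container memory for the reduce settings rather than the 2 × recommended in the table above.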
HDP also publishes a Python script, yarn-util.py, to simplify the calculation. It takes four parameters:

| Parameter | Description |
| --- | --- |
| -c CORES | Number of CPU cores per node |
| -m MEMORY | Total memory per node (in GB) |
| -d DISKS | Number of disks per node |
| -k HBASE | True if HBase is installed, otherwise False |
For example, for a node with a 16-core CPU, 64 GB of memory, 4 disks, and no HBase installed, the recommended configuration is calculated as follows:
Using cores=16 memory=64GB disks=4 hbase=False
Profile: cores=16 memory=57344MB reserved=8GB usableMem=56GB disks=4
Num Container=8
Container Ram=7168MB
Used Ram=56GB
Unused Ram=8GB
yarn.scheduler.minimum-allocation-mb=7168
yarn.scheduler.maximum-allocation-mb=57344
yarn.nodemanager.resource.memory-mb=57344
mapreduce.map.memory.mb=7168
mapreduce.map.java.opts=-Xmx5734m
mapreduce.reduce.memory.mb=7168
mapreduce.reduce.java.opts=-Xmx5734m
yarn.app.mapreduce.am.resource.mb=7168
yarn.app.mapreduce.am.command-opts=-Xmx5734m
mapreduce.task.io.sort.mb=2867
The script can be downloaded here: yarn-util.py