Analysis of Hadoop YARN configuration parameters (1): RM and NM related parameters


Before configuring these parameters, make sure you fully understand what each one means, so that a misconfiguration does not put the cluster at risk. All of these parameters are set in yarn-site.xml.
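As a rough illustration of what such a file looks like, the Python sketch below renders a few parameters into the configuration / property / name / value layout that Hadoop configuration files use. The helper render_yarn_site, the hostname rm.example.com, and the memory value are made up for this example; it is a sketch, not a tool shipped with Hadoop.

```python
from xml.sax.saxutils import escape


def render_yarn_site(properties):
    """Render a dict of parameter names and values as yarn-site.xml text."""
    lines = ["<configuration>"]
    for name, value in properties.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(str(value))}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    print(render_yarn_site({
        "yarn.resourcemanager.hostname": "rm.example.com",
        "yarn.nodemanager.resource.memory-mb": 16384,
    }))
```

Every parameter discussed in this article would become one property element in that file.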

1. ResourceManager related configuration parameters

(1) yarn.resourcemanager.address

Parameter Explanation: The address the ResourceManager exposes to clients. Clients use this address to submit applications to the RM, kill applications, and so on.

Default: ${yarn.resourcemanager.hostname}:8032

(2) yarn.resourcemanager.scheduler.address

Parameter Explanation: The address the ResourceManager exposes to ApplicationMasters. An ApplicationMaster uses this address to request resources from the RM and to release them.

Default: ${yarn.resourcemanager.hostname}:8030

(3) yarn.resourcemanager.resource-tracker.address

Parameter Explanation: The address the ResourceManager exposes to NodeManagers. A NodeManager sends its heartbeats to the RM through this address and receives tasks in return.

Default: ${yarn.resourcemanager.hostname}:8031

(4) yarn.resourcemanager.admin.address

Parameter Explanation: The address the ResourceManager exposes to administrators. Administrators send management commands to the RM through this address.

Default: ${yarn.resourcemanager.hostname}:8033

(5) yarn.resourcemanager.webapp.address

Parameter Explanation: The address of the ResourceManager web UI. Users can open this address in a browser to view information about the cluster.

Default: ${yarn.resourcemanager.hostname}:8088
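The five address defaults above all follow the same pattern: the value of yarn.resourcemanager.hostname plus a fixed port. The Python sketch below mimics that substitution for a single level of ${...} references. It is a simplified imitation written for this article, not Hadoop's own expansion logic, and the hostname rm.example.com is a placeholder.

```python
import re

# Default values as listed in this article.
DEFAULTS = {
    "yarn.resourcemanager.address": "${yarn.resourcemanager.hostname}:8032",
    "yarn.resourcemanager.scheduler.address": "${yarn.resourcemanager.hostname}:8030",
    "yarn.resourcemanager.resource-tracker.address": "${yarn.resourcemanager.hostname}:8031",
    "yarn.resourcemanager.admin.address": "${yarn.resourcemanager.hostname}:8033",
    "yarn.resourcemanager.webapp.address": "${yarn.resourcemanager.hostname}:8088",
}

_VAR = re.compile(r"\$\{([^}]+)\}")


def resolve(name, overrides):
    """Return the effective value of name, expanding ${...} references once.

    A value set explicitly in overrides wins over the default; references to
    parameters that are not set are left untouched.
    """
    raw = overrides.get(name, DEFAULTS.get(name, ""))
    return _VAR.sub(lambda m: overrides.get(m.group(1), m.group(0)), raw)


if __name__ == "__main__":
    site = {"yarn.resourcemanager.hostname": "rm.example.com"}  # placeholder host
    for key in DEFAULTS:
        print(key, "=", resolve(key, site))
```

In other words, setting yarn.resourcemanager.hostname alone is enough to fix all five addresses, and an individual address only needs to be set when its port or host should differ.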

(6) yarn.resourcemanager.scheduler.class

Parameter Explanation: The main class of the resource scheduler to enable. FIFO, Capacity Scheduler, and Fair Scheduler are currently available.

Default: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler

(7) yarn.resourcemanager.resource-tracker.client.thread-count

Parameter Explanation: The number of handler threads that process RPC requests from NodeManagers.

Default: 50

(8) yarn.resourcemanager.scheduler.client.thread-count

Parameter Explanation: The number of handler threads that process RPC requests from ApplicationMasters.

Default: 50

(9) yarn.scheduler.minimum-allocation-mb / yarn.scheduler.maximum-allocation-mb

Parameter Explanation: The minimum / maximum amount of memory that a single request can ask for. For example, with values of 1024 and 3072, each task of a MapReduce job can request at least 1024 MB and at most 3072 MB of memory.

Default: 1024 / 8192 (MB)
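To make the effect of the two limits concrete, here is a simplified Python model of how a memory request relates to them. It assumes that a request below the minimum is raised to the minimum, that requests are rounded up to a multiple of the minimum, and that a request above the maximum is rejected. Real behavior depends on the scheduler and the Hadoop version, so treat normalize_memory_mb as a hypothetical helper, not as YARN's code.

```python
def normalize_memory_mb(requested_mb, minimum_mb=1024, maximum_mb=8192):
    """Simplified model of fitting a memory request into [minimum, maximum]."""
    if requested_mb > maximum_mb:
        # Assumption: over-sized requests are refused rather than capped.
        raise ValueError(
            f"requested {requested_mb} MB exceeds maximum {maximum_mb} MB")
    # Raise small requests to the minimum, then round up to a multiple of it.
    wanted = max(requested_mb, minimum_mb)
    multiples = -(-wanted // minimum_mb)  # ceiling division
    return min(multiples * minimum_mb, maximum_mb)


if __name__ == "__main__":
    # Limits taken from the example in the text: 1024 MB and 3072 MB.
    for request in (512, 1500, 3072):
        granted = normalize_memory_mb(request, 1024, 3072)
        print(f"request {request} MB -> {granted} MB")
```

Under these assumptions a task that asks for 1500 MB ends up with 2048 MB, which is worth keeping in mind when sizing the minimum: a large minimum wastes memory on small tasks.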

(10) yarn.scheduler.minimum-allocation-vcores / yarn.scheduler.maximum-allocation-vcores

Parameter Explanation: The minimum / maximum number of virtual CPUs that a single request can ask for. For example, with values of 1 and 4, each task of a MapReduce job can request at least 1 and at most 4 virtual CPUs. For an explanation of what a virtual CPU is, see my article "YARN Resource Scheduler Analysis."

Default: 1/32

(11) yarn.resourcemanager.nodes.include-path / yarn.resourcemanager.nodes.exclude-path

Parameter Explanation: The paths of the NodeManager whitelist and blacklist files. If you find that several NodeManagers are problematic, for example because tasks running on them fail at a high rate, you can add them to the blacklist. Note that both parameters can take effect dynamically, by running a refresh command.

Default: ""
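The blacklist workflow can be sketched in Python as follows: add the host to the file that yarn.resourcemanager.nodes.exclude-path points to, then ask the RM to re-read its node lists with yarn rmadmin -refreshNodes. The file path and hostname used below are placeholders, and the function only returns the command instead of running it, so the sketch can be tried without a cluster.

```python
import os
import tempfile

REFRESH_COMMAND = ["yarn", "rmadmin", "-refreshNodes"]


def exclude_node(exclude_file, hostname):
    """Add hostname to the exclude file (one host per line) if it is missing.

    Returns the command that would make the RM reload its node lists.
    """
    hosts = []
    if os.path.exists(exclude_file):
        with open(exclude_file) as f:
            hosts = [line.strip() for line in f if line.strip()]
    if hostname not in hosts:
        hosts.append(hostname)
        with open(exclude_file, "w") as f:
            f.write("\n".join(hosts) + "\n")
    return list(REFRESH_COMMAND)


if __name__ == "__main__":
    # Placeholder path and host; a real cluster would use the file named by
    # yarn.resourcemanager.nodes.exclude-path.
    path = os.path.join(tempfile.mkdtemp(), "yarn.exclude")
    command = exclude_node(path, "bad-node-01.example.com")
    print("now run:", " ".join(command))
```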

(12) yarn.resourcemanager.nodemanagers.heartbeat-interval-ms

Parameter Explanation: The interval at which NodeManagers send heartbeats to the ResourceManager.

Default: 1000 (milliseconds)

2. NodeManager related configuration parameters

(1) yarn.nodemanager.resource.memory-mb

Parameter Explanation: The total physical memory available to the NodeManager. Note that once this parameter is set, it cannot be changed dynamically while the NodeManager is running. In addition, the default is 8192 MB, and YARN will act as if that much memory were available even if the machine has less, so you must configure this value explicitly. Apache is, however, working on making this parameter dynamically modifiable.

Default: 8192
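Since the configured value is used as is, a small pre-flight check can catch the mistake described above. The Python sketch below compares a value for yarn.nodemanager.resource.memory-mb against the machine's physical memory. The function name and the idea of reserving some memory for the operating system and other daemons (reserved_mb) are assumptions of this example, not YARN settings.

```python
def check_nm_memory(configured_mb, physical_mb, reserved_mb=2048):
    """Return a list of warnings about yarn.nodemanager.resource.memory-mb."""
    warnings = []
    if configured_mb > physical_mb:
        warnings.append(
            f"configured {configured_mb} MB exceeds physical {physical_mb} MB")
    elif configured_mb > physical_mb - reserved_mb:
        # reserved_mb is a rule of thumb chosen for this sketch.
        warnings.append(
            f"less than {reserved_mb} MB left for the OS and other daemons")
    return warnings


if __name__ == "__main__":
    # A 4 GB machine left at the 8192 MB default.
    for message in check_nm_memory(configured_mb=8192, physical_mb=4096):
        print("WARNING:", message)
```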

(2) yarn.nodemanager.vmem-pmem-ratio

Parameter Explanation: The maximum amount of virtual memory that may be used for each 1 MB of physical memory used.

Default: 2.1
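As a worked example of the ratio: with the default of 2.1, a task using 1024 MB of physical memory may use up to roughly 2150 MB of virtual memory. A minimal Python sketch of that arithmetic follows; the function name is invented for this example.

```python
def vmem_limit_mb(pmem_mb, vmem_pmem_ratio=2.1):
    """Virtual memory ceiling implied by an amount of physical memory."""
    return pmem_mb * vmem_pmem_ratio


if __name__ == "__main__":
    for pmem in (1024, 2048, 4096):
        print(f"{pmem} MB physical -> {vmem_limit_mb(pmem):.0f} MB virtual")
```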

(3) yarn.nodemanager.resource.cpu-vcores

Parameter Explanation: The total number of virtual CPUs available to the NodeManager.

Default: 8

(4) yarn.nodemanager.local-dirs

Parameter Explanation: Where intermediate results are stored, similar to mapred.local.dir in Hadoop 1.0. This parameter is usually set to multiple directories in order to spread the disk I/O load.

Default: ${hadoop.tmp.dir}/nm-local-dir

(5) yarn.nodemanager.log-dirs

Parameter Explanation: Where logs are stored (multiple directories can be configured).

Default: ${yarn.log.dir}/userlogs

(6) yarn.nodemanager.log.retain-seconds

Parameter Explanation: The maximum time that logs are kept on the NodeManager (only takes effect when log aggregation is not enabled).

Default: 10800 (3 hours)

(7) yarn.nodemanager.aux-services

Parameter Explanation: The auxiliary services that run on the NodeManager. It must be set to mapreduce_shuffle in order to run MapReduce programs.

Default: ""
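In practice the service name is usually paired with a second property that names the class implementing the service. The Python sketch below builds both properties for the MapReduce shuffle service. org.apache.hadoop.mapred.ShuffleHandler is the handler class normally used with mapreduce_shuffle, while the helper function and its name check are assumptions of this example.

```python
import re

# Assumption of this sketch: service names consist of letters, digits and
# underscores and do not start with a digit.
_VALID_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def aux_service_properties(services):
    """Map {service name: handler class} to yarn-site.xml properties."""
    for name in services:
        if not _VALID_NAME.match(name):
            raise ValueError(f"invalid aux-service name: {name!r}")
    props = {"yarn.nodemanager.aux-services": ",".join(services)}
    for name, handler_class in services.items():
        props[f"yarn.nodemanager.aux-services.{name}.class"] = handler_class
    return props


if __name__ == "__main__":
    props = aux_service_properties(
        {"mapreduce_shuffle": "org.apache.hadoop.mapred.ShuffleHandler"})
    for key, value in props.items():
        print(f"{key} = {value}")
```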

Original link: http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-configurations-resourcemanager-nodemanager/
