Before changing any of these parameters, make sure you fully understand what each one means, because a misconfiguration can put the whole cluster at risk. All of the parameters below are set in yarn-site.xml.
1. ResourceManager related configuration parameters
(1) yarn.resourcemanager.address
Parameter Explanation: The address the ResourceManager exposes to clients. Clients use this address to submit applications to the RM, kill applications, and so on.
Default: ${yarn.resourcemanager.hostname}:8032
(2) yarn.resourcemanager.scheduler.address
Parameter Explanation: The address the ResourceManager exposes to ApplicationMasters. An ApplicationMaster uses this address to request resources from the RM and to release them.
Default: ${yarn.resourcemanager.hostname}:8030
(3) yarn.resourcemanager.resource-tracker.address
Parameter Explanation: The address the ResourceManager exposes to NodeManagers. Each NodeManager reports its heartbeat to the RM through this address and picks up tasks.
Default: ${yarn.resourcemanager.hostname}:8031
(4) yarn.resourcemanager.admin.address
Parameter Explanation: The address the ResourceManager exposes to administrators. Administrators send management commands to the RM through this address.
Default: ${yarn.resourcemanager.hostname}:8033
(5) yarn.resourcemanager.webapp.address
Parameter Explanation: The address of the ResourceManager web UI. Users can open this address in a browser to view information about the cluster.
Default: ${yarn.resourcemanager.hostname}:8088
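As a rough illustration of how the five default addresses above all derive from one hostname, the following Python sketch expands the ${yarn.resourcemanager.hostname} placeholder by hand. The helper name default_rm_addresses and the hostname rm.example.com are made up for this example; Hadoop does this substitution itself when it loads the configuration.

```python
# Default ports of the five ResourceManager addresses described above.
DEFAULT_PORTS = {
    "yarn.resourcemanager.address": 8032,
    "yarn.resourcemanager.scheduler.address": 8030,
    "yarn.resourcemanager.resource-tracker.address": 8031,
    "yarn.resourcemanager.admin.address": 8033,
    "yarn.resourcemanager.webapp.address": 8088,
}


def default_rm_addresses(hostname):
    """Build the default host:port value for each address (hypothetical helper)."""
    return {name: f"{hostname}:{port}" for name, port in DEFAULT_PORTS.items()}


# "rm.example.com" is a placeholder hostname, not a real cluster.
addresses = default_rm_addresses("rm.example.com")
for name, value in addresses.items():
    print(f"{name} = {value}")
```

In practice this means that setting yarn.resourcemanager.hostname is usually enough, and the individual addresses only need to be overridden when a non-default port is wanted.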
(6) yarn.resourcemanager.scheduler.class
Parameter Explanation: The main class of the resource scheduler to enable. The schedulers currently available are FIFO, Capacity Scheduler, and Fair Scheduler.
Default: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(7) yarn.resourcemanager.resource-tracker.client.thread-count
Parameter Explanation: The number of handler threads that process RPC requests from NodeManagers.
Default: 50
(8) yarn.resourcemanager.scheduler.client.thread-count
Parameter Explanation: The number of handler threads that process RPC requests from ApplicationMasters.
Default: 50
(9) yarn.scheduler.minimum-allocation-mb / yarn.scheduler.maximum-allocation-mb
Parameter Explanation: The minimum / maximum amount of memory that a single request can be allocated. For example, with values of 1024 and 3072, each task of a MapReduce job can request at least 1024 MB and at most 3072 MB of memory.
Default: 1024/8192
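To make the effect of the two limits concrete, here is a minimal Python sketch of how a memory request might be adjusted. It assumes that the scheduler rounds a request up to a whole multiple of the minimum and rejects anything above the maximum; the exact behaviour depends on the scheduler and the Hadoop version, and the function name normalize_memory_request is hypothetical.

```python
import math


def normalize_memory_request(requested_mb, min_mb=1024, max_mb=8192):
    """Return the memory (MB) granted for a request, under the stated assumptions."""
    if requested_mb > max_mb:
        # Assumption: requests above the maximum are rejected outright.
        raise ValueError(f"request of {requested_mb} MB exceeds maximum of {max_mb} MB")
    # Assumption: round up to a whole multiple of the minimum allocation.
    multiples = max(1, math.ceil(requested_mb / min_mb))
    return min(multiples * min_mb, max_mb)


# With the 1024 / 3072 example from the text, a 1500 MB request becomes 2048 MB.
granted = normalize_memory_request(1500, min_mb=1024, max_mb=3072)
print(granted)
```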
(10) yarn.scheduler.minimum-allocation-vcores / yarn.scheduler.maximum-allocation-vcores
Parameter Explanation: The minimum / maximum number of virtual CPUs that a single request can be allocated. For example, with values of 1 and 4, each task of a MapReduce job can request at least 1 and at most 4 virtual CPUs. For an explanation of what a virtual CPU is, see my article "YARN Resource Scheduler Analysis".
Default: 1/32
(11) yarn.resourcemanager.nodes.include-path / yarn.resourcemanager.nodes.exclude-path
Parameter Explanation: The NodeManager whitelist and blacklist. If you find that several NodeManagers are problematic, for example because tasks running on them fail at a high rate, you can add them to the blacklist. Note that both parameters can take effect dynamically, without a restart, by calling the refresh command (yarn rmadmin -refreshNodes).
Default: ""
(12) yarn.resourcemanager.nodemanagers.heartbeat-interval-ms
Parameter Explanation: The interval at which each NodeManager sends heartbeats to the ResourceManager.
Default: 1000 (milliseconds)
2. NodeManager related configuration parameters
(1) yarn.nodemanager.resource.memory-mb
Parameter Explanation: The total amount of physical memory available to the NodeManager. Note that once this value is set, it cannot be changed dynamically while the NodeManager is running. In addition, the default is 8192 MB, and YARN assumes that much memory is available even if the machine has less (silly?), so this value must always be configured explicitly. Apache is, however, already working on making this parameter dynamically modifiable.
Default: 8192
(2) yarn.nodemanager.vmem-pmem-ratio
Parameter Explanation: The maximum amount of virtual memory a task may use for each 1 MB of physical memory it uses.
Default: 2.1
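The arithmetic behind this ratio is simple enough to sketch in Python. The function name vmem_limit_mb is made up for the example, and the 1024 MB container size is only an illustrative value.

```python
def vmem_limit_mb(physical_mb, vmem_pmem_ratio=2.1):
    """Virtual memory ceiling (MB) for a given amount of physical memory."""
    return physical_mb * vmem_pmem_ratio


# A container with 1024 MB of physical memory and the default ratio of 2.1.
limit = vmem_limit_mb(1024)
print(f"{limit:.1f} MB")  # about 2150.4 MB
```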
(3) yarn.nodemanager.resource.cpu-vcores
Parameter Explanation: The total number of virtual CPUs available to the NodeManager.
Default: 8
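As a back-of-the-envelope check of how a node's memory and virtual CPU totals bound the number of containers it can host, consider the Python sketch below. It assumes that the scheduler accounts for both memory and vcores, which depends on how the scheduler is configured; the function name max_containers is hypothetical.

```python
def max_containers(node_memory_mb=8192, node_vcores=8,
                   container_memory_mb=1024, container_vcores=1):
    """Upper bound on same-sized containers that fit on one node."""
    by_memory = node_memory_mb // container_memory_mb
    by_vcores = node_vcores // container_vcores
    # Whichever resource runs out first is the limiting factor.
    return min(by_memory, by_vcores)


print(max_containers())                          # defaults: 8 containers
print(max_containers(container_memory_mb=3072))  # memory-bound: 2 containers
print(max_containers(container_vcores=4))        # vcore-bound: 2 containers
```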
(4) yarn.nodemanager.local-dirs
Parameter Explanation: Where intermediate results are stored, similar to mapred.local.dir in Hadoop 1.0. Note that this parameter is usually set to multiple directories in order to spread the disk I/O load.
Default: ${hadoop.tmp.dir}/nm-local-dir
(5) yarn.nodemanager.log-dirs
Parameter Explanation: Where logs are stored (multiple directories can be configured).
Default: ${yarn.log.dir}/userlogs
(6) yarn.nodemanager.log.retain-seconds
Parameter Explanation: The maximum time that logs are retained on the NodeManager (only applies when log aggregation is not enabled).
Default: 10800 (3 hours)
(7) yarn.nodemanager.aux-services
Parameter Explanation: Auxiliary services running on the NodeManager. This must be set to mapreduce_shuffle in order to run MapReduce programs.
Default: ""
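Since all of these parameters end up as property entries in yarn-site.xml, the following Python sketch shows one way to render a few of them in that name/value layout. The helper build_yarn_site is invented for this example and the memory and vcore numbers are placeholders, so treat the output as a starting point rather than a recommended configuration.

```python
import xml.etree.ElementTree as ET


def build_yarn_site(properties):
    """Render a dict of settings as Hadoop-style <configuration> XML."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Example values only; choose numbers that match your own hardware.
xml_text = build_yarn_site({
    "yarn.nodemanager.resource.memory-mb": 4096,
    "yarn.nodemanager.resource.cpu-vcores": 4,
    "yarn.nodemanager.aux-services": "mapreduce_shuffle",
})
print(xml_text)
```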
Original link: http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-configurations-resourcemanager-nodemanager/