Before you configure these parameters, make sure you fully understand what each one does, so that misuse does not cause problems for the cluster. All of these parameters are configured in yarn-site.xml.
1. ResourceManager-related configuration parameters
(1) yarn.resourcemanager.address
Parameter explanation: The address that the ResourceManager exposes to clients. Clients use this address to submit applications to the RM, kill applications, and so on.
Default value: ${yarn.resourcemanager.hostname}:8032
(2) yarn.resourcemanager.scheduler.address
Parameter explanation: The address that the ResourceManager exposes to ApplicationMasters. An ApplicationMaster uses this address to request resources from the RM, release resources, and so on.
Default value: ${yarn.resourcemanager.hostname}:8030
(3) yarn.resourcemanager.resource-tracker.address
Parameter explanation: The address that the ResourceManager exposes to NodeManagers. A NodeManager uses this address to send heartbeats to the RM, pick up tasks, and so on.
Default value: ${yarn.resourcemanager.hostname}:8031
(4) yarn.resourcemanager.admin.address
Parameter explanation: The address that the ResourceManager exposes to administrators. Administrators send administrative commands to the RM through this address.
Default value: ${yarn.resourcemanager.hostname}:8033
(5) yarn.resourcemanager.webapp.address
Parameter explanation: The address of the ResourceManager's external web UI. Users can open this address in a browser to view information about the cluster.
Default value: ${yarn.resourcemanager.hostname}:8088
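The five address parameters above all default to ${yarn.resourcemanager.hostname} plus a fixed port. As a rough illustration of how those defaults expand, here is a minimal Python sketch. The helper name default_rm_addresses and the hostname rm.example.com are made up for this example and are not part of YARN.

```python
# Default ports for the ResourceManager addresses listed above.
DEFAULT_PORTS = {
    "yarn.resourcemanager.address": 8032,
    "yarn.resourcemanager.scheduler.address": 8030,
    "yarn.resourcemanager.resource-tracker.address": 8031,
    "yarn.resourcemanager.admin.address": 8033,
    "yarn.resourcemanager.webapp.address": 8088,
}


def default_rm_addresses(hostname):
    """Expand ${yarn.resourcemanager.hostname}:<port> for each address parameter."""
    return {name: f"{hostname}:{port}" for name, port in DEFAULT_PORTS.items()}


if __name__ == "__main__":
    # "rm.example.com" is a hypothetical hostname used only for illustration.
    for name, address in default_rm_addresses("rm.example.com").items():
        print(f"{name} = {address}")
```

In practice this means that setting yarn.resourcemanager.hostname alone is often enough, and the individual addresses only need to be overridden when a non-default port is wanted.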
(6) yarn.resourcemanager.scheduler.class
Parameter explanation: The main class of the resource scheduler to enable. The FIFO Scheduler, Capacity Scheduler, and Fair Scheduler are currently available.
Default value: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(7) yarn.resourcemanager.resource-tracker.client.thread-count
Parameter explanation: The number of handler threads that process RPC requests from NodeManagers.
Default value: 50
(8) yarn.resourcemanager.scheduler.client.thread-count
Parameter explanation: The number of handler threads that process RPC requests from ApplicationMasters.
Default value: 50
(9) yarn.scheduler.minimum-allocation-mb/yarn.scheduler.maximum-allocation-mb
Parameter explanation: The minimum/maximum amount of memory that a single container request can ask for. For example, with these set to 1024 and 3072, each task of a MapReduce job can request at least 1024 MB and at most 3072 MB of memory.
Default value: 1024/8192
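To make the 1024/3072 example concrete, the following Python sketch shows one way a memory request could be checked against the two limits. It assumes that requests below the minimum are raised to the minimum, that requests are rounded up to a multiple of the minimum, and that requests above the maximum are rejected. The exact rules depend on the scheduler and the Hadoop version, so treat this as an approximation; normalize_memory_request is a hypothetical helper, not a YARN API.

```python
import math


def normalize_memory_request(requested_mb, min_mb=1024, max_mb=8192):
    """Approximate how a container memory request is fitted to the limits.

    Assumption: the request is rounded up to a multiple of min_mb and a
    request above max_mb is rejected. Real schedulers may differ in detail.
    """
    if requested_mb > max_mb:
        raise ValueError(
            f"requested {requested_mb} MB exceeds the maximum allocation of {max_mb} MB"
        )
    # Raise the request to at least the minimum, then round up to a multiple of it.
    rounded = math.ceil(max(requested_mb, min_mb) / min_mb) * min_mb
    # Never hand out more than the configured maximum.
    return min(rounded, max_mb)


if __name__ == "__main__":
    for request in (512, 1500, 3072):
        granted = normalize_memory_request(request, min_mb=1024, max_mb=3072)
        print(f"requested {request} MB, granted {granted} MB")
```

Under these assumptions a 512 MB request is raised to 1024 MB, and a 4096 MB request fails when the maximum is 3072 MB.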
(10) yarn.scheduler.minimum-allocation-vcores/yarn.scheduler.maximum-allocation-vcores
Parameter explanation: The minimum/maximum number of virtual CPUs that a single container request can ask for. For example, with these set to 1 and 4, each task of a MapReduce job can request at least 1 and at most 4 virtual CPUs. For an explanation of what a virtual CPU is, see my article "YARN Resource Scheduler Profiler."
Default value: 1/32
(11) yarn.resourcemanager.nodes.include-path/yarn.resourcemanager.nodes.exclude-path
Parameter explanation: The NodeManager whitelist and blacklist. If some NodeManagers turn out to be problematic, for example because they fail often or tasks on them fail at a high rate, you can add them to the blacklist. Note that changes to these two parameters can take effect dynamically by invoking a refresh command (for example, yarn rmadmin -refreshNodes).
Default value: ""
(12) yarn.resourcemanager.nodemanagers.heartbeat-interval-ms
Parameter explanation: NodeManager heartbeat interval
Default value: 1000 (ms)
2. NodeManager-related configuration parameters
(1) yarn.nodemanager.resource.memory-mb
Parameter explanation: The total physical memory available to the NodeManager. Note that once this parameter is set, it cannot be modified dynamically while the NodeManager is running. In addition, the default value is 8192 MB, and even if the machine has less than 8192 MB of memory, YARN will still behave as if that much were available (silly, isn't it?). Therefore, this value must be configured explicitly. However, Apache is already working on making this parameter dynamically modifiable.
Default value: 8192
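The pitfall described above, where the default of 8192 MB is larger than what the machine actually has, can be caught with a simple sanity check before starting the NodeManager. The following is only a sketch: is_overcommitted and max_containers are hypothetical helpers, and the machine's physical memory is passed in as a plain number rather than detected.

```python
def is_overcommitted(configured_mb, machine_mb):
    """Return True if yarn.nodemanager.resource.memory-mb exceeds the machine's RAM."""
    return configured_mb > machine_mb


def max_containers(configured_mb, container_mb):
    """How many containers of a given size fit, counting memory only (a simplification)."""
    return configured_mb // container_mb


if __name__ == "__main__":
    # A machine with 4096 MB of RAM that is left at the 8192 MB default.
    if is_overcommitted(configured_mb=8192, machine_mb=4096):
        print("memory-mb is larger than the machine's physical memory")
    print(max_containers(configured_mb=8192, container_mb=1024), "containers of 1024 MB")
```

A real value would normally also leave some memory for the operating system and for other daemons on the same machine.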
(2) yarn.nodemanager.vmem-pmem-ratio
Parameter explanation: The maximum amount of virtual memory that can be used for each 1 MB of physical memory used.
Default value: 2.1
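As a quick worked example of this ratio, the virtual memory ceiling for a container is its physical memory allocation multiplied by the ratio. The Python sketch below shows the arithmetic. The function names are hypothetical, and how the limit is enforced is not shown here.

```python
def virtual_memory_limit_mb(physical_mb, vmem_pmem_ratio=2.1):
    """Virtual memory ceiling for a container with the given physical allocation."""
    return physical_mb * vmem_pmem_ratio


def exceeds_virtual_limit(used_vmem_mb, physical_mb, vmem_pmem_ratio=2.1):
    """True if the observed virtual memory usage is above the ceiling."""
    return used_vmem_mb > virtual_memory_limit_mb(physical_mb, vmem_pmem_ratio)


if __name__ == "__main__":
    limit = virtual_memory_limit_mb(1024)
    print(f"a 1024 MB container may use up to about {limit:.1f} MB of virtual memory")
```

With the default ratio of 2.1, a container with 1024 MB of physical memory gets a virtual memory ceiling of roughly 2150 MB.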
(3) yarn.nodemanager.resource.cpu-vcores
Parameter explanation: The total number of virtual CPUs available to the NodeManager.
Default value: 8
(4) yarn.nodemanager.local-dirs
Parameter explanation: The location where intermediate results are stored, similar to mapred.local.dir in 1.0. Note that this parameter is typically set to multiple directories so that the disk I/O load is spread across them.
Default value: ${hadoop.tmp.dir}/nm-local-dir
(5) yarn.nodemanager.log-dirs
Parameter explanation: The location where logs are stored (multiple directories can be configured).
Default value: ${yarn.log.dir}/userlogs
(6) yarn.nodemanager.log.retain-seconds
Parameter explanation: The maximum time that logs are kept on the NodeManager (only applies when log aggregation is not enabled).
Default value: 10800 (3 hours)
(7) yarn.nodemanager.aux-services
Parameter explanation: The auxiliary services that run on the NodeManager. You need to configure mapreduce_shuffle in order to run MapReduce programs.
Default value: ""
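To tie the parameters together, here is a small Python sketch that renders a handful of them in the property/name/value layout that yarn-site.xml uses. The helper render_yarn_site is hypothetical, and the values (4096 MB, 4 vcores, the /data1 and /data2 paths) are placeholders chosen for illustration, not recommendations.

```python
import xml.etree.ElementTree as ET


def render_yarn_site(properties):
    """Render a dict of parameter names and values as yarn-site.xml content."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # ET.indent is only available on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values; the directory paths are hypothetical.
    example = {
        "yarn.nodemanager.resource.memory-mb": 4096,
        "yarn.nodemanager.resource.cpu-vcores": 4,
        "yarn.nodemanager.local-dirs": "/data1/nm-local-dir,/data2/nm-local-dir",
        "yarn.nodemanager.aux-services": "mapreduce_shuffle",
    }
    print(render_yarn_site(example))
```

Parameters that are left out of the file simply keep the default values listed above.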