Spark on YARN memory allocation problem

Source: Internet
Author: User
Problem description

While testing Spark on YARN, I ran into some puzzling memory allocation behavior, described below.

Configure the following parameters in $SPARK_HOME/conf/spark-env.sh:

SPARK_EXECUTOR_INSTANCES=4   # number of executor processes started in the YARN cluster

SPARK_EXECUTOR_MEMORY=2G     # memory allocated to each executor process

SPARK_DRIVER_MEMORY=1G       # memory allocated to the Spark driver process

Run $SPARK_HOME/bin/spark-sql --master yarn to start the spark-sql interactive command line in yarn-client mode (that is, the driver runs locally, not in a YARN container). The log shows the following memory information for the ApplicationMaster and the executors:

The log shows that the ApplicationMaster's memory is 896MB, which includes 384MB of memory overhead. Five executors are started: the first has 530.3MB of available memory, and each of the remaining executors has 1060.3MB.

Looking at resource usage on the YARN UI: there are 5 containers in total, occupying 13G of memory. One NodeManager runs 2 containers occupying 4G (the ApplicationMaster's container takes 1G, the other takes 3G); the other three NodeManagers each run 1 container occupying 3G.

Looking at the Executors page on the Spark UI: there are 5 executors. The driver runs on the local server where the spark-sql command was executed, and the other 4 run in the YARN cluster. The driver's available storage memory is 530.3MB; each of the other 4 has 1060.3MB (consistent with the log).

This raises the following questions:

1. The minimum memory allocated to a container is determined by yarn.scheduler.minimum-allocation-mb, which defaults to 1G, and the YARN UI confirms this. But why does Spark's log show the ApplicationMaster's actual memory as 896MB, i.e. 512MB plus 384MB of overhead? Where does the 512MB come from, and how is the 384MB calculated?

2. The Spark configuration file specifies 2G of memory for each executor; why do the log and the Spark UI show 1060.3MB?

3. The driver's memory is configured as 1G; why does the Spark UI show 530.3MB?

4. Why is each YARN container allocated 3G of memory instead of the 2G requested for each executor?

Problem resolution

After some research, I found that several concepts here are easily confused. They are sorted out below; the numbers correspond to the questions above:

1. When Spark's YARN client asks the ResourceManager to submit the job and start the ApplicationMaster, it checks the deploy mode. In cluster mode, the ApplicationMaster's memory is the same as the driver memory; otherwise it is determined by spark.yarn.am.memory, whose default is 512MB. We are using yarn-client mode, so the ApplicationMaster's actual memory is 512MB.

The 384MB is additional memory (overhead) that the client requests for the ApplicationMaster, calculated as follows:

The overhead is read from a configuration parameter (spark.yarn.driver.memoryOverhead in cluster mode, spark.yarn.am.memoryOverhead otherwise). If that parameter is not set, the overhead is the larger of two values: the ApplicationMaster's memory multiplied by a fixed factor, and a default minimum overhead.

In Spark 1.4.1, MEMORY_OVERHEAD_FACTOR defaults to 0.10 (previously 0.07) and MEMORY_OVERHEAD_MIN defaults to 384. We did not set spark.yarn.driver.memoryOverhead or spark.yarn.am.memoryOverhead, and amMemory = 512MB (determined by spark.yarn.am.memory), so the overhead is max(512 × 0.10, 384) = 384MB.

The executor's memory overhead is calculated the same way, except that it does not depend on the deploy mode: it is configured by spark.yarn.executor.memoryOverhead.
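The overhead rule above can be sketched as a small calculation (a sketch based on the constants cited for Spark 1.4.1; the function name and Python form are illustrative, not Spark's actual code):

```python
# Sketch of Spark 1.4.1's memory-overhead rule for the ApplicationMaster and
# executors: overhead = max(memory * factor, minimum), unless the relevant
# spark.yarn.*.memoryOverhead parameter is set explicitly.

MEMORY_OVERHEAD_FACTOR = 0.10  # was 0.07 before Spark 1.4
MEMORY_OVERHEAD_MIN = 384      # MB

def memory_overhead(memory_mb, configured_overhead_mb=None):
    """Overhead (MB) Spark adds to a YARN container request."""
    if configured_overhead_mb is not None:
        return configured_overhead_mb
    return max(memory_mb * MEMORY_OVERHEAD_FACTOR, MEMORY_OVERHEAD_MIN)

# yarn-client mode: AM memory defaults to 512MB (spark.yarn.am.memory)
print(memory_overhead(512))    # 384 -> AM request is 512 + 384 = 896MB
# executor with SPARK_EXECUTOR_MEMORY=2G
print(memory_overhead(2048))   # 384 -> executor request is 2048 + 384 = 2432MB
```

With a 4G executor the factor wins instead: max(4096 × 0.10, 384) = 409.6MB, which matches the validation run later in this article.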

2. What the log and the Spark UI display is the memory inside the executor used to cache results (the storage memory), not the executor's total memory. This portion is calculated by the following formula:

storage memory = Runtime.getRuntime.maxMemory × spark.storage.safetyFraction (default 0.9) × spark.storage.memoryFraction (default 0.6)

With Runtime.getRuntime.maxMemory near 2048MB, it is normal for the storage memory to be slightly smaller than 2048 × 0.9 × 0.6 = 1105.92MB, because the JVM's maxMemory is itself slightly less than the configured heap size.

3. As in point 2 above, the driver's storage memory is slightly smaller than 1024 × 0.9 × 0.6 = 552.96MB.
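The storage-memory figures above can be reproduced from the formula (a sketch using Spark 1.x's default spark.storage.memoryFraction = 0.6 and spark.storage.safetyFraction = 0.9; the real calculation starts from Runtime.getRuntime.maxMemory, which is slightly below the configured heap, hence the slightly smaller observed values):

```python
# Spark 1.x storage memory: maxMemory * safetyFraction * memoryFraction.
STORAGE_SAFETY_FRACTION = 0.9  # spark.storage.safetyFraction
STORAGE_MEMORY_FRACTION = 0.6  # spark.storage.memoryFraction

def storage_memory_mb(heap_mb):
    """Upper bound on storage memory, assuming maxMemory == configured heap."""
    return heap_mb * STORAGE_SAFETY_FRACTION * STORAGE_MEMORY_FRACTION

print(storage_memory_mb(1024))  # ~552.96  -- driver (1G); UI shows 530.3MB
print(storage_memory_mb(2048))  # ~1105.92 -- executor (2G); UI shows 1060.3MB
```

The gap between the computed upper bound and the UI figure is the difference between the configured heap and the JVM's actual maxMemory.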

4. As mentioned earlier, Spark requests some extra memory (the memory overhead) for each container, so the memory actually requested for an executor container is 2048 + max(2048 × 0.10, 384) = 2432MB. When allocating resources, YARN normalizes the request: the amount granted must be an integer multiple of the minimum allocation (rounded up), where the minimum is specified by yarn.scheduler.minimum-allocation-mb. The 2432MB request is therefore rounded up to 3G per container.
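YARN's resource normalization, rounding a request up to a multiple of yarn.scheduler.minimum-allocation-mb, can be sketched as (the function name is illustrative):

```python
import math

def normalize_mb(requested_mb, minimum_allocation_mb=1024):
    """Round a container request up to a multiple of the minimum allocation,
    as YARN's resource normalizer does (minimum-allocation-mb defaults to 1024)."""
    return math.ceil(requested_mb / minimum_allocation_mb) * minimum_allocation_mb

# Executor: 2048MB + 384MB overhead = 2432MB -> normalized up to 3G
print(normalize_mb(2048 + 384))  # 3072
# ApplicationMaster: 512MB + 384MB = 896MB -> fits in the 1G minimum
print(normalize_mb(512 + 384))   # 1024
```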

Validation

To verify the rules above, I modified the configuration parameters:

SPARK_EXECUTOR_INSTANCES=4   # number of executor processes started in the YARN cluster

SPARK_EXECUTOR_MEMORY=4G     # memory allocated to each executor process

SPARK_DRIVER_MEMORY=3G       # memory allocated to the Spark driver process

and specified the spark.yarn.am.memory parameter when starting spark-sql:

bin/spark-sql --master yarn --conf spark.yarn.am.memory=1024m

Look at the log information again:

YARN UI status:

Spark UI Executors information:

As expected, the ApplicationMaster's actual memory is 1024MB (1408 − 384), and its container in YARN is 2G (1408MB is greater than 1G, so YARN allocated 2G under the resource normalization principle).

Similarly, the driver's storage memory is 3G × 0.9 × 0.6 = 1.62G and each executor's storage memory is 4G × 0.9 × 0.6 = 2.16G. Each executor container occupies 5G of memory (4096 + max(4096 × 0.10, 384) = 4505.6MB, which is greater than 4G, so YARN allocated 5G under resource normalization).

The total memory footprint on the YARN cluster is 2 + 4 × 5 = 22G.
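Putting the pieces together, the validation figures can be reproduced with the same two rules (a sketch; the constants are those cited above for Spark 1.4.1, and the helper names are illustrative):

```python
import math

def overhead(mem_mb, factor=0.10, minimum=384):
    """Spark's container memory overhead: max(memory * factor, minimum)."""
    return max(mem_mb * factor, minimum)

def normalize(mem_mb, min_alloc=1024):
    """YARN resource normalization: round up to a multiple of the minimum."""
    return math.ceil(mem_mb / min_alloc) * min_alloc

am_container = normalize(1024 + overhead(1024))        # 1024 + 384 = 1408 -> 2048
executor_container = normalize(4096 + overhead(4096))  # 4096 + 409.6 = 4505.6 -> 5120
total_gb = (am_container + 4 * executor_container) / 1024
print(am_container, executor_container, total_gb)      # 2048 5120 22.0
```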

