java.lang.OutOfMemoryError: GC overhead limit exceeded and java.lang.OutOfMemoryError: Java heap space may appear when Spark executes a task.
The most direct fix is to raise the following two parameters in spark-env.sh as high as the cluster allows:
export SPARK_EXECUTOR_MEMORY=6000m
export SPARK_DRIVER_MEMORY=7000m
Note that the two settings must respect this size ordering:
SPARK_EXECUTOR_MEMORY < SPARK_DRIVER_MEMORY < the memory available on each NodeManager in the YARN cluster (see the yarn-site.xml sketch below)
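The per-NodeManager ceiling is governed by the yarn.nodemanager.resource.memory-mb property in yarn-site.xml; a minimal sketch for checking it, where the 8192 MB value is only an assumed example:
<!-- yarn-site.xml: total memory the NodeManager can hand out to containers -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value> <!-- assumed example: 8 GB per NodeManager -->
</property>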
To summarize the JVM parameter settings for each role in Spark:
(1) Driver JVM parameters:
-Xmx and -Xms: in yarn-client mode, the SPARK_DRIVER_MEMORY value in the spark-env.sh file is read by default, with -Xmx and -Xms set to the same size; in yarn-cluster mode, the corresponding JVM parameter value in spark.driver.extraJavaOptions in the spark-defaults.conf file is read.
PermSize: in yarn-client mode, the JAVA_OPTS="-XX:MaxPermSize=256m $OUR_JAVA_OPTS" value in the spark-class file is read by default; in yarn-cluster mode, the JVM parameter value corresponding to spark.driver.extraJavaOptions in the spark-defaults.conf file is read.
GC options: in yarn-client mode, JAVA_OPTS in the spark-class file is read by default; in yarn-cluster mode, the corresponding parameter value of spark.driver.extraJavaOptions in the spark-defaults.conf file is read (see the combined example after this list).
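In yarn-cluster mode, the PermSize and GC flags above can sit on a single spark-defaults.conf line; a minimal sketch, where the specific flag values are assumptions rather than recommendations:
# spark-defaults.conf: extra JVM flags for the driver in yarn-cluster mode
spark.driver.extraJavaOptions  -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC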
(2) Executor JVM parameters:
-Xmx and -Xms: in yarn-client mode, the SPARK_EXECUTOR_MEMORY value in the spark-env.sh file is read by default, with -Xmx and -Xms set to the same size; in yarn-cluster mode, the corresponding JVM parameter value in spark.executor.extraJavaOptions in the spark-defaults.conf file is read.
PermSize: in both modes, the JVM parameter value corresponding to spark.executor.extraJavaOptions in the spark-defaults.conf file is read.
GC options: in both modes, the JVM parameter value corresponding to spark.executor.extraJavaOptions in the spark-defaults.conf file is read (see the sketch after this list).
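Analogously, a minimal spark-defaults.conf sketch for the executor side; the flag values are assumed examples only:
# spark-defaults.conf: extra JVM flags for every executor (both modes)
spark.executor.extraJavaOptions  -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails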
(3) Executor count and CPU cores
In yarn-client mode, the number of executors is specified by SPARK_EXECUTOR_INSTANCES in spark-env.sh, and the number of cores per instance by SPARK_EXECUTOR_CORES. In yarn-cluster mode, the number of executors is specified by the --num-executors parameter of the spark-submit tool (default: 2 instances), and the number of CPU cores used by each executor by --executor-cores (default: 1 core); a full submission sketch follows.
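Putting the yarn-cluster flags together, a hedged spark-submit sketch; the class name, jar and resource numbers are placeholders, not values from the original post:
# yarn-cluster submission; com.example.MyApp and myapp.jar are placeholders
spark-submit \
  --master yarn-cluster \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 6000m \
  --driver-memory 7000m \
  --class com.example.MyApp \
  myapp.jar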
Reference document: http://www.cnblogs.com/Scott007/p/3889959.html