Spark OOM: Java heap space / GC overhead limit exceeded workaround

Source: Internet
Author: User
Tags: GC overhead limit exceeded

Problem Description:

When using Spark, two kinds of errors sometimes occur as the data volume grows:

java.lang.OutOfMemoryError: Java heap space

java.lang.OutOfMemoryError: GC overhead limit exceeded

I used to assume these errors meant the executors had too little memory, but careful analysis showed that it was not executor memory that was lacking; it was the driver memory that was insufficient. When a task is submitted with spark-submit in standalone client mode (client mode is the default deploy mode in a standalone deployment), our own program (the one containing main) acts as the driver. If no memory is assigned to the driver, it defaults to 512M. In that case, if the data being processed or loaded is large (I was loading data from Hive), the driver can run out of memory and the OOM errors above occur.

Workaround:

Reference: http://spark.apache.org/docs/latest/configuration.html

Method one: pass the --driver-memory <memSize> option to spark-submit to set the JVM heap size of the driver; other settable options can be listed with spark-submit --help. Note that in client mode the driver JVM has already started by the time the application's SparkConf is read, so spark.driver.memory cannot usefully be set from inside the application; it must be given on the command line or in the defaults file.

e.g.

./spark-submit \
  --master spark://master:7077 \
  --class $MAIN_CLASS \
  --executor-memory 3G \
  --total-executor-cores 10 \
  --driver-memory 2g \
  --name $APP_NAME \
  --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  $SPARK_APP_JAR

Method two: in the SPARK_HOME/conf/ directory, copy the spark-defaults.conf.template template file to a file named spark-defaults.conf in the same directory, then set the spark.driver.memory <memSize> property in it to change the driver memory size.

e.g.

spark.master                     spark://master:7077
spark.default.parallelism
spark.driver.memory              2g
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.sql.shuffle.partitions
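Options passed on the spark-submit command line take precedence over values in spark-defaults.conf, so the defaults file can hold a sane baseline while an individual job still raises the driver heap. A minimal sketch, reusing the placeholder names from the example above ($MAIN_CLASS, $SPARK_APP_JAR); the --verbose flag makes spark-submit print the effective configuration so the override can be confirmed:

```shell
# spark.driver.memory in spark-defaults.conf (2g above) is the baseline;
# the --driver-memory flag below overrides it for this one submission.
# --verbose prints the parsed arguments and effective Spark properties.
./spark-submit --verbose \
  --master spark://master:7077 \
  --driver-memory 4g \
  --class $MAIN_CLASS \
  $SPARK_APP_JAR
```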
