Spark Monitoring and Tuning


I. Spark runtime architecture:

In distributed mode, Spark uses a master/slave architecture. The master is the driver node, which is responsible for central coordination and for scheduling the worker (executor) nodes.

The slaves are the executor nodes.

The driver and its executors are together known as a Spark application. A Spark application is launched on the cluster's machines through a cluster manager.


II. Driver and executor tasks:

Driver tasks: convert the user program into tasks and schedule those tasks on the executors;

Executor tasks: run the tasks that make up the Spark job and return the results to the driver; provide in-memory storage for RDDs that the program caches.
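
The division of labor can be sketched with a small program. This is only an illustration, assuming Spark is on the classpath; the object name DriverExecutorSketch and the method sumOfSquares are hypothetical. It uses a local[2] master, where the driver and the executor threads share one JVM, so the roles are logical here rather than separate machines.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; not part of the original article.
object DriverExecutorSketch {
  def sumOfSquares(sc: SparkContext, upTo: Int): Long = {
    // Driver side: this only describes the computation; nothing runs yet.
    val squares = sc.parallelize(1 to upTo, numSlices = 4)
      .map(n => n.toLong * n) // the function body runs inside tasks on executors
      .cache()                // executors keep computed partitions in memory
    // The action makes the driver build tasks, schedule them on executors,
    // and combine the partial results that come back.
    squares.reduce(_ + _)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("DriverExecutorSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try println(sumOfSquares(sc, 100))
    finally sc.stop()
  }
}
```

Because the RDD is cached, a second action on it would be served from executor memory instead of recomputing the map step.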


III. Cluster manager:

The cluster manager launches the executor nodes and, in some deployments, the driver node as well. Commonly used cluster managers are Hadoop YARN, Apache Mesos, and the standalone cluster manager that ships with Spark.
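
In practice the cluster manager is selected through the master URL. The helper below is a hypothetical sketch (clusterManagerFor is not a Spark API) that maps the usual URL forms to the manager they select; it is meant only to make the naming scheme concrete.

```scala
// Hypothetical helper; Spark itself performs this dispatch internally.
object MasterUrlSketch {
  def clusterManagerFor(master: String): String = master match {
    case m if m == "local" || m.startsWith("local[") => "none (local mode)"
    case m if m.startsWith("spark://")               => "Spark standalone"
    case m if m.startsWith("mesos://")               => "Apache Mesos"
    case m if m == "yarn" || m.startsWith("yarn-")   => "Hadoop YARN"
    case _                                           => "unknown"
  }

  def main(args: Array[String]): Unit =
    Seq("local[4]", "spark://host:7077", "mesos://host:5050", "yarn")
      .foreach(m => println(s"$m -> ${clusterManagerFor(m)}"))
}
```

The host names and ports in main are placeholders, not values from the article.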


IV. Format of spark-submit:

bin/spark-submit [options] <app jar | python file> [app options]
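
Here [options] are consumed by spark-submit itself, while everything after the application JAR is handed unchanged to the application's main method. As a minimal sketch of the receiving side, the hypothetical object below (AppOptionsSketch, with made-up --input and --output options) parses "--key value" pairs from those trailing arguments.

```scala
// Hypothetical application entry point; option names are illustrative only.
object AppOptionsSketch {
  // Turns Array("--input", "a.txt", "--output", "out") into Map(input -> a.txt, output -> out).
  // Pairs whose first element does not start with "--" are ignored.
  def parseAppOptions(args: Array[String]): Map[String, String] =
    args.sliding(2, 2).collect {
      case Array(key, value) if key.startsWith("--") => key.drop(2) -> value
    }.toMap

  def main(args: Array[String]): Unit = {
    val options = parseAppOptions(args)
    println(s"input=${options.getOrElse("input", "<none>")}")
    println(s"output=${options.getOrElse("output", "<none>")}")
  }
}
```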



V. Spark Performance Tuning:

This section covers how to tune and debug Spark workloads in a production environment. Configuration can be supplied in three ways.

1. Set run-time configuration options in the application itself, using the SparkConf class:

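import org.apache.spark.{SparkConf, SparkContext}

// Create a configuration and set properties as key/value pairs
val conf = new SparkConf()
conf.set("spark.app.name", "My Spark App")
conf.set("spark.master", "local[4]")
conf.set("spark.ui.port", "36000")

// Create the SparkContext from this configuration
val sc = new SparkContext(conf)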
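
As a rough sketch of the same idea, SparkConf also offers chained setters, and values can be read back before any SparkContext exists. This assumes Spark is on the classpath; the object name ConfSketch is hypothetical, and `new SparkConf(false)` is used so that spark.* JVM system properties from the environment are not loaded.

```scala
import org.apache.spark.SparkConf

// Hypothetical example object; not part of the original article.
object ConfSketch {
  // Builds the configuration without loading spark.* system properties,
  // so the result depends only on the calls below.
  def build(): SparkConf =
    new SparkConf(false)
      .setAppName("My Spark App")    // equivalent to set("spark.app.name", ...)
      .setMaster("local[4]")         // equivalent to set("spark.master", ...)
      .set("spark.ui.port", "36000")

  def main(args: Array[String]): Unit = {
    val conf = build()
    // toDebugString lists the explicitly set key=value pairs, one per line.
    println(conf.toDebugString)
  }
}
```

Printing the configuration this way is a quick check on which values the application actually set.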


2. When submitting an application with spark-submit, pass properties with the --conf option.

For example:

bin/spark-submit --class com.vip.SimpleClass --master local[4] --name "My Spark App" --conf spark.ui.port=36000 myApp.jar


3. Specify the path to a configuration file with spark-submit's --properties-file flag:

bin/spark-submit --class com.vip.SimpleClass --properties-file my-config.conf myApp.jar
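
The article does not show the contents of my-config.conf. The sketch below assumes it follows the layout of Spark's conf/spark-defaults.conf, where each line holds a key and a value separated by whitespace. The sample values are hypothetical, and java.util.Properties is used here only to approximate how such a file is read.

```scala
import java.io.StringReader
import java.util.Properties

// Hypothetical example object; the file contents are invented for illustration.
object PropertiesFileSketch {
  val sample: String =
    """# lines starting with '#' are comments
      |spark.master     local[4]
      |spark.app.name   My Spark App
      |spark.ui.port    36000
      |""".stripMargin

  // Parses key/value lines; the first run of whitespace separates key from value.
  def parse(text: String): Properties = {
    val props = new Properties()
    props.load(new StringReader(text))
    props
  }

  def main(args: Array[String]): Unit = {
    val props = parse(sample)
    Seq("spark.master", "spark.app.name", "spark.ui.port").foreach { key =>
      println(s"$key -> ${props.getProperty(key)}")
    }
  }
}
```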


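Because there are three ways to set a parameter, the same property can receive conflicting values. Spark resolves such conflicts by priority, which runs from high to low in the order 1, 2, 3: values set explicitly on a SparkConf in application code override flags passed to spark-submit, which in turn override values in the properties file. In the event of a conflict, method 1 prevails.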
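
The effect of this ordering can be modeled with plain maps. This is a toy model of the rule, not Spark's actual implementation; the object ConfPrecedence, the method resolve, and all sample values are hypothetical.

```scala
// Toy model of configuration precedence; not Spark's implementation.
object ConfPrecedence {
  type Settings = Map[String, String]

  // With ++, entries from the right-hand map replace entries with the same key,
  // so sources are combined from lowest to highest priority.
  def resolve(propertiesFile: Settings, submitFlags: Settings, sparkConfInCode: Settings): Settings =
    propertiesFile ++ submitFlags ++ sparkConfInCode

  def main(args: Array[String]): Unit = {
    val fromFile  = Map("spark.ui.port" -> "4040", "spark.master" -> "local[2]")
    val fromFlags = Map("spark.ui.port" -> "36001", "spark.master" -> "local[4]")
    val fromCode  = Map("spark.ui.port" -> "36000")
    resolve(fromFile, fromFlags, fromCode).toSeq.sorted
      .foreach { case (key, value) => println(s"$key=$value") }
  }
}
```

In this model spark.ui.port takes the value set in code, while spark.master, which the code does not set, falls back to the spark-submit flag.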


This article is from the "Snowflake" blog; please retain this source attribution: http://6216083.blog.51cto.com/6206083/1852832
