I. The Spark runtime architecture:
Spark's distributed runtime follows a master/slave pattern. The master is the driver node, which centrally coordinates and schedules work across the worker (executor) nodes.
The driver node and the executor nodes are collectively known as a Spark application. A Spark application is launched on the cluster's machines through the cluster manager.
II. Driver and executor duties:
Driver: converts the user program into the tasks that make up the Spark job and schedules them on the executors;
Executor: runs those tasks and provides in-memory storage for RDDs that the application asks to cache.
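A minimal sketch of this division of labor, assuming an existing SparkContext named sc and a hypothetical input file data.txt:

val lines = sc.textFile("data.txt") // the driver records the RDD lineage
lines.cache()                       // executors hold the cached partitions in memory
lines.count()                       // the driver schedules tasks; the executors run them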
III. Cluster manager:
The cluster manager is used to launch the driver node and the executor nodes. The common cluster managers are Hadoop YARN, Apache Mesos, and Spark's own built-in Standalone cluster manager.
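As a sketch, the choice of cluster manager is expressed through spark-submit's --master option; the host names, ports, and jar name below are illustrative assumptions:

bin/spark-submit --master spark://masterhost:7077 myApp.jar   # Spark's Standalone manager
bin/spark-submit --master yarn myApp.jar                      # Hadoop YARN
bin/spark-submit --master mesos://masterhost:5050 myApp.jar   # Apache Mesos
bin/spark-submit --master local[4] myApp.jar                  # local mode, 4 threads, no cluster manager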
IV. Format of spark-submit:
bin/spark-submit [options] <app jar | python file> [app options]
V. Spark performance tuning:
How to tune and debug Spark workloads in a production environment.
1. Adjust the Spark application's runtime configuration options in code, using the SparkConf class:
val conf = new SparkConf()
conf.set("spark.app.name", "My Spark App")
conf.set("spark.master", "local[4]")
conf.set("spark.ui.port", "36000")
val sc = new SparkContext(conf)
2. When submitting a task with spark-submit, use the --conf option. For example:
bin/spark-submit --class com.vip.SimpleClass --master local[4] --name "My Spark App" --conf spark.ui.port=36000 myApp.jar
3. Specify the path to a configuration file with spark-submit's --properties-file flag:
bin/spark-submit --class com.vip.SimpleClass --properties-file my-config.conf myApp.jar
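For illustration, my-config.conf could contain whitespace-separated key/value pairs mirroring the settings above (the values are assumptions):

spark.master    local[4]
spark.app.name  "My Spark App"
spark.ui.port   36000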
There are now three different ways to set the same parameter, so a priority order decides conflicts. From highest to lowest, it is method 1 (SparkConf set in code), then method 2 (flags passed to spark-submit), then method 3 (the properties file). In the event of a conflict, the value set in code on the SparkConf prevails; for example, if spark.ui.port is set to 36000 in code and to 4040 on the command line, the application uses 36000.
This article is from the "Snowflake" blog: http://6216083.blog.51cto.com/6206083/1852832