Spark Start Mode

Source: Internet
Author: User

1. How Spark submits a task


1), Spark on YARN:

$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --num-executors 3 \
  --driver-memory 4g \
  --executor-memory 2g \
  --executor-cores 1 \
  --queue thequeue \
  lib/spark-examples*.jar \
  10
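For context, the SparkPi example estimates pi by Monte Carlo sampling; the final argument (10 above) sets how many partitions the sampling is split across. A minimal plain-Python sketch of the same idea, no Spark required:

```python
import random

# Monte Carlo estimate of pi, the same idea SparkPi distributes over its
# partitions: sample points in the unit square and count how many fall
# inside the quarter circle of radius 1.
def estimate_pi(n_samples, seed=0):
    rng = random.Random(seed)
    inside = sum(
        1
        for _ in range(n_samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * inside / n_samples

print(estimate_pi(100_000))  # close to 3.14159
```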

2), Passing local jars when submitting a task on YARN: in yarn-cluster mode, the driver runs on a different machine than the client, so SparkContext.addJar does not work out of the box with files that are local to the client. To make client-local files available to SparkContext.addJar, include them with the --jars option in the launch command.

$ ./bin/spark-submit --class my.main.Class \
  --master yarn-cluster \
  --jars my-other-jar.jar,my-other-other-jar.jar \
  my-main-jar.jar \
  app_arg1 app_arg2

Test the Pi example program that ships with Spark:

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --num-executors 1 \
  --driver-memory 1g \
  --executor-memory 1g \
  --executor-cores 1 \
  lib/spark-examples*.jar

3), spark-submit:

spark-submit test for pi:

The spark-submit script in Spark's bin subdirectory is the tool for submitting programs to run on a cluster. We use it here to compute an approximation of pi. The command is as follows:

./bin/spark-submit --master spark://spark113:7077 \
  --class org.apache.spark.examples.SparkPi \
  --name spark-pi \
  --executor-memory 400M \
  --driver-memory 512M \
  /home/hadoop/spark-1.0.0/examples/target/scala-2.10/spark-examples-1.0.0-hadoop2.0.0-cdh4.5.0.jar

spark-submit test:

/home/hadoop/spark/spark-1.3.0-bin-hadoop2.4/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://192.168.6.71:7077 \
  --executor-memory 100m \
  --executor-cores 1 \
  1000

(The path to the application jar should also appear before the final 1000 argument.)

4), start spark-shell in cluster mode:

./spark-shell --master spark://hadoop1:7077 --executor-memory 500m


2, Spark start modes:

1), start Spark in local mode: ./spark-shell --master local[2] Note: the number in brackets sets how many worker threads to use
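local[N] runs Spark in a single JVM with N worker threads. As a loose analogy only (plain Python, not Spark's API), scheduling independent tasks across a fixed pool of N threads looks like this:

```python
from concurrent.futures import ThreadPoolExecutor

# Loose analogy to Spark's local[N]: run independent tasks on N threads
# in one process, much as local mode schedules tasks on N JVM threads.
def run_local(n_threads, tasks):
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(lambda task: task(), tasks))

results = run_local(2, [lambda: 1 + 1, lambda: 2 * 3, lambda: sum(range(4))])
print(results)  # [2, 6, 6]
```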

2), start Spark in cluster mode:

[hadoop@hadoop1 spark-1.3.0-bin-hadoop2.4]$ ./bin/spark-shell --master spark://hadoop1:7077 --executor-memory 500m Note: this startup mode gives the executor on each machine that spark-shell runs against 500m of memory

spark-shell --master yarn-client --driver-memory 10g --num-executors --executor-memory 20g --executor-cores 3 --queue spark
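The --driver-memory and --executor-memory flags above take JVM-style sizes such as 500m or 4g. A small illustrative parser (parse_jvm_size is a hypothetical helper for this page, not part of Spark) shows what those suffixes mean in bytes:

```python
# Hypothetical helper: convert JVM-style size strings like "500m" or "4g"
# (as passed to --executor-memory / --driver-memory) into bytes.
def parse_jvm_size(s: str) -> int:
    units = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}
    s = s.strip().lower()
    if s and s[-1] in units:
        return int(s[:-1]) * units[s[-1]]
    return int(s)  # no suffix: plain byte count

print(parse_jvm_size("500m"))  # 524288000
print(parse_jvm_size("4g"))    # 4294967296
```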

3), start Spark in the Python interpreter: bin/pyspark --master local[3]

4), start Spark in the R interpreter: bin/sparkR --master local[2]

5), start Spark on YARN:

yarn-cluster: $ ./bin/spark-shell --master yarn-cluster (note that interactive shells normally require client mode; cluster mode applies to spark-submit)

yarn-client: $ ./bin/spark-shell --master yarn-client

spark-sql --master yarn-client --driver-memory 10g --num-executors --executor-memory 20g --executor-cores 3 --queue spark

spark-sql --master spark://master:7077 --driver-memory 10g --executor-memory 20g --driver-cores 3
