Running Spark on YARN has been supported since Spark 0.6.0.
Preparation: you need a Spark binary release built with YARN support (spark-on-yarn). Refer to the build configuration for details.

Environment variables:
SPARK_YARN_USER_ENV: optional; sets environment variables for the Spark processes running on YARN. Example: SPARK_YARN_USER_ENV="JAVA_HOME=/jdk64,FOO=bar".
SPARK_JAR: sets the location of the Spark assembly jar in HDFS, for example: export SPARK_JAR=hdfs://some/path.

Make sure the directory pointed to by HADOOP_CONF_DIR or YARN_CONF_DIR contains the configuration files of the Hadoop cluster. These files are used to connect to the YARN ResourceManager and to write data to HDFS. They are needed by the Spark installation from which tasks are submitted, i.e. wherever the spark-submit tool is used, so it is enough to configure them on that one machine.

There are two deploy modes:
yarn-cluster: the Spark driver runs inside an ApplicationMaster process started by the YARN cluster, and the client can go away after the application is initialized. Intended for production use.
yarn-client: the Spark driver runs in the client process, and the ApplicationMaster is only used to request resources from YARN. Intended for interactive and test use.

Unlike Spark standalone and Mesos modes, where the master address is passed explicitly via the master parameter, in YARN mode the ResourceManager address is read from the Hadoop configuration files. Therefore, in YARN mode the master parameter is simply "yarn-client" or "yarn-cluster".

Launching an application in yarn-cluster mode:
./bin/spark-submit --class path.to.your.Class --master yarn-cluster [options] <app jar> [app options]

Example:
SPARK_JAR=hdfs://hansight/libs/spark-assembly-1.0.2-hadoop2.4.0.2.1.4.0-632.jar \
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--num-executors 3 \
--driver-memory 4g \
--executor-memory 2g \
--executor-cores 1 \
lib/spark-examples*.jar \
10
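The preparation steps above can be pulled together into the submitting machine's environment. The following is a minimal sketch; the HADOOP_CONF_DIR and SPARK_JAR paths are illustrative assumptions, not values taken from this cluster:

```shell
# Point Spark at the Hadoop cluster configuration files
# (path is an assumption for illustration).
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Optional: host the Spark assembly jar in HDFS so it is not
# re-uploaded on every submission (path is an assumption).
export SPARK_JAR=hdfs://namenode:8020/user/spark/share/lib/spark-assembly.jar

# Optional: pass environment variables through to the Spark
# processes that run on YARN.
export SPARK_YARN_USER_ENV="JAVA_HOME=/jdk64,FOO=bar"

echo "$SPARK_YARN_USER_ENV"   # prints JAVA_HOME=/jdk64,FOO=bar
```

With these variables in place, the spark-submit examples in this article can be run without repeating the SPARK_JAR prefix on each command.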
Note: the command above starts a YARN client, which launches the default ApplicationMaster; SparkPi then runs as a child thread of the ApplicationMaster. The client periodically polls the ApplicationMaster for status updates and displays them on the console, and exits once the application has finished.

Launching an application in yarn-client mode:
./bin/spark-submit --master yarn-client [options] <app jar> [app options]
Only the value of the --master parameter changes to yarn-client; everything else is the same as in yarn-cluster mode.

Adding other jar dependencies in yarn-cluster mode: the driver and the client run on different machines, so SparkContext.addJar does not work out of the box with files that are local to the client, the way it does in local mode. To make SparkContext.addJar work with such files, list them with the --jars option on the launch command. For example:
$ ./bin/spark-submit --class my.main.Class \
--master yarn-cluster \
--jars my-other-jar.jar,my-other-other-jar.jar \
my-main-jar.jar \
app_arg1 app_arg2
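To make the "only --master changes" point concrete, here is the earlier SparkPi example rewritten for yarn-client mode. This is a sketch of a cluster submission (it needs a running Spark-on-YARN installation, so it is shown as a CLI fragment only); the jar path and resource sizes are reused from the yarn-cluster example above:

```shell
# Same SparkPi submission as before, but the driver now runs in
# this client process instead of inside the ApplicationMaster.
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-client \
  --num-executors 3 \
  --driver-memory 4g \
  --executor-memory 2g \
  --executor-cores 1 \
  lib/spark-examples*.jar \
  10
```

Because the driver stays in the client process, its console output (including the computed value of Pi) appears directly in the submitting terminal, which is why this mode suits interactive testing.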
O&M series: 05, Spark on YARN