This article does not cover mixing Spark with YARN; the goal is simply a pure standalone Spark environment that makes learning and understanding easier in the early stages.
Create a Spark service run account
# useradd smile
The smile account is the running account for the Spark service.
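To double-check the account before going further, a quick verification from the shell:
# confirm the account exists and note its uid/gid
id smile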
Download the installation package and test
Under the root account, download the latest installation package. Note that this is not the source tarball but the binary package, built for Hadoop 2.6 and later.
wget http://mirrors.cnnic.cn/apache/spark/spark-1.5.1/spark-1.5.1-bin-hadoop2.6.tgz
Extract it into the directory below, set the owner and group to the smile account, and then create a symlink.
tar zxvf spark-1.5.1-bin-hadoop2.6.tgz
chown -R smile:smile spark-1.5.1-bin-hadoop2.6
ln -s spark-1.5.1-bin-hadoop2.6 spark
chown -h smile:smile spark
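A quick listing should now show the symlink and the smile ownership before moving on:
ls -ld spark spark-1.5.1-bin-hadoop2.6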
Go to the directory
cd /data/slot0/spark/
./sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /data/slot0/spark-1.5.1-bin-hadoop2.6/sbin/../logs/spark-smile-org.apache.spark.deploy.master.Master-1-10-149-11-157.out
It started successfully. View the web interface at http://your-host:8080
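If no browser is handy, the UI port can be probed from the shell instead; this sketch assumes curl is installed and should print 200 when the master is up:
curl -s -o /dev/null -w "%{http_code}\n" http://your-host:8080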
The test was successful. Stopping it is just as easy:
$ sbin/stop-master.sh
stopping org.apache.spark.deploy.master.Master
Build a highly available cluster based on ZooKeeper, using three nodes as masters
Now we will set up a master cluster of 3 servers and use ZooKeeper for leader election, ensuring there is always one master acting as leader while the other two stand by as backups.
On the first server, go to the spark/conf directory and copy spark-env.sh.template to spark-env.sh.
Then add the following settings:
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=10.149.11.146:2181,10.149.11.147:2181,10.149.11.148:2181 -Dspark.deploy.zookeeper.dir=/vehicle_spark"
export SPARK_DAEMON_JAVA_OPTS
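The same spark-env.sh must be present on all three master nodes. A minimal sketch for copying it over (host2 and host3 are placeholder hostnames; passwordless ssh and an identical install path are assumed):
for h in host2 host3; do
  scp conf/spark-env.sh $h:/data/slot0/spark/conf/
done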
Start the service as a master:
./sbin/start-master.sh
Run start-master.sh on the next two nodes in turn. At this point all 3 nodes serve the master status page at http://ip:8080.
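To see which master is currently the leader without opening a browser, a rough check works; it assumes the status words ALIVE or STANDBY appear in the page HTML, and host1 through host3 are placeholders:
for h in host1 host2 host3; do
  echo -n "$h: "
  curl -s http://$h:8080 | grep -m1 -oE 'ALIVE|STANDBY'
done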
Start the subsequent nodes as slaves
Start the slave process on the other Spark servers (a loop for starting many at once is sketched after the notes below):
./sbin/start-slave.sh spark://host1:7077,host2:7077,host3:7077
Attention:
1. host1, host2, and host3 must be the hostnames shown on the masters' 8080 status pages; if you use IP addresses instead, the connection will be rejected.
2. Once a slave starts successfully, you can open the worker's UI on port 8081, which displays the current master leader.
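As referenced above, to start workers on several machines in one go, a small loop works; it assumes passwordless ssh, the same install path on every worker, and placeholder hostnames:
masters=spark://host1:7077,host2:7077,host3:7077
for w in worker1 worker2 worker3; do
  ssh $w /data/slot0/spark/sbin/start-slave.sh $masters
done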
Now the 8080 pages of all 3 masters show the worker's status.
Test connecting to the master with the shell
$ ./bin/spark-shell --master spark://10-149-11-*:7077,10-149-11-*:7077,10-149-11-*:7077
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
15/11/16 13:22:37 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
Spark context available as sc.
15/11/16 13:22:39 WARN Connection: BONECP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:22:39 WARN Connection: BONECP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:23:15 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/16 13:23:15 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/11/16 13:23:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/16 13:23:21 WARN Connection: BONECP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:23:22 WARN Connection: BONECP specified but not present in CLASSPATH (or one of dependencies)
SQL context available as sqlContext.

scala>
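Beyond the interactive session, a quick non-interactive smoke test can be piped into spark-shell (host1 through host3 stand in for the master hostnames); it should print 500500.0:
echo 'println(sc.parallelize(1 to 1000).sum())' | ./bin/spark-shell --master spark://host1:7077,host2:7077,host3:7077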
Check the web UI and ZooKeeper; everything works as expected.
Use an environment variable to set the master URL
spark-shell can obtain the master information by reading the MASTER environment variable. For convenience, instead of typing the long parameter each time, add the following to ~/.bashrc:
export MASTER=spark://10-149-*-*:7077,10-149-*-*:7077,10-149-*-*:7077
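After reloading the profile, spark-shell picks the URL up on its own:
source ~/.bashrc
./bin/spark-shell   # connects using $MASTER; no --master flag needed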