Spark Standalone cluster installation


This article does not cover running Spark on YARN; the goal is a pure Spark Standalone environment that is easier to understand in the early learning stage.

Create a Spark service account

# useradd smile

The smile account will be used to run the Spark service.
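To confirm the account exists before going on, a quick check (illustrative, not part of the original steps):

id smile
# example output; the numeric ids will differ on your system:
# uid=1001(smile) gid=1001(smile) groups=1001(smile)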


Download the installation package and test

As root, download the latest installation package. Note that this is the pre-built binary package (built for Hadoop 2.6 and later), not the source package:

wget http://mirrors.cnnic.cn/apache/spark/spark-1.5.1/spark-1.5.1-bin-hadoop2.6.tgz

Extract it to the directory below, set the owner and group to the smile account, then create a symlink:

tar zxvf spark-1.5.1-bin-hadoop2.6.tgz
chown -R smile:smile spark-1.5.1-bin-hadoop2.6
ln -s spark-1.5.1-bin-hadoop2.6 spark
chown -R smile:smile spark
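To verify the layout before continuing, a small sketch (the ls check is an illustrative addition):

ls -ld spark spark-1.5.1-bin-hadoop2.6
# spark should show up as a symlink owned by smile, e.g.:
# lrwxrwxrwx 1 smile smile ... spark -> spark-1.5.1-bin-hadoop2.6
# drwxr-xr-x ... smile smile ... spark-1.5.1-bin-hadoop2.6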

Go to the directory and start the master:

cd /data/slot0/spark/
./sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /data/slot0/spark-1.5.1-bin-hadoop2.6/sbin/../logs/spark-smile-org.apache.spark.deploy.master.Master-1-10-149-11-157.out

It started successfully. View the web interface at http://your-host:8080
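Besides the browser, you can check from the shell as well; a minimal sketch (jps ships with the JDK, and your-host is a placeholder):

jps | grep Master
# the standalone master runs as a JVM process named "Master"
curl -s -o /dev/null -w "%{http_code}\n" http://your-host:8080
# prints 200 when the web UI is up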



The test was successful. Shutting it down is just as easy:

$ sbin/stop-master.sh
stopping org.apache.spark.deploy.master.Master


Build a highly available cluster based on ZooKeeper, using three nodes as masters

Now we set up a master cluster of 3 servers, using ZooKeeper for leader election, so that there is always one master acting as leader while the other two remain standbys.

On the first server, go to the spark/conf directory and copy spark-env.sh.template to spark-env.sh.

Then add the following settings

SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=10.149.11.146:2181,10.149.11.147:2181,10.149.11.148:2181 -Dspark.deploy.zookeeper.dir=/vehicle_spark"
export SPARK_DAEMON_JAVA_OPTS
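Here spark.deploy.recoveryMode=ZOOKEEPER turns on ZooKeeper-based master recovery, spark.deploy.zookeeper.url lists the ZooKeeper ensemble members, and spark.deploy.zookeeper.dir is the znode path under which Spark stores its recovery state. All three masters need the same settings; a sketch of distributing the file (host2 and host3 are placeholder names for the other two masters):

scp conf/spark-env.sh smile@host2:/data/slot0/spark/conf/
scp conf/spark-env.sh smile@host3:/data/slot0/spark/conf/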

Start the service as a master:

./sbin/start-master.sh

Run start-master.sh on the next two nodes in turn; at this point all 3 nodes serve the master status page at http://ip:8080.

Start the subsequent nodes as slaves

Start the slave process on the other Spark servers:

./sbin/start-slave.sh spark://host1:7077,host2:7077,host3:7077

Attention:

1. host1, host2 and host3 must be the hostnames shown on the masters' 8080 status pages; if you use IP addresses instead, the connection will be rejected (see the sketch below).

2. Once a slave starts successfully, you can open the worker UI on port 8081, which displays the current master leader.

Now the 8080 pages of the 3 masters show the status of the workers.
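A hedged sketch of satisfying the hostname requirement from note 1; the addresses and names below are placeholders, and the names must match exactly what the master 8080 pages display:

# /etc/hosts on every node (placeholder addresses and names)
10.149.11.146  host1
10.149.11.147  host2
10.149.11.148  host3

# then the worker can register using the hostname-based URL
./sbin/start-slave.sh spark://host1:7077,host2:7077,host3:7077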


Test connecting to the masters with spark-shell

$ ./bin/spark-shell --master spark://10-149-11-*:7077,10-149-11-*:7077,10-149-11-*:7077
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
15/11/16 13:22:37 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
Spark context available as sc.
15/11/16 13:22:39 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:22:39 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:23:15 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/16 13:23:15 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/11/16 13:23:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/16 13:23:21 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:23:22 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
SQL context available as sqlContext.

scala>

Check the web UI and ZooKeeper; everything works.
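Since the whole point of this setup is high availability, it is worth a quick failover test as well. A minimal sketch, assuming you run it on the node that currently holds the leader role:

# stop the master on the current leader node
./sbin/stop-master.sh
# within the ZooKeeper failover window one standby master is elected leader:
# its 8080 page changes from STANDBY to ALIVE, the workers re-register with it,
# and the 8081 worker pages show the new leader
# restart the old master afterwards; it rejoins as a standby
./sbin/start-master.sh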


Use an environment variable to set the master URL

spark-shell can obtain the master information by reading an environment variable, so for convenience, instead of typing the long --master parameter every time, add the following to ~/.bashrc:

export MASTER=spark://10-149-*-*:7077,10-149-*-*:7077,10-149-*-*:7077
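After reloading the shell configuration, spark-shell picks up the URL automatically; for example:

source ~/.bashrc
./bin/spark-shell        # no --master flag needed; the MASTER variable is read
# a quick smoke test can even be piped in; the REPL evaluates it and exits:
echo 'sc.parallelize(1 to 1000).sum()' | ./bin/spark-shell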



