This article does not cover mixing Spark with YARN; the goal is simply a pure standalone Spark environment that makes learning and understanding easier in the early stages.
Create a Spark service run account
# useradd smile
The smile account is the running account for the Spark service.
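To double-check the account before going further, a quick verification from the shell:
# confirm the account exists and note its uid/gid
id smile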
Download the installation package and test
Under the root account, download the latest installation package. Note that this is not the source tarball but the binary package, built for Hadoop 2.6 and later.
wget http://mirrors.cnnic.cn/apache/spark/spark-1.5.1/spark-1.5.1-bin-hadoop2.6.tgz
Extract it into the directory below, set the owner and group to the smile account, and then create a symlink.
tar zxvf spark-1.5.1-bin-hadoop2.6.tgz
chown -R smile:smile spark-1.5.1-bin-hadoop2.6
ln -s spark-1.5.1-bin-hadoop2.6 spark
chown -h smile:smile spark
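A quick listing should now show the symlink and the smile ownership before moving on:
ls -ld spark spark-1.5.1-bin-hadoop2.6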
Go to the directory
cd /data/slot0/spark/
./sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /data/slot0/spark-1.5.1-bin-hadoop2.6/sbin/../logs/spark-smile-org.apache.spark.deploy.master.Master-1-10-149-11-157.out
It started successfully. View the web interface at http://your-host:8080
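If no browser is handy, the UI port can be probed from the shell instead; this sketch assumes curl is installed and should print 200 when the master is up:
curl -s -o /dev/null -w "%{http_code}\n" http://your-host:8080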
The test was successful. Stopping it is just as easy:
$ sbin/stop-master.sh
stopping org.apache.spark.deploy.master.Master
Build a highly available cluster based on ZooKeeper, using three nodes as masters
Now we will set up a master cluster of 3 servers and use ZooKeeper for leader election, ensuring there is always one master acting as leader while the other two stand by as backups.
On the first server, go to the spark/conf directory and copy spark-env.sh.template to spark-env.sh.
Then add the following settings:
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=10.149.11.146:2181,10.149.11.147:2181,10.149.11.148:2181 -Dspark.deploy.zookeeper.dir=/vehicle_spark"
export SPARK_DAEMON_JAVA_OPTS
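The same spark-env.sh must be present on all three master nodes. A minimal sketch for copying it over (host2 and host3 are placeholder hostnames; passwordless ssh and an identical install path are assumed):
for h in host2 host3; do
  scp conf/spark-env.sh $h:/data/slot0/spark/conf/
done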
Start the service as a master:
./sbin/start-master.sh
Run start-master.sh on the next two nodes in turn. At this point all 3 nodes serve the master status page at http://ip:8080.
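To see which master is currently the leader without opening a browser, a rough check works; it assumes the status words ALIVE or STANDBY appear in the page HTML, and host1 through host3 are placeholders:
for h in host1 host2 host3; do
  echo -n "$h: "
  curl -s http://$h:8080 | grep -m1 -oE 'ALIVE|STANDBY'
done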
Start the subsequent nodes as slaves
Start the slave process on the other Spark servers (a loop for starting many at once is sketched after the notes below):
./sbin/start-slave.sh spark://host1:7077,host2:7077,host3:7077
Attention:
1. host1, host2, and host3 must be the hostnames shown on the masters' 8080 status pages; if you use IP addresses instead, the connection will be rejected.
2. Once a slave starts successfully, you can open the worker's UI on port 8081, which displays the current master leader.
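As referenced above, to start workers on several machines in one go, a small loop works; it assumes passwordless ssh, the same install path on every worker, and placeholder hostnames:
masters=spark://host1:7077,host2:7077,host3:7077
for w in worker1 worker2 worker3; do
  ssh $w /data/slot0/spark/sbin/start-slave.sh $masters
done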
Now the 8080 pages of all 3 masters show the worker's status.
Test connecting to the master with the shell
$ ./bin/spark-shell --master spark://10-149-11-*:7077,10-149-11-*:7077,10-149-11-*:7077
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
15/11/16 13:22:37 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
Spark context available as sc.
15/11/16 13:22:39 WARN Connection: BONECP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:22:39 WARN Connection: BONECP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:23:15 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/16 13:23:15 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/11/16 13:23:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/16 13:23:21 WARN Connection: BONECP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:23:22 WARN Connection: BONECP specified but not present in CLASSPATH (or one of dependencies)
SQL context available as sqlContext.

scala>
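Beyond the interactive session, a quick non-interactive smoke test can be piped into spark-shell (host1 through host3 stand in for the master hostnames); it should print 500500.0:
echo 'println(sc.parallelize(1 to 1000).sum())' | ./bin/spark-shell --master spark://host1:7077,host2:7077,host3:7077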
Check the web UI and ZooKeeper; everything works as expected.
Use an environment variable to set the master URL
spark-shell can obtain the master information by reading the MASTER environment variable. For convenience, instead of typing the long parameter each time, add the following to ~/.bashrc:
export MASTER=spark://10-149-*-*:7077,10-149-*-*:7077,10-149-*-*:7077
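After reloading the profile, spark-shell picks the URL up on its own:
source ~/.bashrc
./bin/spark-shell   # connects using $MASTER; no --master flag needed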