Installing stand-alone Spark on Linux (CentOS 7 + Spark 2.1.1 + Scala 2.12.2)



1 Installing Scala (required by Spark)

1.2 Configure environment variables for Scala

1.3 Verify Scala

2 Download and decompress Spark

3 Spark-related configuration

3.1 Configuring environment variables

3.2 Configure the files in the conf directory

3.2.1 Create the spark-env.sh file

3.2.2 Create the slaves file

4 Test Spark in stand-alone mode

4.1 Run the Spark sample program in stand-alone mode

4.2 Start the Spark shell command-line window

Keywords: Linux, CentOS, Spark, Scala, Java

Versions: CentOS 7, Spark 2.1.1, Scala 2.12.2, JDK 1.8

 

Description: For a stand-alone Spark installation, the machine only needs Scala and the JDK; other components such as Hadoop and ZooKeeper do not need to be installed. If you want to install a Spark cluster on top of Hadoop, refer to this blog post:

http://blog.csdn.net/pucao_cug/article/details/72353701

1 Installing Scala (required by Spark)

For downloading and decompressing Scala, refer to the corresponding chapters of the blog post below; the steps and methods are exactly the same:

http://blog.csdn.net/pucao_cug/article/details/72353701
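
If you prefer not to open the referenced post, a minimal sketch of the download-and-extract step looks like this (the mirror URL and the /opt/scala target directory are assumptions; adjust them to your environment):

wget https://downloads.lightbend.com/scala/2.12.2/scala-2.12.2.tgz
mkdir -p /opt/scala
tar -zxvf scala-2.12.2.tgz -C /opt/scala

The rest of this tutorial assumes Scala ends up in /opt/scala/scala-2.12.2.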

1.2 Configure environment variables for Scala

Edit the /etc/profile file and add one line of configuration:

export SCALA_HOME=/opt/scala/scala-2.12.2

Add the following to the PATH variable in the same file:

${SCALA_HOME}/bin

After the addition is complete, my /etc/profile configuration is as follows:

export JAVA_HOME=/opt/java/jdk1.8.0_121
export SCALA_HOME=/opt/scala/scala-2.12.2
export CLASS_PATH=.:${JAVA_HOME}/lib:$CLASS_PATH
export PATH=.:${JAVA_HOME}/bin:${SCALA_HOME}/bin:$PATH

When the environment variable configuration is complete, execute the following command:

source /etc/profile

1.3 Verify Scala

Execute the command:

scala -version
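
If the environment variables are set correctly, the version banner should look roughly like this (the exact copyright wording may differ):

Scala code runner version 2.12.2 -- Copyright 2002-2017, LAMP/EPFL and Lightbend, Inc.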


2 Download and decompress Spark

The download and decompression of Spark can follow the corresponding chapters of the blog post below; the steps and methods are identical:

http://blog.csdn.net/pucao_cug/article/details/72353701

On a stand-alone Spark machine, you only need to install the JDK, Scala, and Spark.
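
As a minimal sketch, assuming the Apache archive mirror and the /opt/spark target directory used throughout this post, the download-and-extract step looks like this:

wget https://archive.apache.org/dist/spark/spark-2.1.1/spark-2.1.1-bin-hadoop2.7.tgz
mkdir -p /opt/spark
tar -zxvf spark-2.1.1-bin-hadoop2.7.tgz -C /opt/spark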


3 Spark-related configuration

Description: Because we build the Spark cluster on top of the Hadoop cluster, I have Spark installed on every Hadoop node, and each node needs the configuration in the following steps, starting with the Spark cluster's master machine. Here I start on hserver1.

3.1 Configuring environment variables

Edit the /etc/profile file and add:

export SPARK_HOME=/opt/spark/spark-2.1.1-bin-hadoop2.7

Add the following to the PATH variable in the same file:

${SPARK_HOME}/bin

After the modification is complete, the contents of my /etc/profile file are:

export JAVA_HOME=/opt/java/jdk1.8.0_121
export ZK_HOME=/opt/zookeeper/zookeeper-3.4.10
export SCALA_HOME=/opt/scala/scala-2.12.2
export SPARK_HOME=/opt/spark/spark-2.1.1-bin-hadoop2.7
export CLASS_PATH=.:${JAVA_HOME}/lib:$CLASS_PATH
export PATH=.:${JAVA_HOME}/bin:${SPARK_HOME}/bin:${ZK_HOME}/bin:${SCALA_HOME}/bin:$PATH
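
Once you have sourced the file (next step), a quick way to confirm that the variables took effect is, for example:

echo $SPARK_HOME
echo $SCALA_HOME
spark-submit --version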


When the edit is complete, execute the command:

source /etc/profile

3.2 Configure the files in the conf directory

Configure the files in the /opt/spark/spark-2.1.1-bin-hadoop2.7/conf directory.

3.2.1 Create the spark-env.sh file

Execute the command to enter the /opt/spark/spark-2.1.1-bin-hadoop2.7/conf directory:

cd /opt/spark/spark-2.1.1-bin-hadoop2.7/conf

Create a spark-env.sh file from the template that ships with Spark; the command is:

cp spark-env.sh.template spark-env.sh


Edit the spark-env.sh file and add the following configuration (adjust the paths to your own environment):

export SCALA_HOME=/opt/scala/scala-2.12.2
export JAVA_HOME=/opt/java/jdk1.8.0_121
export SPARK_HOME=/opt/spark/spark-2.1.1-bin-hadoop2.7
export SPARK_MASTER_IP=hserver1
export SPARK_EXECUTOR_MEMORY=1g
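
The spark-env.sh.template file documents many more optional settings; for reference, a couple of commonly tuned ones look like this (illustrative values, not required for stand-alone mode):

export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=1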



3.2.2 Create the slaves file

Execute the command to enter the /opt/spark/spark-2.1.1-bin-hadoop2.7/conf directory:

cd /opt/spark/spark-2.1.1-bin-hadoop2.7/conf

Create a slaves file from the template that ships with Spark; the command is:

cp slaves.template slaves


Edit the slaves file so that its content is:

localhost
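
The slaves file is only consulted when you launch the stand-alone daemons with the sbin scripts. Nothing needs to be started for the tests below, but for reference the launch commands that read these conf files would look like this (a sketch based on the scripts that ship with Spark 2.1.1):

./sbin/start-master.sh
./sbin/start-slaves.sh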


4 Test Spark in stand-alone mode

4.1 Run the Spark sample program in stand-alone mode

When the above configuration is complete, you do not need to start anything; just execute the commands below.

Go to the Spark home directory, that is, execute the following command:

cd /opt/spark/spark-2.1.1-bin-hadoop2.7

Execute the command to run the demo program that calculates Pi:

./bin/run-example SparkPi
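
The example also accepts an optional argument for the number of partitions to split the calculation into; a larger run could look like this (illustrative value):

./bin/run-example SparkPi 100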


After a few seconds, execution completes.

The key lines of the console output are (abridged; the full log is much longer):

[root@hserver1 ~]# cd /opt/spark/spark-2.1.1-bin-hadoop2.7
[root@hserver1 spark-2.1.1-bin-hadoop2.7]# ./bin/run-example SparkPi
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/05/17 11:43:21 INFO SparkContext: Running Spark version 2.1.1
17/05/17 11:43:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/05/17 11:43:27 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.27.144:4040
17/05/17 11:43:29 INFO SparkContext: Starting job: reduce at SparkPi.scala:38
17/05/17 11:43:30 INFO TaskSchedulerImpl: Adding task set 0.0 with 10 tasks
... (the ten tasks of stage 0 run and finish on the local executor) ...
17/05/17 11:43:32 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 2.589 s
17/05/17 11:43:32 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 3.388028 s
Pi is roughly 3.1393111393111393
17/05/17 11:43:32 INFO SparkUI: Stopped Spark web UI at http://192.168.27.144:4040
17/05/17 11:43:32 INFO SparkContext: Successfully stopped SparkContext
17/05/17 11:43:33 INFO ShutdownHookManager: Shutdown hook called
17/05/17 11:43:33 INFO ShutdownHookManager: Deleting directory /tmp/spark-c05c16db-967b-4f7c-91bd-61358c6e8fd7
[root@hserver1 spark-2.1.1-bin-hadoop2.7]#
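
The same example class can also be launched through spark-submit; the class name and jar path below match the ones shown in the log above:

./bin/spark-submit --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.11-2.1.1.jar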

4.2 Start the Spark shell command-line window

Go to the Spark home directory, that is, execute the following command:

cd /opt/spark/spark-2.1.1-bin-hadoop2.7

Execute the command to start the shell:

./bin/spark-shell
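
Once the scala> prompt appears, any small job confirms that the shell is working, for example the expression sc.parallelize(1 to 1000).sum(). As a sketch, the same expression can also be run non-interactively by piping it in:

echo 'println(sc.parallelize(1 to 1000).sum())' | ./bin/spark-shell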

