When the environment variable configuration is complete, execute the following command:
source /etc/profile
1.3 Verify Scala
Execute the command:
scala -version
As shown in the figure:
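If Scala is installed correctly, a version banner is printed. For Scala 2.12.2 the output should look roughly like the following (the exact copyright line may differ):
Scala code runner version 2.12.2 -- Copyright 2002-2017, LAMP/EPFL and Lightbend, Inc.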
2 Download and unpack Spark
Downloading and unpacking Spark is covered in the download-and-unpack sections of this earlier post; the steps and method are identical:
http://blog.csdn.net/pucao_cug/article/details/72353701
For a stand-alone Spark machine, you only need to install the JDK, Scala, and Spark.
As shown in the figure:
3 Spark-related configuration
Note: Because we are building the Spark cluster on top of the Hadoop cluster, I have Spark installed on every Hadoop node, and every node needs the configuration described in the following steps. Start with the Spark cluster's master machine; here that is hserver1.
3.1 Configure environment variables
Edit the /etc/profile file and add:
export SPARK_HOME=/opt/spark/spark-2.1.1-bin-hadoop2.7
Also add the following to the PATH variable in that file, so the Spark scripts can be run from any directory:
${SPARK_HOME}/bin
After the modification is complete, the contents of my /etc/profile file are:
export JAVA_HOME=/opt/java/jdk1.8.0_121
export ZK_HOME=/opt/zookeeper/zookeeper-3.4.10
export SCALA_HOME=/opt/scala/scala-2.12.2
export SPARK_HOME=/opt/spark/spark-2.1.1-bin-hadoop2.7
export CLASSPATH=.:${JAVA_HOME}/lib:$CLASSPATH
export PATH=.:${JAVA_HOME}/bin:${SPARK_HOME}/bin:${ZK_HOME}/bin:${SCALA_HOME}/bin:$PATH
As shown in the figure:
When the edit is complete, execute the command:
source /etc/profile
3.2 Configure the files in the conf directory
Configure the files in the /opt/spark/spark-2.1.1-bin-hadoop2.7/conf directory.
3.2.1 Create the spark-env.sh file
Execute this command to enter the /opt/spark/spark-2.1.1-bin-hadoop2.7/conf directory:
cd /opt/spark/spark-2.1.1-bin-hadoop2.7/conf
Create a spark-env.sh file from the template that Spark provides; the command is:
cp spark-env.sh.template spark-env.sh
As shown in the figure:
Edit the spark-env.sh file and add the following configuration (adjust the paths to your own installation):
export SCALA_HOME=/opt/scala/scala-2.12.2
export JAVA_HOME=/opt/java/jdk1.8.0_121
export SPARK_HOME=/opt/spark/spark-2.1.1-bin-hadoop2.7
export SPARK_MASTER_IP=hserver1        # hostname (or IP) of the cluster's master node
export SPARK_EXECUTOR_MEMORY=1G        # memory per executor; raise this if your machines have RAM to spare
As shown in the figure:
3.2.2 Create the slaves file
Execute this command to enter the /opt/spark/spark-2.1.1-bin-hadoop2.7/conf directory:
cd /opt/spark/spark-2.1.1-bin-hadoop2.7/conf
Create a slaves file from the template that Spark provides; the command is:
cp slaves.template slaves
As shown in the figure:
Edit the slaves file so that its content is:
localhost
As shown in the figure:
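With localhost in the slaves file, the single worker runs on the same machine as the master, which is exactly what the stand-alone test in the next section needs. On a real cluster you would instead list one worker hostname per line; for example, with two hypothetical worker machines named hserver2 and hserver3, the slaves file would read:
hserver2
hserver3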
4 Test Spark in stand-alone mode
4.1 Run the Spark sample program in stand-alone mode
Once the above configuration is complete, you do not need to start anything; just execute the commands below.
Enter Spark's home directory, that is, execute the following command:
cd /opt/spark/spark-2.1.1-bin-hadoop2.7
Execute this command to run the demo program that calculates Pi:
./bin/run-example SparkPi
As shown in the figure:
After a few seconds, execution completes.
As shown in the figure:
The complete output is:
[root@hserver1 ~]# cd /opt/spark/spark-2.1.1-bin-hadoop2.7
[root@hserver1 spark-2.1.1-bin-hadoop2.7]# ./bin/run-example SparkPi
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/05/17 11:43:21 INFO SparkContext: Running Spark version 2.1.1
17/05/17 11:43:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/05/17 11:43:25 INFO SecurityManager: Changing view acls to: root
17/05/17 11:43:25 INFO SecurityManager: Changing modify acls to: root
17/05/17 11:43:25 INFO SecurityManager: Changing view acls groups to:
17/05/17 11:43:25 INFO SecurityManager: Changing modify acls groups to:
17/05/17 11:43:25 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
17/05/17 11:43:25 INFO Utils: Successfully started service 'sparkDriver' on port 42970.
17/05/17 11:43:26 INFO SparkEnv: Registering MapOutputTracker
17/05/17 11:43:26 INFO SparkEnv: Registering BlockManagerMaster
17/05/17 11:43:26 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/05/17 11:43:26 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/05/17 11:43:26 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-fa083902-0f9c-4e44-9712-93fe301a4895
17/05/17 11:43:26 INFO MemoryStore: MemoryStore started with capacity 413.9 MB
17/05/17 11:43:26 INFO SparkEnv: Registering OutputCommitCoordinator
17/05/17 11:43:27 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/05/17 11:43:27 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.27.144:4040
17/05/17 11:43:27 INFO SparkContext: Added JAR file:/opt/spark/spark-2.1.1-bin-hadoop2.7/examples/jars/scopt_2.11-3.3.0.jar at spark://192.168.27.144:42970/jars/scopt_2.11-3.3.0.jar with timestamp 1494992607195
17/05/17 11:43:27 INFO SparkContext: Added JAR file:/opt/spark/spark-2.1.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.1.1.jar at spark://192.168.27.144:42970/jars/spark-examples_2.11-2.1.1.jar with timestamp 1494992607196
17/05/17 11:43:27 INFO Executor: Starting executor ID driver on host localhost
17/05/17 11:43:27 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43732.
17/05/17 11:43:27 INFO NettyBlockTransferService: Server created on 192.168.27.144:43732
17/05/17 11:43:27 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/05/17 11:43:27 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.27.144, 43732, None)
17/05/17 11:43:27 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.27.144:43732 with 413.9 MB RAM, BlockManagerId(driver, 192.168.27.144, 43732, None)
17/05/17 11:43:27 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.27.144, 43732, None)
17/05/17 11:43:27 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.27.144, 43732, None)
17/05/17 11:43:28 INFO SharedState: Warehouse path is 'file:/opt/spark/spark-2.1.1-bin-hadoop2.7/spark-warehouse/'.
17/05/17 11:43:29 INFO SparkContext: Starting job: reduce at SparkPi.scala:38
17/05/17 11:43:29 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 10 output partitions
17/05/17 11:43:29 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
17/05/17 11:43:29 INFO DAGScheduler: Parents of final stage: List()
17/05/17 11:43:29 INFO DAGScheduler: Missing parents: List()
17/05/17 11:43:29 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
17/05/17 11:43:29 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 413.9 MB)
17/05/17 11:43:30 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1167.0 B, free 413.9 MB)
17/05/17 11:43:30 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.27.144:43732 (size: 1167.0 B, free: 413.9 MB)
17/05/17 11:43:30 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:996
17/05/17 11:43:30 INFO DAGScheduler: Submitting 10 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34)
17/05/17 11:43:30 INFO TaskSchedulerImpl: Adding task set 0.0 with 10 tasks
17/05/17 11:43:30 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 6090 bytes)
17/05/17 11:43:30 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/05/17 11:43:30 INFO Executor: Fetching spark://192.168.27.144:42970/jars/scopt_2.11-3.3.0.jar with timestamp 1494992607195
17/05/17 11:43:30 INFO TransportClientFactory: Successfully created connection to 192.168.27.144:42970 after 129 ms (0 ms spent in bootstraps)
17/05/17 11:43:30 INFO Utils: Fetching spark://192.168.27.144:42970/jars/scopt_2.11-3.3.0.jar to /tmp/spark-c05c16db-967b-4f7c-91bd-61358c6e8fd7/userFiles-475afa39-559a-43f1-9b42-42e4c68c0562/fetchFileTemp3940062650819619408.tmp
17/05/17 11:43:31 INFO Executor: Adding file:/tmp/spark-c05c16db-967b-4f7c-91bd-61358c6e8fd7/userFiles-475afa39-559a-43f1-9b42-42e4c68c0562/scopt_2.11-3.3.0.jar to class loader
17/05/17 11:43:31 INFO Executor: Fetching spark://192.168.27.144:42970/jars/spark-examples_2.11-2.1.1.jar with timestamp 1494992607196
17/05/17 11:43:31 INFO Utils: Fetching spark://192.168.27.144:42970/jars/spark-examples_2.11-2.1.1.jar to /tmp/spark-c05c16db-967b-4f7c-91bd-61358c6e8fd7/userFiles-475afa39-559a-43f1-9b42-42e4c68c0562/fetchFileTemp2400538401087766507.tmp
17/05/17 11:43:31 INFO Executor: Adding file:/tmp/spark-c05c16db-967b-4f7c-91bd-61358c6e8fd7/userFiles-475afa39-559a-43f1-9b42-42e4c68c0562/spark-examples_2.11-2.1.1.jar to class loader
17/05/17 11:43:31 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1114 bytes result sent to driver
17/05/17 11:43:31 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 6090 bytes)
17/05/17 11:43:31 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
17/05/17 11:43:31 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1594 ms on localhost (executor driver) (1/10)
17/05/17 11:43:31 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1114 bytes result sent to driver
17/05/17 11:43:31 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 6090 bytes)
17/05/17 11:43:31 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
17/05/17 11:43:31 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 239 ms on localhost (executor driver) (2/10)
17/05/17 11:43:32 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2). 1041 bytes result sent to driver
17/05/17 11:43:32 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 6090 bytes)
17/05/17 11:43:32 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
17/05/17 11:43:32 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 135 ms on localhost (executor driver) (3/10)
17/05/17 11:43:32 INFO Executor: Finished task 3.0 in stage 0.0 (TID 3). 1041 bytes result sent to driver
17/05/17 11:43:32 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 6090 bytes)
17/05/17 11:43:32 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
17/05/17 11:43:32 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in … ms on localhost (executor driver) (4/10)
17/05/17 11:43:32 INFO Executor: Finished task 4.0 in stage 0.0 (TID 4). 1041 bytes result sent to driver
17/05/17 11:43:32 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, localhost, executor driver, partition 5, PROCESS_LOCAL, 6090 bytes)
17/05/17 11:43:32 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 102 ms on localhost (executor driver) (5/10)
17/05/17 11:43:32 INFO Executor: Running task 5.0 in stage 0.0 (TID 5)
17/05/17 11:43:32 INFO Executor: Finished task 5.0 in stage 0.0 (TID 5). 1041 bytes result sent to driver
17/05/17 11:43:32 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, localhost, executor driver, partition 6, PROCESS_LOCAL, 6090 bytes)
17/05/17 11:43:32 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 114 ms on localhost (executor driver) (6/10)
17/05/17 11:43:32 INFO Executor: Running task 6.0 in stage 0.0 (TID 6)
17/05/17 11:43:32 INFO Executor: Finished task 6.0 in stage 0.0 (TID 6). 1114 bytes result sent to driver
17/05/17 11:43:32 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, localhost, executor driver, partition 7, PROCESS_LOCAL, 6090 bytes)
17/05/17 11:43:32 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in … ms on localhost (executor driver) (7/10)
17/05/17 11:43:32 INFO Executor: Running task 7.0 in stage 0.0 (TID 7)
17/05/17 11:43:32 INFO Executor: Finished task 7.0 in stage 0.0 (TID 7). 1041 bytes result sent to driver
17/05/17 11:43:32 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, localhost, executor driver, partition 8, PROCESS_LOCAL, 6090 bytes)
17/05/17 11:43:32 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 117 ms on localhost (executor driver) (8/10)
17/05/17 11:43:32 INFO Executor: Running task 8.0 in stage 0.0 (TID 8)
17/05/17 11:43:32 INFO Executor: Finished task 8.0 in stage 0.0 (TID 8). 1041 bytes result sent to driver
17/05/17 11:43:32 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, localhost, executor driver, partition 9, PROCESS_LOCAL, 6090 bytes)
17/05/17 11:43:32 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in … ms on localhost (executor driver) (9/10)
17/05/17 11:43:32 INFO Executor: Running task 9.0 in stage 0.0 (TID 9)
17/05/17 11:43:32 INFO Executor: Finished task 9.0 in stage 0.0 (TID 9). 1041 bytes result sent to driver
17/05/17 11:43:32 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in … ms on localhost (executor driver) (10/10)
17/05/17 11:43:32 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/05/17 11:43:32 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 2.589 s
17/05/17 11:43:32 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 3.388028 s
Pi is roughly 3.1393111393111393
17/05/17 11:43:32 INFO SparkUI: Stopped Spark web UI at http://192.168.27.144:4040
17/05/17 11:43:32 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/05/17 11:43:32 INFO MemoryStore: MemoryStore cleared
17/05/17 11:43:32 INFO BlockManager: BlockManager stopped
17/05/17 11:43:32 INFO BlockManagerMaster: BlockManagerMaster stopped
17/05/17 11:43:32 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/05/17 11:43:32 INFO SparkContext: Successfully stopped SparkContext
17/05/17 11:43:33 INFO ShutdownHookManager: Shutdown hook called
17/05/17 11:43:33 INFO ShutdownHookManager: Deleting directory /tmp/spark-c05c16db-967b-4f7c-91bd-61358c6e8fd7
[root@hserver1 spark-2.1.1-bin-hadoop2.7]#
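The "Pi is roughly 3.139..." line comes from a Monte Carlo estimate: the example scatters random points over a square and counts how many land inside the inscribed circle. The following is a simplified Scala sketch of the idea (the object name and sample count are illustrative, not the exact SparkPi source, which ships with Spark's examples):

import scala.math.random
import org.apache.spark.sql.SparkSession

object PiSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("PiSketch").getOrCreate()
    val slices = 10                  // matches the "10 output partitions" in the log above
    val n = 100000 * slices          // total number of random points to sample
    val inside = spark.sparkContext.parallelize(1 until n, slices).map { _ =>
      val x = random * 2 - 1         // random point in the square [-1, 1] x [-1, 1]
      val y = random * 2 - 1
      if (x * x + y * y <= 1) 1 else 0   // 1 if the point falls inside the unit circle
    }.reduce(_ + _)
    // area(circle) / area(square) = pi / 4, so pi is approximately 4 * inside / n
    println(s"Pi is roughly ${4.0 * inside / n}")
    spark.stop()
  }
}

Because the points are random, each run prints a slightly different estimate, which is why the value above is 3.139 rather than exactly 3.14159.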
4.2 Start the Spark shell command-line window
Enter Spark's home directory, that is, execute the following command:
cd /opt/spark/spark-2.1.1-bin-hadoop2.7
Execute this command to start the shell:
./bin/spark-shell
As shown in the figure:
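Once the prompt appears, the shell has already created a SparkContext for you, bound to the variable sc (and, in Spark 2.x, a SparkSession bound to spark). As a quick smoke test you can type a small job at the scala> prompt; the lines below are just an illustration, and any small RDD computation works:

val data = sc.parallelize(1 to 1000)            // distribute the numbers 1..1000 as an RDD
val evenCount = data.filter(_ % 2 == 0).count() // count the even ones; evenCount should be 500
println(s"even numbers: $evenCount")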