Installing Spark requires that the JDK and Scala be installed first.
1. Create a directory
> mkdir /opt/spark
> cd /opt/spark
2. Unzip the archive and create a soft link
> tar zxvf spark-2.3.0-bin-hadoop2.7.tgz
> ln -s spark-2.3.0-bin-hadoop2.7 spark
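The soft link gives later configuration a stable /opt/spark/spark path that survives version upgrades. A minimal sketch of the pattern, using a scratch directory as a stand-in for /opt/spark:

```shell
# Create a symlink named "spark" pointing at the versioned directory,
# then verify where it points. $tmp stands in for /opt/spark.
tmp=$(mktemp -d)
mkdir "$tmp/spark-2.3.0-bin-hadoop2.7"
ln -s spark-2.3.0-bin-hadoop2.7 "$tmp/spark"
readlink "$tmp/spark"
```

On an upgrade, only the link target changes; every path that references /opt/spark/spark keeps working.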
4. Edit /etc/profile
> vi /etc/profile
Enter the content below:
export SPARK_HOME=/opt/spark/spark
export PATH=$PATH:$SPARK_HOME/bin
Then make it take effect:
> source /etc/profile
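The two export lines can be tried out without touching the real /etc/profile. A sketch that sources them from a temporary file (a stand-in for /etc/profile) and confirms both variables:

```shell
# Write the exports to a temp file, source it, and inspect the result.
# The temp file stands in for /etc/profile.
profile=$(mktemp)
cat > "$profile" <<'EOF'
export SPARK_HOME=/opt/spark/spark
export PATH=$PATH:$SPARK_HOME/bin
EOF
. "$profile"
echo "$SPARK_HOME"
case ":$PATH:" in *":/opt/spark/spark/bin:"*) echo "PATH ok" ;; esac
rm -f "$profile"
```

Note that PATH is extended rather than replaced, so existing commands remain reachable.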
5. Go to the configuration folder
> cd /opt/spark/spark/conf
6. Configure spark-env.sh
> cp spark-env.sh.template spark-env.sh
Enter the following in spark-env.sh:
export SCALA_HOME=/opt/scala/scala
export JAVA_HOME=/opt/java/jdk
export SPARK_HOME=/opt/spark/spark
export SPARK_MASTER_IP=hserver1
export SPARK_EXECUTOR_MEMORY=1G
Note: adjust the paths above to match your own installation.
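Since all of these values are site-specific, a quick sanity check that each configured directory actually exists can save debugging later. A small helper sketch (the `check_dir` function is ours, not part of Spark):

```shell
# Report whether each directory referenced in spark-env.sh exists.
check_dir() {
  if [ -d "$1" ]; then echo "ok: $1"; else echo "missing: $1"; fi
}
for d in /opt/scala/scala /opt/java/jdk /opt/spark/spark; do
  check_dir "$d"
done
```

Any "missing" line means the corresponding export in spark-env.sh needs to be corrected before starting Spark.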
7. Configure slaves
> cp slaves.template slaves
Enter the following in slaves:
localhost
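With `localhost` alone, the worker runs on the same machine as the master. For a real cluster, the slaves file would instead list one worker hostname per line, for example (these hostnames are placeholders):

hserver2
hserver3

Each listed host must be reachable from the master over passwordless SSH for the start scripts to launch the workers.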
8. Run the Spark sample
> cd /opt/spark/spark
> ./bin/run-example SparkPi 10
The following information is displayed (abridged; most of the INFO lines are omitted):
[aston@localhost spark]$ ./bin/run-example SparkPi 10
2018-06-04 22:37:25 WARN  Utils:66 - Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 192.168.199.150 instead (on interface wlp8s0b1)
2018-06-04 22:37:25 WARN  Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-06-04 22:37:25 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-06-04 22:37:25 INFO  SparkContext:54 - Running Spark version 2.3.0
2018-06-04 22:37:25 INFO  SparkContext:54 - Submitted application: Spark Pi
...
2018-06-04 22:37:26 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
...
2018-06-04 22:37:27 INFO  SparkContext:54 - Starting job: reduce at SparkPi.scala:38
...
2018-06-04 22:37:28 INFO  DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:38) finished in 0.800 s
2018-06-04 22:37:28 INFO  DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 0.945853 s
Pi is roughly 3.14023914023914
2018-06-04 22:37:28 INFO  SparkUI:54 - Stopped Spark web UI at http://192.168.199.150:4040
2018-06-04 22:37:28 INFO  SparkContext:54 - Successfully stopped SparkContext
2018-06-04 22:37:28 INFO  ShutdownHookManager:54 - Shutdown hook called
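The result line is easy to miss among the INFO messages. Filtering the output makes it obvious; the sketch below applies the filter to a saved copy of the lines above (with Spark installed, you could pipe `./bin/run-example SparkPi 10 2>&1` through the same `grep`):

```shell
# Extract the result line from the captured output above.
grep 'Pi is roughly' <<'EOF'
2018-06-04 22:37:28 INFO DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:38, took 0.945853 s
Pi is roughly 3.14023914023914
2018-06-04 22:37:28 INFO SparkContext:54 - Successfully stopped SparkContext
EOF
```

If the `Pi is roughly ...` line appears, the local installation is working.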
9. Run the Spark shell
> cd /opt/spark/spark
> ./bin/spark-shell
This completes a standalone Spark setup on Linux.