Creating a Spark cluster (compute layer) on the Zybo cluster
1. Every node boots from the same filesystem image, so the MAC addresses and IP addresses conflict. Give each node its own values:
vi /etc/profile
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:$PATH
export JAVA_HOME=/usr/lib/jdk1.7.0_55
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_HOME=/root/hadoop-2.4.0
ifconfig eth1 hw ether 00:0a:35:00:01:03
ifconfig eth1 192.168.1.3/24 up
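Since each node only differs in its MAC and IP, both can be derived from a single node ID instead of being typed by hand on every board. A minimal sketch, assuming nodes are numbered 1-5 on the 192.168.1.0/24 subnet with the 00:0a:35:00:01:xx MAC prefix used above (the ifconfig lines are left as comments so the sketch is safe to run anywhere):

```shell
# Derive a unique MAC and IP from a per-node ID (assumption: IDs 1-5).
NODE_ID=3
MAC=$(printf '00:0a:35:00:01:%02x' "$NODE_ID")
IP="192.168.1.${NODE_ID}"
# On the node itself these would then be applied with:
#   ifconfig eth1 hw ether "$MAC"
#   ifconfig eth1 "$IP/24" up
echo "node $NODE_ID -> MAC $MAC, IP $IP"
```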
2. Generate the private key id_rsa and public key id_rsa.pub:
ssh-keygen -t rsa
id_rsa is the private key file and id_rsa.pub is the public key file.
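A non-interactive variant of the same step (an assumption on top of the plain ssh-keygen call above: -N "" sets an empty passphrase so the workers can later be reached without prompting, and an existing key is kept if one is already present):

```shell
# Generate the keypair only if it does not exist yet.
KEY_DIR="$HOME/.ssh"
mkdir -p "$KEY_DIR" && chmod 700 "$KEY_DIR"
[ -f "$KEY_DIR/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$KEY_DIR/id_rsa"
```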
3. Configure /etc/hosts on the worker nodes.
Procedure:
ssh root@192.168.1.1
vi /etc/hosts
127.0.0.1 localhost zynq
192.168.1.1 spark1
192.168.1.2 spark2
192.168.1.3 spark3
192.168.1.4 spark4
192.168.1.5 spark5
192.168.1.100 sparkmaster
#::1 localhost ip6-localhost ip6-loopback
The master node's /etc/hosts is configured the same way.
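Because every node needs the same host table, the entries can be generated once and appended to /etc/hosts on each board rather than typed repeatedly. A sketch using the hostnames and addresses listed above:

```shell
# Build the shared host table once; append it to /etc/hosts on each node.
HOSTS_BLOCK=$(
  for i in 1 2 3 4 5; do
    echo "192.168.1.${i} spark${i}"
  done
  echo "192.168.1.100 sparkmaster"
)
echo "$HOSTS_BLOCK"
```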
4. Distribute the public key
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.2
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.3
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.4
.....
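The repeated ssh-copy-id calls can be collapsed into a loop. Shown here as a dry run that only prints the commands (an assumption: workers at 192.168.1.1-5; pipe the output to sh, or drop the echo, to actually execute them):

```shell
# Build the ssh-copy-id command for every node (dry run: printed, not run).
CMDS=$(
  for i in 1 2 3 4 5; do
    echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.${i}"
  done
)
echo "$CMDS"
```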
5. Configure the master node
cd ~/spark-0.9.1-bin-hadoop2/conf
vi slaves
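conf/slaves simply lists one worker hostname per line; the start scripts ssh into each listed host. A sketch that writes it from a here-document (hostnames as configured in /etc/hosts above; assumes it is run from the conf/ directory):

```shell
# Write the worker list for the standalone start scripts.
cat > slaves <<'EOF'
spark1
spark2
spark3
spark4
spark5
EOF
cat slaves
```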
6. Configure Java
Otherwise the Pi calculation fails with a "cannot find Java runtime" error, because pyspark cannot locate the java binary.
cd /usr/bin/
ln -s /usr/lib/jdk1.7.0_55/bin/java
ln -s /usr/lib/jdk1.7.0_55/bin/javac
ln -s /usr/lib/jdk1.7.0_55/bin/jar
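The three symlinks can be created in one loop. A sketch with the target directory as a parameter (it defaults to a scratch directory here so the sketch is safe to try; set BIN_DIR=/usr/bin when running on a node):

```shell
# Link the JDK tools into a directory on PATH so pyspark can find java.
JDK_HOME=/usr/lib/jdk1.7.0_55
BIN_DIR="${BIN_DIR:-$(mktemp -d)}"   # use BIN_DIR=/usr/bin on the node
for tool in java javac jar; do
    ln -sf "$JDK_HOME/bin/$tool" "$BIN_DIR/$tool"
done
ls -l "$BIN_DIR"
```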
7. Start all nodes
SPARK_MASTER_IP=192.168.1.1 ./sbin/start-all.sh
SPARK_MASTER_IP=192.168.1.100 ./sbin/start-all.sh
(Use the IP of the node acting as master; the log below uses 192.168.1.1.)
All nodes started successfully:
8. View the working status:
jps
netstat -ntlp
9. Launch the interactive shell
MASTER=spark://192.168.1.1:7077 ./bin/pyspark
MASTER=spark://192.168.1.100:7077 ./bin/pyspark
10. Test
from random import random
def sample(p):
    x, y = random(), random()
    return 1 if x*x + y*y < 1 else 0
count = sc.parallelize(xrange(0, 1000000)).map(sample) \
           .reduce(lambda a, b: a + b)
print "Pi is roughly %f" % (4.0 * count / 1000000)
The job completes successfully.
Normal startup output:
root@spark1:~/spark-0.9.1-bin-hadoop2# MASTER=spark://192.168.1.1:7077 ./bin/pyspark
Python 2.7.4 (default, Apr 19 2013, 19:49:55)
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
log4j:WARN No appenders could be found for logger (akka.event.slf4j.Slf4jLogger).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
70/01/01 00:07:48 INFO SparkEnv: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
70/01/01 00:07:48 INFO SparkEnv: Registering BlockManagerMaster
70/01/01 00:07:49 INFO DiskBlockManager: Created local directory at /tmp/spark-local-19700101000749-e1fb
70/01/01 00:07:49 INFO MemoryStore: MemoryStore started with capacity 297.0 MB.
70/01/01 00:07:49 INFO ConnectionManager: Bound socket to port 36414 with id = ConnectionManagerId(spark1,36414)
70/01/01 00:07:49 INFO BlockManagerMaster: Trying to register BlockManager
70/01/01 00:07:49 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager spark1:36414 with 297.0 MB RAM
70/01/01 00:07:49 INFO BlockManagerMaster: Registered BlockManager
70/01/01 00:07:49 INFO HttpServer: Starting HTTP Server
70/01/01 00:07:50 INFO HttpBroadcast: Broadcast server started at http://192.168.1.1:42068
70/01/01 00:07:50 INFO SparkEnv: Registering MapOutputTracker
70/01/01 00:07:50 INFO HttpFileServer: HTTP File server directory is /tmp/spark-77996902-7ea4-4161-bc23-9f3538967c17
70/01/01 00:07:50 INFO HttpServer: Starting HTTP Server
70/01/01 00:07:51 INFO SparkUI: Started Spark Web UI at http://spark1:4040
70/01/01 00:07:52 INFO AppClient$ClientActor: Connecting to master spark://192.168.1.1:7077...
70/01/01 00:07:55 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-19700101000755-0001
70/01/01 00:07:55 INFO AppClient$ClientActor: Executor added: app-19700101000755-0001/0 on worker-19700101000249-spark2-53901 (spark2:53901) with 2 cores
70/01/01 00:07:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-19700101000755-0001/0 on hostPort spark2:53901 with 2 cores, 512.0 MB RAM
70/01/01 00:07:55 INFO AppClient$ClientActor: Executor added: app-19700101000755-0001/1 on worker-19700101000306-spark5-38532 (spark5:38532) with 2 cores
70/01/01 00:07:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-19700101000755-0001/1 on hostPort spark5:38532 with 2 cores, 512.0 MB RAM
70/01/01 00:07:55 INFO AppClient$ClientActor: Executor added: app-19700101000755-0001/2 on worker-19700101000255-spark3-41536 (spark3:41536) with 2 cores
70/01/01 00:07:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-19700101000755-0001/2 on hostPort spark3:41536 with 2 cores, 512.0 MB RAM
70/01/01 00:07:55 INFO AppClient$ClientActor: Executor added: app-19700101000755-0001/3 on worker-19700101000254-spark4-38766 (spark4:38766) with 2 cores
70/01/01 00:07:55 INFO SparkDeploySchedulerBackend: Granted executor ID app-19700101000755-0001/3 on hostPort spark4:38766 with 2 cores, 512.0 MB RAM
70/01/01 00:07:55 INFO AppClient$ClientActor: Executor updated: app-19700101000755-0001/0 is now RUNNING
70/01/01 00:07:55 INFO AppClient$ClientActor: Executor updated: app-19700101000755-0001/3 is now RUNNING
70/01/01 00:07:55 INFO AppClient$ClientActor: Executor updated: app-19700101000755-0001/1 is now RUNNING
70/01/01 00:07:55 INFO AppClient$ClientActor: Executor updated: app-19700101000755-0001/2 is now RUNNING
70/01/01 00:07:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 0.9.1
      /_/
Using Python version 2.7.4 (default, Apr 19 2013 19:49:55)
Spark context available as sc.
>>> 70/01/01 00:08:06 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@spark3:35842/user/Executor#1876589543] with ID 2
70/01/01 00:08:11 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager spark3:42847 with 297.0 MB RAM
70/01/01 00:08:12 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@spark5:43445/user/Executor#-1199017431] with ID 1
70/01/01 00:08:13 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager spark5:42630 with 297.0 MB RAM
70/01/01 00:08:15 INFO AppClient$ClientActor: Executor updated: app-19700101000755-0001/0 is now FAILED (Command exited with code 1)
70/01/01 00:08:15 INFO SparkDeploySchedulerBackend: Executor app-19700101000755-0001/0 removed: Command exited with code 1
70/01/01 00:08:15 INFO AppClient$ClientActor: Executor added: app-19700101000755-0001/4 on worker-19700101000249-spark2-53901 (spark2:53901) with 2 cores
70/01/01 00:08:15 INFO SparkDeploySchedulerBackend: Granted executor ID app-19700101000755-0001/4 on hostPort spark2:53901 with 2 cores, 512.0 MB RAM
70/01/01 00:08:15 INFO AppClient$ClientActor: Executor updated: app-19700101000755-0001/4 is now RUNNING
70/01/01 00:08:21 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@spark4:41692/user/Executor#-1994427913] with ID 3
70/01/01 00:08:26 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager spark4:49788 with 297.0 MB RAM
70/01/01 00:08:27 INFO SparkDeploySchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@spark2:44449/user/Executor#-1155287434] with ID 4
70/01/01 00:08:28 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager spark2:38675 with 297.0 MB RAM