Tonight I listened to Liaoliang's sixth lesson, "Mastering Spark Cluster Building and Testing." The homework was: build your own Spark environment and successfully run the Pi example. My summary follows:
1. Hardware environment:
At least 8 GB of memory (Kingston recommended). For the virtual machine, Ubuntu Kylin is recommended; it supports various office software, including the Sogou input method.
Internet access: NAT. Log in with root permissions to avoid permission problems.
2. Software environment:
RedHat 6.4, Spark 1.6.0, Hadoop 2.6.0, Scala 2.11.8
3. /etc/hosts: IP-to-hostname mapping for all cluster nodes
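The mapping might look like the sketch below; the IP addresses and worker hostnames are assumptions for illustration (only `master` is named in these notes):

```
# /etc/hosts -- example IP-to-hostname mapping (IPs and worker names are assumed)
192.168.1.100   master
192.168.1.101   worker1
192.168.1.102   worker2
```

Every node in the cluster should carry the same entries so that hosts can resolve each other by name.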
spark.eventLog.enabled true — records Spark application events so finished jobs can be reviewed later
Start the history server with ./sbin/start-history-server.sh
spark://master:7077 — the master URL; 7077 is the default port
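Putting the event-log settings together, conf/spark-defaults.conf might look like this minimal sketch; the HDFS log directory path is an assumption and must exist before starting the history server:

```
# conf/spark-defaults.conf -- minimal event-log setup (log directory path is assumed)
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://master:9000/sparkHistory
spark.history.fs.logDirectory    hdfs://master:9000/sparkHistory
```

With this in place, ./sbin/start-history-server.sh serves the UI for completed applications on port 18080 by default.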
4. The Pi program:
import scala.math.random
import org.apache.spark._

object SparkPi {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
    val count = spark.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
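The estimator works because a point drawn uniformly from the square [-1, 1] x [-1, 1] lands inside the unit circle with probability pi/4, so 4 * count / n converges to pi. The same Monte Carlo method can be checked standalone, without a cluster; a minimal sketch (the object name is mine):

```scala
import scala.util.Random

object PiEstimate {
  // Sample n points uniformly in [-1, 1] x [-1, 1] and count the
  // fraction that fall inside the unit circle; that fraction ~ pi/4.
  def estimate(n: Int): Double = {
    val count = (1 to n).count { _ =>
      val x = Random.nextDouble() * 2 - 1
      val y = Random.nextDouble() * 2 - 1
      x * x + y * y < 1
    }
    4.0 * count / n
  }

  def main(args: Array[String]): Unit =
    println("Pi is roughly " + estimate(1000000))
}
```

With a million samples the estimate is typically within a few thousandths of pi; Spark simply distributes this same sampling loop across the slices.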
5. Running it:
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://master:7077 \
./lib/spark-examples-1.6.0-hadoop2.6.0.jar \
1000
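Before submitting to the cluster, the same jar can be smoke-tested on a single machine by swapping the master URL for local mode; this invocation is a sketch of that variant (the argument of 100 slices is arbitrary):

```
# Local-mode run: local[*] uses all cores on this machine, no cluster needed
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master local[*] \
  ./lib/spark-examples-1.6.0-hadoop2.6.0.jar \
  100
```

If this prints "Pi is roughly 3.14...", the jar and Spark installation are fine, and any remaining failure on spark://master:7077 points at the cluster setup rather than the program.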
For follow-up courses, see Liaoliang's Sina Weibo, DT Big Data Dream Factory: http://weibo.com/ilovepains
A Spark 3000 disciple's summary of Lesson 6: Mastering Spark Cluster Building