After the Hadoop cluster is built, unzip the Spark installation package.
Spark installation package: http://yunpan.cn/csPh8cf2n5WrT (extraction code: 1085)
Spark command - count the lines in README.md, print the first line of the file, and search for lines containing a keyword

val lines = sc.textFile("README.md")

lines.count()

lines.first()

val pythonLines = lines.filter(line => line.contains("Python"))

scala> pythonLines.first()
res0: String = # Interactive Python Shell
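These snippets rely on the sc (SparkContext) that spark-shell creates automatically. As a minimal sketch of the same steps as a standalone Scala application (the app name and the local[*] master are illustrative assumptions, not from the original), it could look like this:

import org.apache.spark.{SparkConf, SparkContext}

object LineCount {
  def main(args: Array[String]): Unit = {
    // Assumed configuration: run locally; inside spark-shell, sc already exists.
    val conf = new SparkConf().setAppName("LineCount").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val lines = sc.textFile("README.md")   // RDD with one element per line
    println(lines.count())                 // number of lines in the file
    println(lines.first())                 // first line of the file
    val pythonLines = lines.filter(line => line.contains("Python"))
    println(pythonLines.first())           // first line containing "Python"

    sc.stop()
  }
}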
Spark command - sum the elements of an array

1. Run ./spark-shell.sh

2. scala> val data = Array(1, 2, 3, 4, 5)  // generate the data

data: Array[Int] = Array(1, 2, 3, 4, 5)

3. scala> val distData = sc.parallelize(data)  // turn the data into an RDD

distData: spark.RDD[Int] = spark.ParallelCollection@... (the type shown is an RDD)

4. scala> distData.reduce(_ + _)  // operate on the RDD: add up the elements of the data

12/05/10 09:36:20 INFO spark.SparkContext: Starting job...

5. Finally, the operation returns:

12/05/10 09:36:20 INFO spark.SparkContext: Job finished in 0.076729174 s

res2: Int = 15
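The sum works because reduce(_ + _) combines elements within each partition and then merges the partial results. As a small hedged sketch (the partition count of 4 is an assumed value), parallelize can also be told how many partitions to use, and other operations apply to the same RDD:

val distData = sc.parallelize(1 to 5, 4)  // distribute the data over 4 partitions
distData.map(_ * 2).collect()             // Array(2, 4, 6, 8, 10)
distData.reduce(_ + _)                    // 15, same result regardless of partitioning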
Spark command - word count

val lines = sc.textFile("README.md")
val count = lines.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
count.collect()
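Building on the word count, here is a hedged sketch of two common follow-ups (the output path "wordcounts" and the cutoff of 10 are illustrative assumptions): sorting the (word, count) pairs by frequency and writing the result to disk.

val topWords = count.sortBy(_._2, ascending = false).take(10)  // ten most frequent words
count.saveAsTextFile("wordcounts")  // writes one part-* file per partition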
Spark command - run the example program that calculates Pi

./run-example org.apache.spark.examples.SparkPi
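SparkPi estimates Pi by Monte Carlo sampling: it throws random points into a square and counts how many land inside the inscribed circle, where the ratio approaches Pi/4. A simplified sketch of that idea for the shell (the sample count n is an assumption; this is not the exact code of the bundled example):

import scala.util.Random

val n = 1000000
val inside = sc.parallelize(1 to n).filter { _ =>
  val x = Random.nextDouble() * 2 - 1  // random point in the square [-1, 1] x [-1, 1]
  val y = Random.nextDouble() * 2 - 1
  x * x + y * y <= 1                   // keep points that fall inside the unit circle
}.count()
println(s"Pi is roughly ${4.0 * inside / n}")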
Spark - single-node installation and operation