Spark Study Notes
When reposting, please credit the original: http://blog.csdn.net/duck_genuine/article/details/40506715
Test results of the join and union methods
join(otherDataset, [numTasks]): (K, V) join (K, W) => (K, (V, W))
A key produces output only if it appears in both RDDs. Keys found in just one RDD are dropped, and if the two RDDs share no key K, the result is empty.
For example:
res15: Array[(Int, Int)] = Array((1,2), (2,3), (3,4))
res16: Array[(Int, Int)] = Array((1,2), (2,3), (4,5))
The join results of the two lists are as follows:
res17: Array[(Int, (Int, Int))] = Array((1,(2,2)), (2,(3,3)))
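The join result above can be reproduced with a minimal sketch like the one below. It assumes Spark is available and runs in local mode; the application name, the `local[*]` master, and the variable names `left`, `right`, and `joined` are illustrative and do not come from the original test.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Assumption: local mode. Inside spark-shell, getOrCreate returns the existing context.
val conf = new SparkConf().setAppName("join-demo").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)

// The two pair datasets from the example above.
val left = sc.parallelize(Seq((1, 2), (2, 3), (3, 4)))
val right = sc.parallelize(Seq((1, 2), (2, 3), (4, 5)))

// join keeps only keys present in both RDDs (keys 1 and 2 here).
// collect() order is not guaranteed after a join, so sort by key before printing.
val joined = left.join(right).collect().sortBy(_._1)
println(joined.mkString(", ")) // (1,(2,2)), (2,(3,3))
```

Keys 3 and 4 each exist in only one dataset, so neither appears in the output.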
union(otherDataset) returns a new dataset containing the elements of the source dataset together with those of the argument. Duplicates are kept.
The Union results of the two lists are as follows:
res18: Array[(Int, Int)] = Array((1,2), (2,3), (3,4), (1,2), (2,3), (4,5))
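A comparable sketch for union, under the same assumptions (Spark available, local mode; the application name and the names `first`, `second`, `combined`, and `distinctCount` are made up for illustration). It also shows that union does not remove duplicates, and that `distinct()` can be applied afterwards if set semantics are wanted.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Assumption: local mode. Inside spark-shell, getOrCreate returns the existing context.
val conf = new SparkConf().setAppName("union-demo").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)

val first = sc.parallelize(Seq((1, 2), (2, 3), (3, 4)))
val second = sc.parallelize(Seq((1, 2), (2, 3), (4, 5)))

// union concatenates the two datasets; (1,2) and (2,3) appear twice.
val combined = first.union(second).collect()
println(combined.mkString(", ")) // (1,2), (2,3), (3,4), (1,2), (2,3), (4,5)

// distinct() removes the repeated pairs, leaving four elements.
val distinctCount = first.union(second).distinct().count()
println(distinctCount) // 4
```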
map has not been tested yet.
Spark examples
https://github.com/apache/spark/tree/master/examples/src/main/scala/org/apache/spark/examples
GraphX (graph processing) http://spark.apache.org/docs/latest/graphx-programming-guide.html#migrating-from-spark-091
Spark Streaming (stream processing)
Learning Materials
http://shiyanjun.cn/archives/744.html
http://fossies.org/linux/spark/core/src/test/java/org/apache/spark/JavaAPISuite.java