First, Spark framework overview
Spark mainly consists of several parts: Spark Core, GraphX, MLlib, Spark Streaming, Spark SQL, and so on.
GraphX handles graph computation and graph mining. The mainstream graph computation frameworks today are Pregel, Hama, and Giraph, which all use superstep (bulk-synchronous) synchronization, and GraphLab, which can also run asynchronously. Spark GraphX itself exposes a Pregel-style, superstep-synchronous API. When GraphX works together with Spark SQL, SQL statements are typically used for ETL (extract, transform, load; a data warehouse technique), and the result is then handed to GraphX for processing.
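To make superstep synchronization concrete, here is a minimal sketch in plain Scala. It does not use the GraphX API; the object name `SuperstepDemo`, the sample edge list, and the choice of connected components by minimum-label propagation are all assumptions made for illustration. In each superstep every vertex sends its label to its neighbors, all messages are gathered (the barrier), and only then do the vertices update.

```scala
object SuperstepDemo {
  // Undirected edges of a small hypothetical graph with two components: {1, 2, 3} and {4, 5}.
  val edges: Seq[(Int, Int)] = Seq((1, 2), (2, 3), (4, 5))

  // Adjacency list built from the edge list (each edge is added in both directions).
  val neighbors: Map[Int, Seq[Int]] =
    (edges ++ edges.map(_.swap))
      .groupBy(_._1)
      .map { case (v, es) => v -> es.map(_._2) }

  // Runs supersteps until no label changes; each vertex keeps the smallest label it has seen.
  def connectedComponents(): Map[Int, Int] = {
    var labels: Map[Int, Int] = neighbors.keys.map(v => v -> v).toMap
    var changed = true
    while (changed) {
      // Message phase: every vertex sends its current label to each neighbor.
      val messages: Seq[(Int, Int)] =
        labels.toSeq.flatMap { case (v, label) => neighbors(v).map(n => n -> label) }
      // Barrier: all messages are collected before any vertex is updated.
      val inbox: Map[Int, Int] =
        messages.groupBy(_._1).map { case (v, ms) => v -> ms.map(_._2).min }
      // Update phase: adopt the smallest label received, if it is smaller than the current one.
      val updated = labels.map { case (v, label) =>
        v -> math.min(label, inbox.getOrElse(v, label))
      }
      changed = updated != labels
      labels = updated
    }
    labels
  }

  def main(args: Array[String]): Unit =
    // Vertices 1, 2, 3 end with label 1; vertices 4, 5 end with label 4.
    println(connectedComponents().toSeq.sorted)
}
```

An asynchronous engine would instead let a vertex update as soon as a message arrives, without waiting for the barrier.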
The predecessor of Spark SQL was Shark, and Shark in turn was built on Hive. Spark SQL was freed from the dependency on Hive while absorbing many of Shark's strengths, such as an in-memory columnar storage structure, which makes GC (the mechanism for reclaiming memory) fast and keeps the storage layout compact. Bytecode generation (code generation, CG) also accelerates queries, as does Scala code optimization. Spark SQL parses a query into a tree, then binds and optimizes the tree by applying rules, and finally generates an executable physical plan.
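The tree, rule, and plan pipeline can be sketched with a toy example. This is not Spark SQL's actual optimizer or API: the names `ToyPlanner`, `Expr`, `foldConstants`, and `compile` are hypothetical, and the "physical plan" here is simply a Scala function evaluated against a row.

```scala
object ToyPlanner {
  // A row is modeled as a map from column name to value (an assumption for this sketch).
  type Row = Map[String, Int]

  // A tiny expression tree standing in for the tree produced by the parser.
  sealed trait Expr
  case class Lit(value: Int) extends Expr
  case class Col(name: String) extends Expr
  case class Add(left: Expr, right: Expr) extends Expr

  // One optimization rule: fold the addition of two literals, working bottom-up.
  def foldConstants(e: Expr): Expr = e match {
    case Add(l, r) =>
      (foldConstants(l), foldConstants(r)) match {
        case (Lit(a), Lit(b)) => Lit(a + b)
        case (fl, fr)         => Add(fl, fr)
      }
    case other => other
  }

  // Turns the optimized tree into something executable: a function from Row to Int.
  def compile(e: Expr): Row => Int = e match {
    case Lit(v) => (_: Row) => v
    case Col(n) => (row: Row) => row(n)
    case Add(l, r) =>
      val cl = compile(l)
      val cr = compile(r)
      (row: Row) => cl(row) + cr(row)
  }

  def main(args: Array[String]): Unit = {
    val tree = Add(Col("a"), Add(Lit(1), Lit(2)))   // a + (1 + 2)
    val optimized = foldConstants(tree)             // a + 3
    println(optimized)                              // Add(Col(a),Lit(3))
    println(compile(optimized)(Map("a" -> 10)))     // 13
  }
}
```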
MLlib is the machine learning library. Machine learning is the core of artificial intelligence. By learning style, the algorithms are mainly supervised, unsupervised, semi-supervised, and reinforcement learning. They can also be divided into classification, clustering, regression (regression problems also lead to regularization, Bayesian methods, etc.), decision trees, association rules, artificial neural networks, deep learning (developed from neural networks), dimensionality reduction, ensemble methods, and so on.
Appendix: the Java memory space has the heap (stores objects created with new), the stack (local variables), and static storage (stores constants and variables declared static, usually allocated at compile time).
Second, Scala
1. Partial functions vs. partially applied functions
A partial function is a mathematical concept: a function that does not handle some input values. A partially applied function, by contrast, is a function that has been supplied with fewer than its n declared parameters, which yields a new function that takes the remaining ones.
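A minimal sketch of the difference, using the standard `PartialFunction` trait and placeholder syntax. The names `PartialDemo`, `reciprocal`, `volume`, and `slab` are made up for this illustration.

```scala
object PartialDemo {
  // A partial function: defined only for non-zero input.
  val reciprocal: PartialFunction[Int, Double] = {
    case n if n != 0 => 1.0 / n
  }

  // An ordinary function with three parameters.
  def volume(length: Double, width: Double, height: Double): Double =
    length * width * height

  // A partially applied function: two arguments are fixed, one is left open.
  val slab: Double => Double = volume(2.0, 3.0, _: Double)

  def main(args: Array[String]): Unit = {
    println(reciprocal.isDefinedAt(0))          // false
    println(reciprocal.isDefinedAt(4))          // true
    println(reciprocal(4))                      // 0.25
    println(List(0, 2, 4).collect(reciprocal))  // List(0.5, 0.25)
    println(slab(4.0))                          // 24.0
  }
}
```

`collect` applies the partial function only where it is defined, which is why the 0 is silently dropped.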
2. Closures
Put simply, a closure is: code + the non-local variables that the code refers to.
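As a small sketch of "code + non-local variables", the function returned below keeps using the variable `count` after `makeCounter` has returned. The names `ClosureDemo` and `makeCounter` are assumptions for this example.

```scala
object ClosureDemo {
  // Returns a function that closes over the local variable `count`.
  def makeCounter(): () => Int = {
    var count = 0                 // non-local from the returned function's point of view
    () => { count += 1; count }   // the code part of the closure
  }

  def main(args: Array[String]): Unit = {
    val next = makeCounter()
    println(next())   // 1
    println(next())   // 2
    // A second counter captures its own, independent `count`.
    val other = makeCounter()
    println(other())  // 1
  }
}
```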
3. Higher-order functions
Higher-order functions are an important concept in Scala's functional programming: a higher-order function takes a function as a parameter or returns a function as its result.
Code example
def mulBy(factor: Double) = (x: Double) => factor * x
// mulBy can produce a function that multiplies by any given number
val quintuple = mulBy(5) // (x: Double) => 5 * x
quintuple(20) // 5 * 20 (the argument 20 is chosen here for illustration)
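The example above returns a function. The complementary case, a function that takes another function as a parameter, might look like the following sketch; `HigherOrderDemo` and `applyTwice` are hypothetical names, while `map` is the standard collection method.

```scala
object HigherOrderDemo {
  // Takes a function as a parameter and applies it to x twice.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  def main(args: Array[String]): Unit = {
    println(applyTwice(_ + 3, 10))           // 16
    println(applyTwice(n => n * n, 3))       // 81
    // Library methods such as map are higher-order functions too.
    println(List(1, 2, 3).map(n => n * n))   // List(1, 4, 9)
  }
}
```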
Third, semantic analysis
Semantic analysis is a branch of NLP, and NLP is a research direction within artificial intelligence. It is an interdisciplinary field, and combined with big data it can power many intelligent applications.
2016.3.3 (Spark framework overview; Scala partially applied functions, closures, and higher-order functions; some thoughts on semantic analysis)