My first time attending a technology salon, haha.
Outline of the first expert's talk
1. Graph computing
2. Tungsten
3. Suggestions
A graph is stored in the computer as a matrix, which records the attributes of each vertex and edge.
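A minimal sketch of the matrix storage described above, assuming a small directed graph with weighted edges. The vertex names, the weights, and the helpers add_edge and neighbors are made up for illustration and are not from the talk.

```python
# Hypothetical 4-vertex directed graph; names and weights are illustrative only.
vertices = ["A", "B", "C", "D"]
index = {name: i for i, name in enumerate(vertices)}

# adjacency[i][j] holds the weight of the edge i -> j, or 0 if there is no edge.
n = len(vertices)
adjacency = [[0] * n for _ in range(n)]


def add_edge(src, dst, weight):
    """Record a directed edge src -> dst with the given weight."""
    adjacency[index[src]][index[dst]] = weight


def neighbors(src):
    """Return (vertex, weight) pairs for every outgoing edge of src."""
    row = adjacency[index[src]]
    return [(vertices[j], w) for j, w in enumerate(row) if w != 0]


add_edge("A", "B", 3)
add_edge("A", "C", 2)
add_edge("B", "D", 4)
add_edge("C", "D", 1)

print(neighbors("A"))
```

A dense matrix like this costs space proportional to the square of the vertex count, so it only suits small or dense graphs.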
Finding the critical path in a graph calls for graph computing, which is much faster than Hadoop for this kind of problem. The main reason is that each iteration of a graph computation processes only part of the information (some vertices and edges), whereas Hadoop operates on all of the data every time. So for graph problems, graph computing is comparatively fast.
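As a rough single-machine illustration of the critical path problem, the sketch below assumes the graph is a DAG whose edge weights are task durations, and computes the longest path by relaxing edges in topological order. The function name critical_path and the sample tasks are hypothetical. This is not how a distributed graph engine would run it.

```python
from collections import deque


def critical_path(edges):
    """edges maps (src, dst) -> duration. Returns (length, path) of the longest path."""
    nodes, succ, indeg = set(), {}, {}
    for (u, v), w in edges.items():
        nodes.update((u, v))
        succ.setdefault(u, []).append((v, w))
        indeg[v] = indeg.get(v, 0) + 1

    # Kahn's algorithm: start from vertices with no incoming edges.
    queue = deque(sorted(n for n in nodes if indeg.get(n, 0) == 0))
    dist = {n: 0 for n in nodes}
    prev = {n: None for n in nodes}
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v, w in succ.get(u, []):
            # Keep the longest distance seen so far for each successor.
            if dist[u] + w > dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != len(nodes):
        raise ValueError("graph has a cycle, so no critical path exists")

    # Walk back from the vertex with the largest distance.
    end = max(sorted(nodes), key=lambda n: dist[n])
    path, node = [], end
    while node is not None:
        path.append(node)
        node = prev[node]
    return dist[end], path[::-1]


# Hypothetical task graph: edge weights are durations.
tasks = {("A", "B"): 3, ("A", "C"): 2, ("B", "D"): 4, ("C", "D"): 1}
length, path = critical_path(tasks)
print(length, path)
```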
Several graph computing frameworks
Giraph (open source), GraphLab (open source, fast), Google Pregel (not open source)
Applications of graph computing:
PageRank (graph, weights)
User-item graphs
Triangle counting
Social networks
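To make the first application concrete, here is a minimal PageRank sketch using plain power iteration on a single machine. The damping factor 0.85, the iteration count, and the three-page link graph are assumptions chosen for illustration, not values from the talk.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps a page to the list of pages it links to. Returns page -> rank."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this page's rank evenly across its outgoing links.
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[node] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical link graph.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
print(ranks)
```

Each round only moves values along edges, which is the kind of iteration graph frameworks are built around.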
Tungsten is faster and is used automatically by Dataset. It relies on sun.misc.Unsafe in Java: memory is managed directly instead of being left to the JVM's garbage collection, and controlling it yourself greatly increases computation speed.
For example, the string "ABCD" should normally take 4 bytes. Stored as a Java object, though, it picks up a 12-byte object header plus some other fields, about 24 bytes altogether. Java also stores characters as 2-byte Unicode (UTF-16) units, so in the end the string takes about 48 bytes.
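The same kind of gap between raw payload and per-object footprint can be seen in other managed runtimes. The sketch below is only a Python analogy, not a JVM measurement: it compares the UTF-8 payload of a few 4-character strings with the size CPython reports for each string object. The sample words are made up, and the exact object sizes depend on the interpreter build.

```python
import sys

# Hypothetical sample data: three 4-character ASCII strings.
words = ["ABCD", "EFGH", "IJKL"]

# Raw payload: 4 bytes per word when packed back to back as UTF-8.
packed = b"".join(w.encode("utf-8") for w in words)

# Per-object footprint reported by the interpreter (header, bookkeeping, characters).
object_bytes = sum(sys.getsizeof(w) for w in words)

# How many times larger the object form is than the packed bytes.
overhead_ratio = object_bytes / len(packed)
print(len(packed), object_bytes, round(overhead_ratio, 1))
```

Packing values into one contiguous buffer, as in the packed variable, is the general idea behind a binary row format that avoids per-object overhead.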
This is the difference that using the JVM makes, and it shows up in GC time.
Graph computing uses the BSP (Bulk Synchronous Parallel) model.
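A minimal single-process sketch of the BSP idea, assuming the usual Pregel-style reading: in each superstep, only vertices that received messages do any work, they send messages along their edges, and a barrier separates one superstep from the next. The example computes shortest distances from a source. The function name bsp_shortest_paths and the graph are hypothetical.

```python
def bsp_shortest_paths(edges, source):
    """edges maps a vertex to a list of (neighbor, weight). Returns (distances, supersteps)."""
    vertices = set(edges) | {v for outs in edges.values() for v, _ in outs}
    distance = {v: float("inf") for v in vertices}
    inbox = {source: [0]}  # messages waiting to be read in the next superstep
    supersteps = 0
    while inbox:
        outbox = {}
        # Compute phase: only vertices with incoming messages are active.
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < distance[vertex]:
                distance[vertex] = best
                # Communication phase: tell neighbors about the improved distance.
                for neighbor, weight in edges.get(vertex, []):
                    outbox.setdefault(neighbor, []).append(best + weight)
        # Barrier: messages sent now are only visible in the next superstep.
        inbox = outbox
        supersteps += 1
    return distance, supersteps


# Hypothetical weighted graph.
graph = {"A": [("B", 3), ("C", 2)], "B": [("D", 4)], "C": [("D", 1)], "D": []}
distance, supersteps = bsp_shortest_paths(graph, "A")
print(distance, supersteps)
```

The loop stops as soon as no vertex has anything left to send, so work per superstep shrinks as the computation converges.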
Suggestions:
Distributed does not necessarily mean fast: sending data over the network adds a lot of latency. It is enough to understand the basic models behind big data computation; going distributed is not necessarily efficient. The most efficient algorithm is usually one you have to implement yourself.
Big data capabilities: storage, computation, querying, mining
Pasal language is very important???? I don't know what language it is. Damn it
Go language
Scala language
Superman College Big Data Technology Salon