Reprinted from Zhihu: https://www.zhihu.com/question/26568496
1) MapReduce: an offline (batch) computing framework that abstracts an algorithm into two
processing phases, map and reduce. It is well suited to data-intensive computing.
2) Spark: the MapReduce framework is not well suited to iterative or interactive computation.
MapReduce is a disk-based computing framework, whereas Spark is an in-memory computing framework
that keeps data in memory as much as possible to make iterative and interactive applications
more efficient.
3) Storm: MapReduce is also not well suited to stream computing and real-time analysis, such as
counting ad clicks. Storm is better at this kind of computation and is far closer to real time
than the MapReduce framework.
4) Tez: a computing framework that runs on top of YARN, supports DAG jobs, and generalizes
MapReduce-style data processing. It splits the map and reduce stages into several sub-processes
and can combine multiple map/reduce tasks into one larger DAG task, which reduces the file
storage needed between map/reduce stages. Combining the sub-processes sensibly can also
shorten the run time of a job.
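
To make the map and reduce phases from item 1 more concrete, here is a minimal single-process sketch of a word count in plain Python. It is only an illustration of the programming model, not the Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real framework would run these steps across many machines and write intermediate results to disk.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["spark storm tez", "spark mapreduce", "Spark"]
    print(word_count(sample))
```

In this sketch the grouped intermediate data stays in memory. In MapReduce it is written out between the two phases, which is the disk overhead that items 2 and 4 describe Spark and Tez as reducing.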