[Author]: Kwu
Analysis of the similarities and differences between MapReduce and Storm
1. How MapReduce and Storm differ in the data they process:
MapReduce: handles big data in batches; the data is relatively static.
Storm: handles streaming data in real time; the data changes continuously.
Stream data can be processed in two ways:
1) On a single machine, with multiple processes and multiple threads.
2) On multiple machines at the same time, each with multiple processes and threads (distributed).
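The batch versus stream distinction above can be sketched in plain Python. This is only an illustration under simple assumptions: it uses neither Hadoop nor Storm, and the function names batch_word_count and stream_word_count are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def batch_word_count(dataset: List[str]) -> Dict[str, int]:
    """Batch style: the whole (fixed) dataset is available up front,
    and one result is produced after everything has been read."""
    counts: Counter = Counter()
    for line in dataset:
        counts.update(line.split())
    return dict(counts)


def stream_word_count(events: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Stream style: events arrive one at a time, and an updated
    result is emitted after every event instead of once at the end."""
    counts: Counter = Counter()
    for line in events:
        counts.update(line.split())
        yield dict(counts)  # snapshot of the running totals so far


if __name__ == "__main__":
    data = ["storm storm", "mapreduce storm"]
    print(batch_word_count(data))
    for snapshot in stream_word_count(iter(data)):
        print(snapshot)
```

The batch function returns once, while the stream function keeps yielding for as long as events keep arriving, which is the behavior the notes attribute to Storm.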
2. Both MapReduce and Storm process data in stages
1) MapReduce processing stages: map, reduce
2) Storm processing stages: spout, bolt
3) An MR job ends once it has finished running, whereas Storm runs as a long-lived service, much like Tomcat.
4) Amount of data processed per unit of time: MR is greater than Storm.
5) Use stream computation when data must be processed in real time: seismic data, real-time data on e-commerce sites, recommendations, flights.
6) When you only need to look at the results for each month, use MR.
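As a rough illustration of the map and reduce stages, here is a monthly summary (the kind of job item 6 assigns to MR) written in plain Python. It is a sketch, not Hadoop code: the record layout, the sample dates, and the function names are assumptions, and the grouping step between map and reduce is done by a small helper here rather than by the framework.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(records: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, int]]:
    """Map: turn each (date, amount) record into a (month, amount) pair."""
    for date, amount in records:
        yield date[:7], amount  # "2015-03-14" -> "2015-03"


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all values that share a key before they reach reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in groups.items()}


if __name__ == "__main__":
    # Hypothetical input records: (date, amount).
    records = [("2015-03-14", 10), ("2015-03-20", 5), ("2015-04-01", 7)]
    print(reduce_phase(shuffle(map_phase(records))))
```

The program reads all of its input, produces the totals, and exits, which matches item 3: an MR run has an end, unlike a Storm service.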
3. The tuple is the basic unit of data processing in Storm
It plays a role equivalent to MR's key-value (KV) pairs.
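To make the comparison concrete, the sketch below models a tuple as a list of values with named fields and shows that a KV pair is the two-field special case. SketchTuple and its methods are hypothetical names for illustration only, not Storm's API.

```python
from typing import Any, List, Tuple

# A MapReduce-style record: exactly one key and one value.
kv_pair: Tuple[str, int] = ("storm", 3)


class SketchTuple:
    """Hypothetical stand-in for a Storm tuple: values plus field names."""

    def __init__(self, fields: List[str], values: List[Any]) -> None:
        if len(fields) != len(values):
            raise ValueError("each field needs exactly one value")
        self.fields = list(fields)
        self.values = list(values)

    def get_value_by_field(self, name: str) -> Any:
        """Look a value up by its field name."""
        return self.values[self.fields.index(name)]


def kv_to_tuple(key: Any, value: Any) -> SketchTuple:
    """A KV pair maps onto a tuple with two fields."""
    return SketchTuple(["key", "value"], [key, value])


if __name__ == "__main__":
    t = kv_to_tuple(*kv_pair)
    print(t.get_value_by_field("key"), t.get_value_by_field("value"))
    # A tuple is not limited to two fields.
    wide = SketchTuple(["word", "count", "source"], ["storm", 3, "log"])
    print(wide.get_value_by_field("source"))
```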
4. The spout is Storm's external interface
A spout is the source through which data enters Storm for processing.
After the spout hands the data over, it is processed in the bolt stage.
Spout ---> bolt (the unit of processing is the tuple)
Spouts and bolts are packaged into a topology in an object-oriented (OO) way.
A topology is similar to the concept of a job in MR.
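The spout ---> bolt flow can be sketched with a few small classes. This is a minimal single-process model under stated assumptions: SentenceSpout, SplitBolt, CountBolt, and Topology are hypothetical names, not Storm classes, and the spout here has a finite input so the example terminates, whereas a real topology keeps running as a service.

```python
from typing import Dict, Iterator, List, Tuple


class SentenceSpout:
    """Source of the stream: emits one tuple per input sentence."""

    def __init__(self, sentences: List[str]) -> None:
        self._sentences = list(sentences)

    def emit(self) -> Iterator[Tuple]:
        for sentence in self._sentences:
            yield (sentence,)


class SplitBolt:
    """Processing step: splits a sentence tuple into word tuples."""

    def process(self, tup: Tuple) -> Iterator[Tuple]:
        for word in tup[0].split():
            yield (word,)


class CountBolt:
    """Processing step: keeps a running count per word."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}

    def process(self, tup: Tuple) -> Iterator[Tuple]:
        word = tup[0]
        self.counts[word] = self.counts.get(word, 0) + 1
        yield (word, self.counts[word])


class Topology:
    """Wires one spout to a chain of bolts, like a job definition."""

    def __init__(self, spout: SentenceSpout, bolts: list) -> None:
        self.spout = spout
        self.bolts = bolts

    def _push(self, tup: Tuple, index: int) -> List[Tuple]:
        # Past the last bolt: the tuple is final output.
        if index == len(self.bolts):
            return [tup]
        results: List[Tuple] = []
        for emitted in self.bolts[index].process(tup):
            results.extend(self._push(emitted, index + 1))
        return results

    def run(self) -> List[Tuple]:
        output: List[Tuple] = []
        for tup in self.spout.emit():
            output.extend(self._push(tup, 0))
        return output


if __name__ == "__main__":
    topology = Topology(SentenceSpout(["a b a"]), [SplitBolt(), CountBolt()])
    print(topology.run())
```

Every hop between components carries a tuple, and the Topology object is the single unit that gets submitted, which is why the notes compare it to an MR job.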
5. Related Configuration Files
MapReduce: mapred-site.xml
Storm: storm.yaml
Configuration items are sensitive to case and whitespace.
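The sketch below shows why case and spacing matter for a "key: value" entry such as those in storm.yaml. It is a deliberately simplified parser written for illustration, not a YAML library; parse_setting is a hypothetical helper and the path /var/storm is an assumed example value.

```python
from typing import Dict, Tuple


def parse_setting(line: str) -> Tuple[str, str]:
    """Parse one 'key: value' entry.

    As in YAML, the colon must be followed by a space for the line
    to be read as a key and a value.
    """
    key, separator, value = line.partition(": ")
    if not separator:
        raise ValueError("expected 'key: value' with a space after the colon")
    return key.strip(), value.strip()


if __name__ == "__main__":
    settings: Dict[str, str] = dict([parse_setting("storm.local.dir: /var/storm")])

    # Keys are matched exactly, so a different case is a different key.
    print(settings.get("storm.local.dir"))   # found
    print(settings.get("Storm.Local.Dir"))   # None: not the same key

    # Missing space after the colon: rejected instead of silently accepted.
    try:
        parse_setting("storm.local.dir:/var/storm")
    except ValueError as error:
        print("rejected:", error)
```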