Analysis of the similarities and differences between MapReduce and Storm

Source: Internet
Author: User

[Author]: Kwu

1. How MapReduce and Storm differ in the data they process:
MapReduce: big data, batch processing; the data set is relatively static while a job runs.

Storm: streaming data, real-time processing; the data keeps changing as it arrives.
Streaming data can be processed in two ways:
1) On a single machine, with multiple processes and multiple threads.
2) On multiple machines at once, each running multiple processes and threads (distributed).
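
The batch versus streaming distinction above can be illustrated with a small sketch. It is plain Python and uses neither Hadoop nor Storm; the function names and the sample readings are hypothetical and chosen only for illustration. The batch function needs the complete data set before it can answer, while the streaming function produces an updated answer after every event.

```python
from typing import Iterable, Iterator, List


def batch_average(values: List[float]) -> float:
    """Batch style: the whole data set is available before processing starts."""
    return sum(values) / len(values)


def streaming_average(events: Iterable[float]) -> Iterator[float]:
    """Streaming style: emit an updated result after every incoming event."""
    count = 0
    total = 0.0
    for value in events:
        count += 1
        total += value
        yield total / count


# Hypothetical sensor readings used only for this illustration.
readings = [2.0, 4.0, 6.0, 8.0]

batch_result = batch_average(readings)                     # one answer at the end
stream_results = list(streaming_average(iter(readings)))   # one answer per event
```

In the streaming case the input could just as well be an unbounded source, which is why a result has to be available after each event rather than only at the end.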

2. Both MapReduce and Storm process data in phases
1) MapReduce phases: map, reduce
2) Storm phases: spout, bolt
3) An MR job ends once it has run to completion, whereas Storm keeps running as a service, much like Tomcat.
4) Amount of data processed per unit of time: MR handles more than Storm.
5) Use stream computation when data must be processed in real time: seismic data, real-time data on e-commerce sites, recommendations, flights.
6) If you only need to look at the results once a month, use MR to process the data.
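
As a rough sketch of the map and reduce phases listed above, the following plain-Python word count mimics the flow of an MR job. It is not Hadoop code: the function names are hypothetical, and the intermediate grouping step (usually called the shuffle) is simulated in memory with a dictionary.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: turn one input record into (key, value) pairs."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all values by key, as happens between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values of one key into a single result."""
    return key, sum(values)


def run_job(lines: List[str]) -> Dict[str, int]:
    """Run the phases in order; the job ends when the input is exhausted."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


word_counts = run_job(["storm streams data", "mapreduce batches data"])
```

Note that `run_job` returns only after all input has been consumed, which matches point 3) above: the job finishes, and nothing keeps running afterwards.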


3. The tuple is the basic unit of data processing in Storm
It plays a role equivalent to MR's key-value (KV) pairs.
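
To make the comparison concrete, here is a minimal sketch that models a tuple as a set of named fields next to a two-element KV pair. In Storm itself a tuple is a Java object whose values are read by field name or position; the `make_tuple` helper, the field names, and the sample values below are hypothetical stand-ins, not Storm API.

```python
from typing import Any, Dict, List, Tuple


def make_tuple(fields: List[str], values: List[Any]) -> Dict[str, Any]:
    """Build a tuple-like record: an ordered set of named fields with values."""
    if len(fields) != len(values):
        raise ValueError("each declared field needs exactly one value")
    return dict(zip(fields, values))


# An MR-style record carries exactly two parts: a key and a value.
kv_pair: Tuple[str, int] = ("user42", 3)

# A tuple-like record can carry any number of named fields.
storm_like_tuple = make_tuple(["user", "item", "clicks"], ["user42", "book", 3])
```

The practical difference this sketch points at is width: a KV pair is fixed at two slots, while a tuple can carry as many fields as the emitting component declares.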

4. The spout is Storm's external interface
The spout is the source through which data enters Storm for processing.
After the spout hands the data over, it is processed in the bolt phase.

Spout ---> bolt (the unit of processing is the tuple)

Spouts and bolts are packaged, in an object-oriented (OO) way, into a topology.
A topology is similar to the concept of a job in MR.
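
The spout-to-bolt flow can be sketched with chained Python generators. This is only an analogy under stated assumptions: it runs in a single process, the names `sentence_spout`, `split_bolt`, `count_bolt`, and `build_topology` are hypothetical, and a real Storm topology is written against Storm's own Java API and distributed across workers.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Iterator

Record = Dict[str, Any]  # stand-in for a tuple with named fields


def sentence_spout(source: Iterable[str]) -> Iterator[Record]:
    """Spout: pull data from an external source and emit it as tuples."""
    for sentence in source:
        yield {"sentence": sentence}


def split_bolt(stream: Iterable[Record]) -> Iterator[Record]:
    """Bolt: consume sentence tuples and emit one tuple per word."""
    for tup in stream:
        for word in tup["sentence"].split():
            yield {"word": word}


def count_bolt(stream: Iterable[Record]) -> Iterator[Record]:
    """Bolt: keep running counts and emit an update for every word seen."""
    counts: Counter = Counter()
    for tup in stream:
        counts[tup["word"]] += 1
        yield {"word": tup["word"], "count": counts[tup["word"]]}


def build_topology(source: Iterable[str]) -> Iterator[Record]:
    """Wire spout and bolts together, comparable to defining a job in MR."""
    return count_bolt(split_bolt(sentence_spout(source)))


emitted = list(build_topology(["a b", "a"]))
```

Each word produces an updated count as soon as it passes through, so results appear tuple by tuple instead of once at the end of a job.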



5. Related configuration files
MapReduce: mapred-site.xml
Storm: storm.yaml
Configuration items are case-sensitive and whitespace-sensitive.
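
For illustration, a storm.yaml fragment might look like the sketch below. The host names, path, and port numbers are placeholders rather than values from the original article, and which keys apply depends on the Storm version (for example, older releases use `nimbus.host` where newer ones use `nimbus.seeds`). Because the file is YAML, a key written in a different case is treated as a different key, a space is required after each colon, and indentation determines nesting.

```yaml
# Hypothetical storm.yaml fragment; hosts, path, and ports are placeholders.
storm.zookeeper.servers:
    - "zk1.example.com"
    - "zk2.example.com"
nimbus.seeds: ["nimbus1.example.com"]
storm.local.dir: "/var/storm"
supervisor.slots.ports:
    - 6700
    - 6701
```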
