Understand the differences between Spark, Storm, and MapReduce when learning big data

Source: Internet
Author: User

Beginners in big data are often confused about how the three computing frameworks MapReduce, Storm, and Spark relate to one another.

Which one suits batch processing of large volumes of data? Which one suits real-time stream processing? And how do we tell them apart?

I've collected the basics of these three frameworks below so that you can get an overall picture of them.

MapReduce
    • A distributed offline (batch) computing framework.

    • Mainly suited to large-scale batch jobs on a cluster; because jobs run in batch mode, latency is high and results are not timely.

    • Natively supports developing MapReduce jobs in Java; jobs in other languages are written using Hadoop Streaming.
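To make the batch model concrete, here is a minimal word-count sketch in the style of a Hadoop Streaming job. Hadoop Streaming passes records to the mapper and reducer as text lines, with key and value separated by a tab, and the reducer receives its input sorted by key. The function names `mapper`, `reducer`, and `run_local` are hypothetical, and the sort step here only imitates the shuffle locally; in a real job each function would be a separate script reading standard input and writing standard output.

```python
from itertools import groupby


def mapper(lines):
    """Map phase: emit one 'word<TAB>1' line per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_lines):
    """Reduce phase: input must be sorted by key, as the framework guarantees."""
    pairs = (line.split("\t") for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_local(lines):
    """Local stand-in for the cluster: map, sort (the 'shuffle'), then reduce."""
    return list(reducer(sorted(mapper(lines))))


result = run_local(["big data big cluster", "data batch"])
for line in result:
    print(line)
```

Nothing is emitted until the whole input has been mapped and sorted, which is why this model is a good fit for offline jobs and a poor fit for low-latency work.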

Spark
    • Spark is a fast, general-purpose computing engine designed for large-scale data processing; it is built around in-memory computation, which makes it well suited to iterative workloads.

    • Spark retains the benefits of MapReduce while greatly reducing latency, so it works well for systems that need iterative computation and timely results.

    • Developers can write data analysis jobs in languages such as Java, Scala, or Python, using more than 80 high-level operators.

    • Spark is fully compatible with HDFS and works alongside other Hadoop components, including YARN and HBase.

    • Spark can handle a variety of job types, such as real-time data analysis, machine learning, and graph processing. It is often used for recommendation systems and other computing systems that can tolerate small delays.
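The benefit of in-memory, iterative computation can be illustrated with a toy sketch in plain Python. This is not the Spark API: `MiniDataset`, `slow_source`, and the `reads` counter are hypothetical names invented for this illustration. The idea it imitates is that transformations are lazy, and a cached dataset is loaded once and then reused by every iteration instead of being re-read from storage each time.

```python
from functools import reduce
from operator import add


class MiniDataset:
    """Toy stand-in for a cached, lazily evaluated dataset (not Spark itself)."""

    def __init__(self, source):
        self._source = source      # zero-argument callable returning an iterator
        self._want_cache = False
        self._cached = None        # filled on first use after cache()

    def _iterate(self):
        if self._want_cache:
            if self._cached is None:
                self._cached = list(self._source())
            return iter(self._cached)
        return self._source()

    def map(self, fn):
        return MiniDataset(lambda: (fn(x) for x in self._iterate()))

    def cache(self):
        self._want_cache = True
        return self

    def reduce(self, fn):
        return reduce(fn, self._iterate())

    def count(self):
        return sum(1 for _ in self._iterate())


reads = []  # records every time the "slow" source is actually read


def slow_source():
    reads.append(1)
    return iter([1.0, 2.0, 3.0, 4.0])


points = MiniDataset(slow_source).map(lambda x: x * 2).cache()
n = points.count()

# Iterative job: repeated passes over the same data, moving toward its mean.
estimate = 0.0
for _ in range(20):
    gradient = points.map(lambda x: x - estimate).reduce(add) / n
    estimate += 0.5 * gradient

print(round(estimate, 3), "source reads:", len(reads))
```

Without the `cache()` call, every pass of the loop would read the source again, which is roughly the cost a chain of separate MapReduce jobs pays when each job reads its input from disk.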

Storm
    • Storm is a distributed, reliable, fault-tolerant stream computing framework.

    • Storm was designed for real-time processing, so it is widely used in real-time analytics, performance monitoring, and other areas that need low latency.

    • Storm can in principle be used with any programming language; only a small amount of adapter code is needed.

    • Storm keeps cluster state in ZooKeeper or on local disk, so its background daemons are stateless (they do not need to hold state themselves) and can fail or restart without affecting the health of the system.

    • Storm can be applied to stream processing, continuous computation (continuously sending data to clients so that they can update and display results such as site metrics in real time), and distributed remote procedure calls (easy parallelization of CPU-intensive operations).
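The stream-processing model can be sketched with Python generators. This is not Storm's API: the names `sentence_spout`, `split_bolt`, and `count_bolt` are hypothetical and only borrow Storm's vocabulary, in which spouts are sources of tuples and bolts process them. The point is that each record flows through the whole pipeline as soon as it arrives, and an updated result is available after every record rather than at the end of a batch.

```python
from collections import Counter


def sentence_spout():
    """Source of the stream; a short finite list stands in for an endless feed."""
    for sentence in ["storm processes streams",
                     "streams never end",
                     "storm is fast"]:
        yield sentence


def split_bolt(stream):
    """Processing step: split each incoming sentence into words."""
    for sentence in stream:
        for word in sentence.split():
            yield word


def count_bolt(stream):
    """Processing step: keep running counts and emit an update per word."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
        yield word, counts[word]


# Wire the steps together; each update appears as soon as its word arrives.
updates = list(count_bolt(split_bolt(sentence_spout())))
for word, running_total in updates:
    print(word, running_total)
```

Compared with the batch model, there is no sort or global barrier here, so the running totals are always current, which is what continuous computation of something like site metrics needs.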

