Storm Notes (2): What Storm Is

Source: Internet
Author: User

  1. What Storm is:

    If you had to describe Storm in one phrase, it would be: a distributed real-time computation system. According to Storm's author, Storm means for real-time computation what Hadoop means for batch processing. Hadoop, modeled on Google's MapReduce, provides map and reduce primitives that make batch processing simple and elegant. Storm, in contrast to Hadoop's batch processing, is a real-time, distributed, and highly fault-tolerant computation system. Like Hadoop, Storm can handle large volumes of data, but it processes that data with lower latency and with high reliability, meaning that every message is processed. Storm can also scale out across many machines to handle large volumes of data, and it has other features as well.

  2. Storm's architecture:

    A Storm cluster consists of one master node and multiple worker nodes. The master node runs a daemon called "Nimbus", which distributes code, assigns tasks, and monitors for failures. Each worker node runs a daemon called "Supervisor", which listens for assigned work and starts and stops worker processes. Both Nimbus and Supervisor are fail-fast and stateless, which makes them very robust; coordination between the two is handled by ZooKeeper. ZooKeeper is also used to manage the different components in the cluster. ZeroMQ is the internal messaging system, and JZMQ is the Java binding for ZeroMQ. There is also a subproject named storm-deploy that can deploy a Storm cluster on AWS with one click.
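
    The division of labor described above can be sketched as a toy model. This is only an illustration, not Storm's code: the `Coordinator` class stands in for ZooKeeper, the round-robin placement is an assumed policy rather than Storm's actual scheduler, and the node and task names are made up.

```python
class Coordinator:
    """Stand-in for ZooKeeper: the only place cluster state is kept."""

    def __init__(self):
        self.alive = set()       # supervisors currently registered
        self.assignments = {}    # task id -> supervisor name


def assign_tasks(coord, tasks):
    """Nimbus-like step: spread tasks over the live supervisors.

    Round-robin is an assumption made for this sketch.
    """
    supervisors = sorted(coord.alive)
    if not supervisors:
        raise RuntimeError("no live supervisors")
    coord.assignments = {
        task: supervisors[i % len(supervisors)]
        for i, task in enumerate(tasks)
    }


def tasks_for(coord, supervisor):
    """Supervisor-like step: read which tasks this node should run."""
    return sorted(t for t, s in coord.assignments.items() if s == supervisor)


coord = Coordinator()
coord.alive.update({"node-a", "node-b"})
tasks = ["task-0", "task-1", "task-2", "task-3"]

assign_tasks(coord, tasks)
before = {s: tasks_for(coord, s) for s in sorted(coord.alive)}

# Simulate a node failure: the node disappears from the shared state,
# and the next assignment pass moves its tasks to the survivors.
coord.alive.discard("node-b")
assign_tasks(coord, tasks)
after = {s: tasks_for(coord, s) for s in sorted(coord.alive)}

print(before)
print(after)
```

    Because neither function keeps state of its own, either side can be restarted and pick up again from the coordinator, which is the point of the stateless, fail-fast design.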

  3. Storm's advantages:

    A. A simple programming model. Just as MapReduce reduces the complexity of parallel batch processing, Storm reduces the complexity of real-time processing.

    B. Service-oriented. Storm is a service framework that supports hot deployment, so applications can be brought online or taken offline immediately.

    C. Multiple programming languages. You can use a variety of programming languages on top of Storm. Clojure, Java, Ruby, and Python are supported by default. To add support for another language, you only need to implement a simple Storm communication protocol.

    D. Fault tolerance. Storm manages failures of worker processes and nodes.

    E. Horizontal scalability. Computation runs in parallel across multiple threads, processes, and servers.

    F. Reliable message processing. Storm guarantees that each message is processed at least once. When a task fails, Storm replays the message from the message source.

    G. Fast. The system is designed so that messages are processed quickly, using ZeroMQ as the underlying message queue.

    H. Local mode. Storm has a "local mode" that fully simulates a Storm cluster in-process. This lets you develop and unit test quickly.
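
    The at-least-once guarantee in item F can be illustrated with a small simulation. This is a sketch under stated assumptions, not Storm's implementation: `run_at_least_once`, `flaky_process`, and the `max_attempts` limit are hypothetical names introduced here, and a plain queue plays the role of the message source.

```python
from collections import deque


def run_at_least_once(messages, process, max_attempts=3):
    """Deliver each message until `process` succeeds or attempts run out.

    `max_attempts` is an assumption of this sketch, not a Storm setting.
    """
    pending = deque((msg, 1) for msg in messages)
    done, dropped = [], []
    while pending:
        msg, attempt = pending.popleft()
        try:
            process(msg)
            done.append(msg)  # "ack": the source can forget the message
        except Exception:
            if attempt < max_attempts:
                # "fail": the source replays the message later
                pending.append((msg, attempt + 1))
            else:
                dropped.append(msg)
    return done, dropped


seen = []  # every delivery, including the ones that failed


def flaky_process(msg):
    seen.append(msg)
    # Hypothetical failure: message "b" fails on its first delivery only.
    if msg == "b" and seen.count("b") == 1:
        raise RuntimeError("worker died")


done, dropped = run_at_least_once(["a", "b", "c"], flaky_process)
print(done, dropped, seen)
```

    Note that "b" is delivered twice. "At least once" means a message may be seen more than once, so the processing step has to tolerate duplicates.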

  4. Storm's drawbacks:

    A. The current open-source version has only a single Nimbus node. If Nimbus goes down, it can only be restarted automatically; a dual-Nimbus layout is worth considering.

    B. Clojure is a dynamic functional programming language that runs on the JVM, and its strength lies in flow-oriented computation. Part of Storm's core is written in Clojure, which improves performance considerably but also raises the maintenance cost.

  5. Storm's application scenario:

    Stream processing. Storm can be used to process incoming messages and write the results to a data store.

    Distributed RPC. Because Storm's processing components are distributed and processing latency is extremely low, Storm can be used as a general-purpose distributed RPC framework. In fact, our search engine is itself a distributed RPC system.
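
    The distributed RPC scenario can be pictured as a request that is fanned out to parallel workers, with the partial results merged before the caller gets an answer. The sketch below is a rough illustration only: it does not use Storm's API, threads stand in for distributed workers, and `SHARDS`, `search_shard`, and `distributed_search` are hypothetical names with made-up data.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical index shards; in a real cluster each would live on its own node.
SHARDS = [
    {"storm": ["doc1", "doc4"], "hadoop": ["doc2"]},
    {"storm": ["doc7"], "zookeeper": ["doc5"]},
    {"hadoop": ["doc9"]},
]


def search_shard(shard, term):
    """Work done by one parallel worker: look the term up in its own shard."""
    return shard.get(term, [])


def distributed_search(term):
    """RPC-style entry point: fan the request out, then merge the answers."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = pool.map(search_shard, SHARDS, [term] * len(SHARDS))
        return sorted(doc for part in partials for doc in part)


print(distributed_search("storm"))
```

    The caller sees an ordinary function call, while the work behind it runs in parallel, which is what makes the search-engine comparison apt.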

