Storm Study Notes: Part One

Source: Internet
Author: User


Storm is a real-time, distributed, and highly fault-tolerant computation system. Like Hadoop, Storm can handle large volumes of data, but it processes that data in real time while remaining highly reliable, meaning that every message is processed. Because it is fault-tolerant and distributed, Storm can also scale out across multiple machines to process large amounts of data.

Similarities and differences between Storm and Hadoop:

1. Once a Storm service has been started, it keeps running until it is explicitly shut down; it does not stop on its own.
2. Real-time behavior: Storm keeps its data in memory, whereas Hadoop uses disk as its exchange medium.
3. Storm's latency is low because data stays in memory, is passed directly over the network, and is computed in memory, which eliminates the time spent collecting a batch.
4. Storm's throughput is lower than Hadoop's, so it is not suitable for batch processing.

A Storm cluster consists mainly of one master node and a group of worker nodes, which are coordinated through ZooKeeper.
Structure of a Storm system:
Master node:
• The master node usually runs a daemon called Nimbus, which distributes code across the cluster, assigns tasks, and monitors for failures. It is very similar to the JobTracker in Hadoop.
Worker node:
• Each worker node also runs a daemon, called Supervisor, which listens for work assignments and runs worker processes as required. Each worker process executes a subset of a topology. Coordination between Nimbus and the Supervisors is handled through a ZooKeeper system or cluster.
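The division of labor described above can be sketched as a toy model. Everything in it is hypothetical: CoordinationStore is only a stand-in for ZooKeeper, the Nimbus and Supervisor classes are not Storm APIs, and the round-robin assignment is an assumption made for illustration, not Storm's actual scheduler.

```python
# Toy model only: none of these classes are Storm or ZooKeeper APIs.
from collections import defaultdict


class CoordinationStore:
    """Stands in for ZooKeeper: a shared place to publish and read state."""

    def __init__(self):
        self.assignments = defaultdict(list)  # supervisor id -> task names
        self.heartbeats = set()               # supervisor ids seen alive


class Supervisor:
    def __init__(self, supervisor_id, store):
        self.supervisor_id = supervisor_id
        self.store = store

    def heartbeat(self):
        # Announce liveness through the store, not directly to Nimbus.
        self.store.heartbeats.add(self.supervisor_id)

    def assigned_tasks(self):
        # Read the work assignment that Nimbus published.
        return list(self.store.assignments[self.supervisor_id])


class Nimbus:
    def __init__(self, store):
        self.store = store

    def assign(self, tasks):
        # Assumption: spread tasks round-robin over the live supervisors.
        alive = sorted(self.store.heartbeats)
        if not alive:
            raise RuntimeError("no live supervisors")
        self.store.assignments.clear()
        for i, task in enumerate(tasks):
            self.store.assignments[alive[i % len(alive)]].append(task)


def demo_assignment():
    store = CoordinationStore()
    supervisors = [Supervisor(name, store) for name in ("sup-a", "sup-b")]
    for s in supervisors:
        s.heartbeat()
    Nimbus(store).assign(["spout-1", "bolt-1", "bolt-2"])
    return {s.supervisor_id: s.assigned_tasks() for s in supervisors}
```

The point of the sketch is that Nimbus and the Supervisors never call each other directly; both sides only read and write the shared store, which is the role the text gives to ZooKeeper.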
ZooKeeper:
• ZooKeeper is the service that handles coordination between the Supervisors and Nimbus.
The application's real-time logic is encapsulated in a Storm "topology". A topology is a graph of spouts (data sources) and bolts (data operations) connected by stream groupings. The terms that appear here are explained in more detail below.
Spout:
• In short, a spout reads data from a source and emits it into the topology. Spouts are either reliable or unreliable. When Storm fails to process a tuple (a tuple is a list of data items), a reliable spout re-emits it, whereas an unreliable spout does not track whether the tuple was processed successfully and emits each tuple only once. The most important method of a spout is nextTuple(), which emits a new tuple into the topology, or simply returns if there is no new tuple to emit.
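The difference between the two kinds of spout can be sketched in a few lines. This is a toy model under stated assumptions, not the Storm API: the class names, the message ids, and the run_reliable driver are all hypothetical, and next_tuple() here returns the tuple directly, whereas a real spout's nextTuple() emits through a collector.

```python
# Toy model only: these classes imitate the idea, not the Storm API.
from collections import deque


class UnreliableSpout:
    """Emits each item once and never replays it."""

    def __init__(self, items):
        self.pending = deque(items)

    def next_tuple(self):
        # Return the next tuple, or None when there is nothing to emit.
        if not self.pending:
            return None
        return (self.pending.popleft(),)

    def fail(self, msg_id):
        pass  # failures are ignored


class ReliableSpout:
    """Remembers in-flight tuples by message id and replays failed ones."""

    def __init__(self, items):
        self.pending = deque(enumerate(items))  # (msg_id, item) pairs
        self.in_flight = {}

    def next_tuple(self):
        if not self.pending:
            return None
        msg_id, item = self.pending.popleft()
        self.in_flight[msg_id] = item
        return msg_id, (item,)

    def ack(self, msg_id):
        # Fully processed: forget the tuple.
        self.in_flight.pop(msg_id, None)

    def fail(self, msg_id):
        # Processing failed: queue the same tuple for re-emission.
        item = self.in_flight.pop(msg_id)
        self.pending.append((msg_id, item))


def run_reliable(items, fail_once):
    """Drive a ReliableSpout; ids in fail_once fail on their first attempt."""
    spout = ReliableSpout(items)
    remaining_failures = set(fail_once)
    processed = []
    while True:
        emitted = spout.next_tuple()
        if emitted is None:
            break
        msg_id, values = emitted
        if msg_id in remaining_failures:
            remaining_failures.discard(msg_id)
            spout.fail(msg_id)
        else:
            processed.append(values[0])
            spout.ack(msg_id)
    return processed
```

For example, run_reliable(["a", "b", "c"], fail_once={1}) still processes all three items; "b" is simply replayed after "c", so the result is ["a", "c", "b"]. With an UnreliableSpout the failed item would be lost.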
Bolt:
• All processing in a topology is done in bolts. A bolt can do anything, for example filtering, aggregating, or accessing files and databases. A bolt receives data from a spout and processes it; for complex stream processing, it may also emit tuples to another bolt for further processing. The most important method of a bolt is execute(), which takes a new tuple as its input. Whether in a spout or a bolt, if tuples are emitted to multiple streams, those streams can be declared with declareStream().
Topology:
• A topology is the package of computation logic: a graph of spouts and bolts that are connected by stream groupings.
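To make the terms concrete, here is a minimal word-count sketch of a spout feeding two bolts. It uses plain Python rather than the Storm API, so every name in it (sentence_spout, SplitBolt, CountBolt, fields_grouping, run_topology) is hypothetical. The grouping function imitates the idea of a fields grouping, in which tuples with the same field value always go to the same task; the sum of character codes is only a deterministic stand-in for a real hash.

```python
# Toy model only: plain Python, not the Storm API.
from collections import Counter


def sentence_spout(sentences):
    """Spout: yields one-field tuples read from a source."""
    for sentence in sentences:
        yield (sentence,)


class SplitBolt:
    """Bolt: turns each sentence tuple into one tuple per word."""

    def execute(self, tup):
        return [(word.lower(),) for word in tup[0].split()]


class CountBolt:
    """Bolt: keeps a running count of the words it has received."""

    def __init__(self):
        self.counts = Counter()

    def execute(self, tup):
        self.counts[tup[0]] += 1
        return []


def fields_grouping(tup, num_tasks):
    """Route tuples with the same first field to the same task index."""
    # Sum of code points: a deterministic stand-in for a hash function.
    return sum(map(ord, tup[0])) % num_tasks


def run_topology(sentences, num_counters=2):
    """Wire spout -> SplitBolt -> CountBolt tasks and run to completion."""
    splitter = SplitBolt()
    counters = [CountBolt() for _ in range(num_counters)]
    for sentence_tuple in sentence_spout(sentences):
        for word_tuple in splitter.execute(sentence_tuple):
            task = fields_grouping(word_tuple, num_counters)
            counters[task].execute(word_tuple)
    return [dict(c.counts) for c in counters]
```

Because the grouping sends every occurrence of a word to the same CountBolt task, each word's total ends up in exactly one counter, and the per-task results can be merged without double counting. Unlike this sketch, a real topology runs continuously instead of returning once the input is exhausted.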




