Storm basic knowledge

Source: Internet
Author: User
Tags: hadoop, mapreduce

In the previous article, I briefly introduced the origin of Storm. Today I continue with some basic knowledge of Storm and how it is used. Fortunately, it is not too difficult: if you understand the Hadoop MapReduce model, Storm's model is similar. First, here are the concepts I picked up while studying Storm's basic model.

1. Tuple: the basic unit of message transmission. The fields in a tuple can hold objects of any type. Tuples are what a bolt receives in its execute method, described later.
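As a rough illustration of what a tuple carries, the sketch below models one as an ordered list of values addressed by declared field names. SimpleTuple is a hypothetical stand-in written for this article, not Storm's own Tuple interface, although the real interface offers a similar lookup by field name.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Storm tuple: an ordered list of values
// whose positions are named by a parallel list of field names.
class SimpleTuple {
    private final List<String> fields;
    private final List<Object> values;

    SimpleTuple(List<String> fields, List<Object> values) {
        if (fields.size() != values.size()) {
            throw new IllegalArgumentException("one value per field is required");
        }
        this.fields = fields;
        this.values = values;
    }

    // Look a value up by its field name.
    Object getValueByField(String field) {
        int index = fields.indexOf(field);
        if (index < 0) {
            throw new IllegalArgumentException("unknown field: " + field);
        }
        return values.get(index);
    }

    // Look a value up by its position.
    Object getValue(int index) {
        return values.get(index);
    }
}

class SimpleTupleDemo {
    public static void main(String[] args) {
        // Values can be of any object type, here a String and an Integer.
        SimpleTuple tuple = new SimpleTuple(
                Arrays.asList("word", "count"),
                Arrays.<Object>asList("storm", 3));
        System.out.println(tuple.getValueByField("word"));
        System.out.println(tuple.getValue(1));
    }
}
```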

2. Spout: the source of a topology's streams, that is, the component that produces the source data. A spout can get its data in several ways: (1) by connecting directly to the data source, (2) by reading messages from a message queue, or (3) through DRPC. In the Hadoop analogy, a spout is roughly comparable to map.
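To make the message-queue case concrete, here is a minimal model of a spout that drains a queue. QueueSpout and its emit callback are hypothetical names, and an in-memory Queue stands in for a real message broker. The nextTuple method only mirrors the idea of the method that Storm calls repeatedly on a spout; nothing here uses the Storm API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical model of a spout: each call to nextTuple() takes at most
// one message from the queue and emits it downstream as a one-field tuple.
class QueueSpout {
    private final Queue<String> source;
    private final Consumer<List<Object>> emit;

    QueueSpout(Queue<String> source, Consumer<List<Object>> emit) {
        this.source = source;
        this.emit = emit;
    }

    // Called over and over by the runtime; it returns immediately
    // when there is nothing to read.
    void nextTuple() {
        String message = source.poll();
        if (message != null) {
            emit.accept(Arrays.<Object>asList(message));
        }
    }
}

class QueueSpoutDemo {
    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>(Arrays.asList("a", "b"));
        List<List<Object>> emitted = new ArrayList<>();
        QueueSpout spout = new QueueSpout(queue, emitted::add);
        for (int i = 0; i < 3; i++) {
            spout.nextTuple(); // the third call finds the queue empty
        }
        System.out.println(emitted);
    }
}
```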

3. Bolt: the component that carries all of the processing logic in a topology. A bolt can perform joins, statistical analysis, and other logical operations. Its processing method is the execute() method just mentioned.
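The statistical case can be sketched as a word-counting bolt. WordCountBolt is a hypothetical, Storm-free model: its execute method receives one tuple (modeled here as a List<Object> whose first value is a word) and updates a running count, which is the kind of work a real bolt would do in execute().

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a bolt that keeps a running count per word.
class WordCountBolt {
    private final Map<String, Integer> counts = new HashMap<>();

    // Process one incoming tuple and return the updated (word, count)
    // pair, which a real bolt would emit to the next component.
    List<Object> execute(List<Object> tuple) {
        String word = (String) tuple.get(0);
        int count = counts.merge(word, 1, Integer::sum);
        return Arrays.<Object>asList(word, count);
    }

    // Current count for a word, or 0 if it has not been seen.
    int countOf(String word) {
        return counts.getOrDefault(word, 0);
    }
}

class WordCountBoltDemo {
    public static void main(String[] args) {
        WordCountBolt bolt = new WordCountBolt();
        for (String word : new String[] {"storm", "hadoop", "storm"}) {
            System.out.println(bolt.execute(Arrays.<Object>asList(word)));
        }
    }
}
```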

4. Topology: a real-time application running in Storm. A complete spout-and-bolt pipeline runs inside one topology. A topology in Storm is equivalent to a job in Hadoop, with one difference: a MapReduce job eventually finishes, whereas a topology keeps running until it is killed. In Java, TopologyBuilder is used to build a topology.

5. Nimbus and Supervisor processes. The master node runs the Nimbus daemon, which assigns tasks to nodes and monitors for failures, similar to Hadoop's JobTracker. Each worker node runs the Supervisor daemon, which listens for the work assigned to its host.

6. Worker (worker process), executor (thread), and task are closely related. A node can run one or more worker processes, and each worker process belongs to a single topology. A worker runs one or more executors, and each executor runs one or more tasks of a single spout or bolt. Together, these three levels largely determine the concurrency of a Storm topology.
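The relationship can be checked with a little arithmetic. The sketch below uses made-up numbers (two workers, one spout, and two bolts with hypothetical parallelism settings) and assumes, as a simplification, that executors are spread evenly over the workers. ComponentSpec and ParallelismSketch are illustrative names, not Storm classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical description of one component's parallelism settings.
class ComponentSpec {
    final String name;
    final int executors; // threads running this component
    final int tasks;     // component instances shared among those threads

    ComponentSpec(String name, int executors, int tasks) {
        if (executors < 1 || tasks < executors) {
            throw new IllegalArgumentException("need 1 <= executors <= tasks");
        }
        this.name = name;
        this.executors = executors;
        this.tasks = tasks;
    }
}

class ParallelismSketch {
    static int totalExecutors(List<ComponentSpec> specs) {
        int sum = 0;
        for (ComponentSpec spec : specs) {
            sum += spec.executors;
        }
        return sum;
    }

    static int totalTasks(List<ComponentSpec> specs) {
        int sum = 0;
        for (ComponentSpec spec : specs) {
            sum += spec.tasks;
        }
        return sum;
    }

    // Ceiling division: the most executors any single worker hosts
    // when executors are spread evenly (an assumption of this sketch).
    static int maxExecutorsPerWorker(int totalExecutors, int workers) {
        return (totalExecutors + workers - 1) / workers;
    }

    public static void main(String[] args) {
        List<ComponentSpec> specs = Arrays.asList(
                new ComponentSpec("spout", 2, 2),
                new ComponentSpec("bolt-a", 2, 4),   // 2 tasks per executor
                new ComponentSpec("bolt-b", 6, 6));
        int executors = totalExecutors(specs);
        System.out.println("executors: " + executors);
        System.out.println("tasks: " + totalTasks(specs));
        System.out.println("executors per worker: " + maxExecutorsPerWorker(executors, 2));
    }
}
```

With these numbers the topology has 10 executors and 12 tasks, so each of the two workers hosts 5 executors.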


This layering also means that tasks in Storm can execute in parallel. The following is an example of a very simple topology:

    // Storm 1.0 and later use the org.apache.storm package;
    // earlier releases used backtype.storm instead.
    import org.apache.storm.Config;
    import org.apache.storm.LocalCluster;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.topology.TopologyBuilder;

    // SimpleSpout and SimpleBolt are the author's own components (not shown).
    public class SimpleTopology {
        public static void main(String[] args) {
            try {
                // Instantiate the TopologyBuilder.
                TopologyBuilder topologyBuilder = new TopologyBuilder();
                // Register the spout with a parallelism of 1. This number
                // controls how many threads run the component in the cluster.
                topologyBuilder.setSpout("simplespout", new SimpleSpout(), 1);
                // Register the bolt with a parallelism of 3. Shuffle grouping
                // means the bolt receives the spout's tuples at random.
                topologyBuilder.setBolt("simplebolt", new SimpleBolt(), 3)
                        .shuffleGrouping("simplespout");
                Config config = new Config();
                config.setDebug(true);
                if (args != null && args.length > 0) {
                    // Cluster mode: args[0] is the topology name.
                    config.setNumWorkers(1);
                    StormSubmitter.submitTopology(args[0], config,
                            topologyBuilder.createTopology());
                } else {
                    // Local mode: run the topology in-process for testing.
                    config.setMaxTaskParallelism(1);
                    LocalCluster cluster = new LocalCluster();
                    cluster.submitTopology("simple", config,
                            topologyBuilder.createTopology());
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
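The shuffle grouping used above can be pictured with a small Storm-free simulation. ShuffleSketch is a hypothetical model, not Storm's implementation: it shuffles the list of target task indices once and then cycles through it, so each of the bolt's three tasks ends up with an equal share of tuples while the assignment order is random.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical model of shuffle grouping: pick target tasks in a
// random but balanced order.
class ShuffleSketch {
    private final List<Integer> order = new ArrayList<>();
    private int next = 0;

    ShuffleSketch(int taskCount, Random random) {
        for (int i = 0; i < taskCount; i++) {
            order.add(i);
        }
        // Randomize the order in which tasks are visited.
        Collections.shuffle(order, random);
    }

    // Return the index of the task that receives the next tuple.
    int chooseTask() {
        int task = order.get(next);
        next = (next + 1) % order.size();
        return task;
    }

    public static void main(String[] args) {
        ShuffleSketch grouping = new ShuffleSketch(3, new Random());
        int[] received = new int[3];
        for (int i = 0; i < 9; i++) {
            received[grouping.chooseTask()]++;
        }
        // Nine tuples over three tasks: every task receives three.
        System.out.println(Arrays.toString(received));
    }
}
```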

The example above shows how to submit a topology both in local test mode and in cluster mode. To compare Hadoop and Storm, I made the following table:

Comparison item       Hadoop        Storm
System roles          JobTracker    Nimbus
                      TaskTracker   Supervisor
Application name      Job           Topology
Component interface   Map/Reduce    Spout/Bolt

The above is a brief summary of my recent study of Storm.
