Spouts and Bolts

Last Update: 2017-01-10 Source: Internet

Author: User

Tags: emit, zookeeper


Storm's core is written in Clojure, its utilities are written in Python, and topologies are typically developed in Java.

On the surface, a Storm cluster resembles a Hadoop cluster. On Hadoop you run "MapReduce jobs", whereas on Storm you run "topologies". Jobs and topologies are very different: a MapReduce job eventually finishes, while a topology keeps processing messages forever (or until you kill it).
A Storm cluster has two types of nodes: the master node and the worker nodes.
The master node runs a daemon called "Nimbus", which is similar to Hadoop's "JobTracker". Nimbus is responsible for distributing code around the cluster, assigning tasks to machines, and monitoring for failures.
Each worker node runs a daemon called the "Supervisor". The Supervisor listens for work assigned to its machine and starts or stops worker processes based on what Nimbus has assigned to it. Each worker process executes a subset of a topology (that is, a sub-topology); a running topology consists of many worker processes spread across multiple machines.
A ZooKeeper cluster handles all coordination between Nimbus and the Supervisors (a full topology may be divided into several sub-topologies that are carried out by different Supervisors).

In addition, both the Nimbus daemon and the Supervisor daemons are fail-fast and stateless; all state is kept in ZooKeeper or on local disk. This means you can kill the Nimbus or Supervisor processes with kill -9 and restart them, and they will recover their state and continue working as if nothing had happened. This design makes Storm extremely stable. The master does not communicate directly with the workers but goes through ZooKeeper as an intermediary. This decouples the master from the workers, and because state is stored in the ZooKeeper cluster, any failed party can be recovered quickly.
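The recovery behaviour described above can be illustrated with a minimal sketch. It is not Storm code: `DurableStore` and `StatelessDaemon` are hypothetical names, and an in-memory map stands in for ZooKeeper or local disk. The point is only that a daemon which writes every decision to external storage can be killed and replaced without losing anything.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for ZooKeeper or local disk: state that outlives the daemon.
class DurableStore {
    private final Map<String, String> data = new HashMap<>();

    void put(String key, String value) { data.put(key, value); }

    String get(String key) { return data.get(key); }
}

// Hypothetical daemon that keeps no important state in its own fields.
class StatelessDaemon {
    private final DurableStore store;

    StatelessDaemon(DurableStore store) { this.store = store; }

    // Every decision is written to the store before it counts as made.
    void assign(String task, String worker) { store.put(task, worker); }

    // Reads go to the store, so a freshly started instance sees earlier decisions.
    String ownerOf(String task) { return store.get(task); }
}

class StatelessDaemonDemo {
    public static void main(String[] args) {
        DurableStore store = new DurableStore();
        StatelessDaemon first = new StatelessDaemon(store);
        first.assign("task-1", "worker-a");
        first = null; // simulate kill -9: the process and its memory are gone

        StatelessDaemon restarted = new StatelessDaemon(store);
        System.out.println(restarted.ownerOf("task-1")); // prints "worker-a"
    }
}
```

The restarted instance answers correctly because it never relied on its predecessor's memory, which is the property that makes an abrupt kill safe.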

There is generally only one Nimbus (master) node, while there can be multiple Supervisor (slave) nodes.

The Nimbus node receives the request, splits the submitted topology into tasks, and writes the task and Supervisor assignment information to the ZooKeeper cluster. Each Supervisor then fetches its own tasks from the ZooKeeper cluster and notifies its worker processes to execute them.
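The hand-off through the coordination store can be sketched as follows. This is an illustrative model rather than Storm's scheduler: `CoordinationStore` and `MasterSketch` are hypothetical names, the store is an in-memory map instead of ZooKeeper, and round-robin placement is an assumption made only to keep the example small.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the ZooKeeper cluster.
class CoordinationStore {
    // supervisor id -> ids of the tasks assigned to it
    private final Map<String, List<Integer>> assignments = new LinkedHashMap<>();

    void write(String supervisorId, int taskId) {
        assignments.computeIfAbsent(supervisorId, k -> new ArrayList<>()).add(taskId);
    }

    List<Integer> read(String supervisorId) {
        return assignments.getOrDefault(supervisorId, new ArrayList<>());
    }
}

// Hypothetical master: publishes assignments and never calls a supervisor directly.
class MasterSketch {
    static void assign(int taskCount, List<String> supervisors, CoordinationStore store) {
        for (int task = 0; task < taskCount; task++) {
            // Round-robin placement is an assumption of this sketch.
            String target = supervisors.get(task % supervisors.size());
            store.write(target, task);
        }
    }
}

class AssignmentDemo {
    public static void main(String[] args) {
        CoordinationStore store = new CoordinationStore();
        MasterSketch.assign(5, Arrays.asList("sup-1", "sup-2"), store);
        // Each supervisor looks up only its own entry.
        System.out.println(store.read("sup-1")); // prints [0, 2, 4]
        System.out.println(store.read("sup-2")); // prints [1, 3]
    }
}
```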

The main methods of a spout are:

open(Map conf, TopologyContext context, SpoutOutputCollector collector)
close()
nextTuple()
ack(Object msgId)
fail(Object msgId)

open(): The initialization method.

close(): Called when the spout is shut down. The call is not guaranteed, because a Supervisor node in the cluster can kill the worker process with kill -9. close() is only guaranteed to execute when Storm is running in local mode and the topology is stopped with a stop command.

declareOutputFields method:

Declares the field names of the tuples to be emitted.

void ack(Object msgId)

Callback invoked when a tuple has been processed successfully. A typical implementation removes the message from the message queue so that it is not sent again.

void fail(Object msgId)

Callback invoked when processing of a tuple fails. A typical implementation puts the message back on the message queue so that it is re-sent later.
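The bookkeeping behind ack and fail can be sketched in plain Java. `PendingTracker` is a hypothetical helper, not part of Storm, that a spout could delegate to from nextTuple(), ack() and fail(). The sketch uses a `long` message id for simplicity, whereas the spout callbacks receive an `Object`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: tracks which messages are waiting and which are in flight.
class PendingTracker {
    private final Deque<String> queue = new ArrayDeque<>();    // waiting to be emitted
    private final Map<Long, String> pending = new HashMap<>(); // emitted, not yet acked
    private long nextId = 0;

    void enqueue(String message) { queue.addLast(message); }

    // For nextTuple(): returns the msgId to emit with, or -1 if nothing is waiting.
    long take() {
        String message = queue.pollFirst();
        if (message == null) return -1;
        long msgId = nextId++;
        pending.put(msgId, message);
        return msgId;
    }

    String payload(long msgId) { return pending.get(msgId); }

    // For ack(msgId): the message is done, so forget it.
    void ack(long msgId) { pending.remove(msgId); }

    // For fail(msgId): put the message back so a later take() re-emits it.
    void fail(long msgId) {
        String message = pending.remove(msgId);
        if (message != null) queue.addLast(message);
    }

    int pendingCount() { return pending.size(); }

    int queuedCount() { return queue.size(); }
}

class PendingTrackerDemo {
    public static void main(String[] args) {
        PendingTracker tracker = new PendingTracker();
        tracker.enqueue("hello");
        long id = tracker.take();
        tracker.fail(id);               // failed: goes back on the queue
        long retryId = tracker.take();  // re-emitted under a new id
        tracker.ack(retryId);           // succeeded: removed for good
        System.out.println(tracker.pendingCount() + " pending, "
                + tracker.queuedCount() + " queued"); // prints "0 pending, 0 queued"
    }
}
```

Because nextTuple, ack and fail run on the same spout thread, a helper like this needs no synchronization when it is used only from those callbacks.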

nextTuple()

The Storm framework calls this method continuously, and the spout uses it to emit tuples to the SpoutOutputCollector. The method should be non-blocking. nextTuple, ack, and fail are all called on the same thread of the spout task.

    public void nextTuple() {
        // Emit the current sentence, then advance and wrap around at the end.
        this.collector.emit(new Values(sentences[index]));
        index++;
        if (index >= sentences.length) {
            index = 0;
        }
        Utils.sleep(1);
    }

To implement a spout, you can implement IRichSpout directly or extend BaseRichSpout, which lets you write a little less code.

Bolt

prepare(): Similar to open() in a spout or the setup method of a Mapper/Reducer. It is called when the task is initialized and provides the bolt with its execution environment.

void cleanup(): Called before the bolt shuts down; it is not guaranteed to be executed.

execute(): Receives a tuple, processes it, and reports the result by calling ack or fail on the OutputCollector that was passed in through prepare().

To implement a bolt, you can implement the IRichBolt interface or extend BaseRichBolt. If you do not want to handle result feedback yourself, you can implement the IBasicBolt interface or extend BaseBasicBolt, which in effect calls collector.ack(inputTuple) for you automatically after execute().
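The automatic feedback can be pictured as a thin wrapper around the bolt. The sketch below uses hypothetical stand-ins (`SimpleBolt`, `AckCollector`, `AutoAckExecutor`) with tuples reduced to strings; the real Storm types have richer signatures. It also treats any exception as a failure, which is a simplifying assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal stand-ins for the collector and the bolt.
interface AckCollector {
    void ack(String tuple);
    void fail(String tuple);
}

interface SimpleBolt {
    // Throws to signal that the tuple could not be processed.
    void execute(String tuple) throws Exception;
}

// Wraps a bolt so that its author never has to ack or fail by hand.
class AutoAckExecutor {
    private final SimpleBolt bolt;
    private final AckCollector collector;

    AutoAckExecutor(SimpleBolt bolt, AckCollector collector) {
        this.bolt = bolt;
        this.collector = collector;
    }

    void execute(String tuple) {
        try {
            bolt.execute(tuple);
            collector.ack(tuple);  // normal return: report success
        } catch (Exception e) {
            collector.fail(tuple); // exception: report failure so the tuple can be replayed
        }
    }
}

// Collector that simply records what was reported, for demonstration.
class RecordingCollector implements AckCollector {
    final List<String> acked = new ArrayList<>();
    final List<String> failed = new ArrayList<>();

    public void ack(String tuple) { acked.add(tuple); }

    public void fail(String tuple) { failed.add(tuple); }
}

class AutoAckDemo {
    public static void main(String[] args) {
        RecordingCollector collector = new RecordingCollector();
        SimpleBolt rejectEmpty = tuple -> {
            if (tuple.isEmpty()) throw new IllegalArgumentException("empty tuple");
        };
        AutoAckExecutor executor = new AutoAckExecutor(rejectEmpty, collector);
        executor.execute("hello");
        executor.execute("");
        System.out.println(collector.acked.size() + " acked, "
                + collector.failed.size() + " failed"); // prints "1 acked, 1 failed"
    }
}
```

The trade-off is control: a wrapper like this fits bolts that finish with a tuple inside a single execute() call, while a bolt that holds tuples across calls needs to ack them itself.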


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
