How Twitter Storm ensures that messages are not lost


Reposted from: http://xumingming.sinaapp.com/127/twitter-storm如何保证消息不丢失/

Storm guarantees that every tuple emitted from a spout is fully processed. This article describes how Storm provides this guarantee, and what we, as users, need to do to take full advantage of Storm's reliability.

What does it mean for a tuple to be "fully processed"?

Just like the butterfly effect, a single tuple emitted from a spout can trigger the creation of thousands of other tuples. Consider, for example, a topology that counts the number of occurrences of each word in a stream of sentences.

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout(1, new KestrelSpout("kestrel.backtype.com",
                                     22133,
                                     "sentence_queue",
                                     new StringScheme()));
builder.setBolt(2, new SplitSentence(), 10)
       .shuffleGrouping(1);
builder.setBolt(3, new WordCount(), 20)
       .fieldsGrouping(2, new Fields("word"));

This topology reads sentences from a Kestrel queue, splits each sentence into words, and then emits each word: one source tuple (a sentence) gives rise to many downstream tuples (the words). The message flow looks roughly like this:

[Figure: the tuple tree produced when counting word occurrences in a sentence]

In Storm, a tuple being "fully processed" means that the tuple itself and all tuples derived from it have been successfully processed. A tuple is considered failed if its tree is not fully processed within a specified timeout, which can be configured with Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS.
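
For example, here is a minimal sketch of setting this timeout when configuring a topology; the 30-second value is just an illustration (it matches Storm's default):

Config conf = new Config();
// Fail a spout tuple if its tree is not fully processed within 30 seconds.
conf.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 30);
// Equivalently: conf.setMessageTimeoutSecs(30);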

What happens when a message is fully processed or fails?

For reference, here is the interface that a spout must implement:

public interface ISpout extends Serializable {
    void open(Map conf, TopologyContext context,
              SpoutOutputCollector collector);
    void close();
    void nextTuple();
    void ack(Object msgId);
    void fail(Object msgId);
}

First, Storm asks the spout for the next tuple by calling its nextTuple method. The spout emits a new tuple to one of its output streams through the SpoutOutputCollector provided as a parameter to the open method. When emitting a tuple, the spout can supply a message id, which Storm will later use to identify and track the tuple. For example, KestrelSpout reads a message from the Kestrel queue and uses the message id that Kestrel provides as the tuple's message id:

_collector.emit(new Values("field1", "field2", 3), msgId);

Next, the emitted tuple is routed to the bolts that consume it, and Storm tracks the resulting tuple tree. If Storm detects that a tuple tree is fully processed, it calls the ack method of the source spout, passing the original message id as the argument; if processing fails or times out, it calls the spout's fail method instead. Note that Storm always calls ack or fail on the exact task that produced the tuple: if a spout runs as many parallel tasks, the success or failure of a message is always reported to the task that originally emitted it.

Let's take KestrelSpout as an example to see what a spout needs to do to guarantee that a message is always fully processed. When KestrelSpout reads a message from Kestrel, it first "opens" the message: the message stays in the Kestrel queue, but is marked "in progress" until ack or fail is called. Messages in the "in progress" state are not delivered to other consumers, and if the client disconnects, all of its "in progress" messages are put back into the "pending" state.
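
To make this concrete, here is a minimal sketch of such a spout; KestrelClientStub and KestrelMessageStub (with its body and id fields) are hypothetical stand-ins for a real Kestrel client, not an actual API (imports omitted, as in the other listings):

public class SketchKestrelSpout implements ISpout {
    private SpoutOutputCollector _collector;
    private KestrelClientStub kestrelClient;  // hypothetical client

    public void open(Map conf, TopologyContext context,
                     SpoutOutputCollector collector) {
        _collector = collector;
        kestrelClient = new KestrelClientStub("kestrel.backtype.com", 22133);
    }

    public void nextTuple() {
        // "Open" a message: it stays in the queue, marked "in progress".
        KestrelMessageStub msg = kestrelClient.open("sentence_queue");
        if (msg != null) {
            // Use Kestrel's message id as this tuple's message id.
            _collector.emit(new Values(msg.body), msg.id);
        }
    }

    public void ack(Object msgId) {
        // The tree is fully processed: remove the message for good.
        kestrelClient.confirm((Long) msgId);
    }

    public void fail(Object msgId) {
        // Failed or timed out: return the message to the "pending" state.
        kestrelClient.abort((Long) msgId);
    }

    public void close() { }
}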

Storm's Reliability API

As a Storm user, you need to do two things to take advantage of Storm's reliability features. First, tell Storm whenever you create a new tuple in a tree; second, tell Storm whenever you finish processing a tuple. This lets Storm detect when a tuple tree has (or has not) been fully processed and notify the source spout of the result. Storm provides a few simple APIs for both.

Telling Storm that one tuple gives rise to a new tuple is called anchoring; you anchor the new tuple at the moment you emit it. Consider this example: the bolt splits a tuple containing a sentence into one tuple per word.

public class SplitSentence implements IRichBolt {
    OutputCollector _collector;

    public void prepare(Map conf,
                        TopologyContext context,
                        OutputCollector collector) {
        _collector = collector;
    }

    public void execute(Tuple tuple) {
        String sentence = tuple.getString(0);
        for (String word : sentence.split(" ")) {
            // Anchored emit: each word tuple is tied to the sentence tuple.
            _collector.emit(tuple, new Values(word));
        }
        _collector.ack(tuple);
    }

    public void cleanup() {
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}

Look at the execute method: the first argument to emit is the input tuple and the second is the new output tuple, so the call anchors the output tuple to the input tuple. Because each "word tuple" is anchored to the "sentence tuple", if any word fails to be processed, the whole sentence will be replayed. For comparison, let's see what happens if a new tuple is emitted like this:

_collector.emit(new Values(word));

Emitting a tuple this way detaches it from the original tuple tree (this is called unanchoring): if this tuple fails, the original sentence will not be replayed. Whether to anchor or unanchor is entirely a matter of your business requirements.

An output tuple can be anchored to more than one input tuple, which is useful when merging or aggregating streams. If a multi-anchored tuple fails, all of the input tuples it is anchored to are replayed. The following shows how to specify multiple input tuples:

List<Tuple> anchors = new ArrayList<Tuple>();
anchors.add(tuple1);
anchors.add(tuple2);
_collector.emit(anchors, new Values(1, 2, 3));

Multi-anchoring adds the new tuple to several tuple trees at once.

Anchoring is how the tuple tree is built. The last thing to do is to tell Storm when you have finished processing a tuple, using the ack and fail methods of the OutputCollector class. If you look back at the SplitSentence example, you can see that the sentence tuple is acked after all the word tuples have been emitted.

You can also call the OutputCollector's fail method to immediately mark the spout tuple at the root of the tree as failed. For example, if your bolt queries a database and hits an error, you can fail the input tuple right away so that the spout tuple is replayed quickly, instead of waiting for the timeout to expire.
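
For instance, a sketch of an execute method that fails fast; queryDatabase is a hypothetical helper standing in for your own data-access code:

public void execute(Tuple tuple) {
    try {
        String result = queryDatabase(tuple.getString(0));  // hypothetical helper
        _collector.emit(tuple, new Values(result));          // anchored emit
        _collector.ack(tuple);
    } catch (Exception e) {
        // Don't wait for the timeout; replay the spout tuple right away.
        _collector.fail(tuple);
    }
}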

Every tuple you process must be either acked or failed, because Storm uses memory to track each tuple: if you don't ack or fail every tuple, you will eventually run into OutOfMemory errors.

Most bolts follow a common pattern: read an input tuple, emit some new tuples anchored to it, and ack the input tuple at the end of execute. Such bolts are typically filters or simple functions. Storm encapsulates this pattern in the IBasicBolt interface. Using IBasicBolt, the SplitSentence above can be rewritten like this:

public class SplitSentence implements IBasicBolt {
    public void prepare(Map conf,
                        TopologyContext context) {
    }

    public void execute(Tuple tuple,
                        BasicOutputCollector collector) {
        String sentence = tuple.getString(0);
        for (String word : sentence.split(" ")) {
            // Emitted tuples are automatically anchored to the input tuple.
            collector.emit(new Values(word));
        }
        // The input tuple is automatically acked when execute returns.
    }

    public void cleanup() {
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}

This implementation is much simpler than the previous one and functionally identical: tuples emitted through the BasicOutputCollector are automatically anchored to the input tuple, and the input tuple is automatically acked when execute returns.
In contrast, bolts that aggregate or join streams may have to process many tuples before they can ack, and those acks usually involve multi-anchored tuples, which is beyond what IBasicBolt handles.

How does Storm implement reliability efficiently?

Storm has a special kind of task called an acker, which tracks the tuple tree of every tuple emitted by a spout. When an acker finds that a tuple tree is fully processed, it sends a message to the task that produced the root spout tuple. You can set the number of ackers in a topology via Config.TOPOLOGY_ACKERS; the default is one. If your topology emits a large volume of tuples, increase this number for better throughput.
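
A minimal sketch of raising the acker count when configuring a topology (the value 3 is arbitrary):

Config conf = new Config();
// Run three acker tasks instead of the default one.
conf.setNumAckers(3);  // equivalent to conf.put(Config.TOPOLOGY_ACKERS, 3)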

The best way to understand Storm's reliability implementation is to look at the life cycle of tuples and tuple trees. When a tuple is created, whether by a spout or a bolt, it is given a random 64-bit id, and ackers use these ids to track every tuple. Each tuple also knows the ids of its root spout tuples, and whenever you emit a new tuple, the root ids are copied into it. So when a tuple is acked, it sends a message to the appropriate acker describing how the tuple tree has changed; in essence it says: "I have finished processing in this root's tree, and here are the new tuples I anchored, track them instead." The following figure shows how the tuple tree changes after C is acked.

[Figure: how the tuple tree changes after tuple C is acked]

There are a few more details to how Storm tracks tuples. As mentioned above, a topology can have any number of ackers, which raises a question: when a tuple needs to be acked, which acker should receive the message?

Storm uses consistent hashing to map a spout-tuple-id to an acker. Because every tuple carries the ids of all its root spout tuples, it knows exactly which ackers to notify. (The roots here are all the spout tuples whose trees this tuple belongs to; note that a tuple can belong to more than one tuple tree, so it may have more than one root.)
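
As a simplified illustration of the idea (not Storm's actual internals), the mapping could look like the following, where ackerTaskIds stands for the topology's list of acker task ids:

// Every tuple in a given root's tree computes the same index,
// so all ack messages for that tree reach the same acker task.
static int ackerFor(long spoutTupleId, List<Integer> ackerTaskIds) {
    int index = (int) Math.floorMod(spoutTupleId, (long) ackerTaskIds.size());
    return ackerTaskIds.get(index);
}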

Another detail is how an acker knows which task to notify when a spout tuple's tree completes. When a spout emits a new tuple, it sends a message to the appropriate acker announcing its own task id, so the acker records the task-id-to-tuple-id mapping. When the acker later finds that a tree has been fully processed, it knows which task to send the completion message to.

An acker task does not track a tuple tree explicitly. For a tree with tens of thousands of nodes, tracking every tuple would consume far too much memory. Instead, ackers use a technique that needs only a constant amount of memory per spout tuple (roughly 20 bytes). This tracking algorithm is the key to how Storm works, and it is a major breakthrough.

An acker task stores a map from spout-tuple-id to a pair of values. The first value is the task id of the task that created the spout tuple, used later to send the completion message. The second value is a 64-bit number called the "ack val". The ack val summarizes the state of the entire tuple tree, no matter how large the tree is: it is simply the XOR of the ids of every tuple that has been created and/or acked in the tree.
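
Here is a toy sketch of this bookkeeping, just to show why one 64-bit number per spout tuple suffices; it models only the ack val arithmetic, not the acker's real message handling:

public class AckValSketch {
    // spout-tuple-id -> XOR of all tuple ids created and acked in its tree
    private final java.util.Map<Long, Long> ackVals =
            new java.util.HashMap<Long, Long>();

    // Called when the spout emits a new root tuple.
    public void init(long spoutTupleId) {
        ackVals.put(spoutTupleId, spoutTupleId);
    }

    // Called when a tuple in the tree is acked: XOR in the acked tuple's
    // own id plus the ids of any new tuples it anchored. Every id enters
    // the ack val once at creation and once at ack, so finished ids
    // cancel out; when the val reaches 0, the whole tree is done.
    public boolean ack(long spoutTupleId, long ackedId, long[] newChildIds) {
        long val = ackVals.get(spoutTupleId) ^ ackedId;
        for (long child : newChildIds) {
            val ^= child;
        }
        if (val == 0) {
            ackVals.remove(spoutTupleId);  // tree fully processed
            return true;
        }
        ackVals.put(spoutTupleId, val);
        return false;
    }
}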

When an acker task finds that an ack val has become 0, it knows the tree has been fully processed. Since tuple ids are random 64-bit numbers, the chance of an ack val reaching 0 by accident, rather than because every created tuple was acked, is vanishingly small: with 2^64 ≈ 1.8×10^19 possible values, even at 10,000 acks per second it would take on the order of 50 million years for a mistake to occur. And even then, data would only be lost if the tuple in that tree also happened to fail. For a detailed analysis of the acker's workflow, see the article: Twitter Storm source code analysis: the acker workflow.

Now that you understand Storm's reliability algorithm, let's walk through the possible failure scenarios and see how Storm avoids data loss in each case.

1. A tuple is not acked because the task processing it died: Storm's timeout mechanism marks the spout tuple at the root of the tree as failed once the timeout expires, so it can be replayed.

2. An acker task dies: all the spout tuples tracked by that acker will time out and be replayed.

3. A spout task dies: the message source that feeds the spout is responsible for replaying the messages. For example, Kestrel and RabbitMQ put all "in progress" messages back into the queue when a client disconnects.

As you can see, Storm's reliability mechanism is fully distributed, scalable, and highly fault-tolerant.

Tuning reliability

Acker tasks are lightweight, so a topology does not need many of them. You can monitor their performance through the Storm UI (they appear under component id -1). If their throughput looks abnormal, you need to add more ackers.

If reliability is not important to you, that is, you don't mind losing some tuples in failure scenarios, you can improve performance by not tracking tuple trees. Skipping the tracking roughly halves the number of messages in the system, since an ack message is normally sent for every tuple in a tree; it also means fewer root ids have to be carried in downstream tuples, which saves bandwidth.

There are three ways to turn off reliability. The first is to set Config.TOPOLOGY_ACKERS to 0. In this case, Storm calls the spout's ack method immediately after the spout emits a tuple, so the tuple tree is never tracked.
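
A sketch of the corresponding configuration:

Config conf = new Config();
conf.setNumAckers(0);  // no acker tasks; tuple trees are never tracked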

The second is to disable tracking per tuple: simply omit the messageId when emitting a spout tuple, and that particular spout tuple will not be tracked.
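
For example, emitting the earlier spout tuple without its messageId argument leaves it untracked:

_collector.emit(new Values("field1", "field2", 3));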

The last is to emit tuples unanchored when you don't care whether a particular subtree of a tuple tree succeeds. Such tuples are not part of any tuple tree, so they are not tracked.
