4.1 Introduction
Storm can ensure that every message emitted by a spout is fully processed. This chapter describes how Storm achieves this goal and explains how developers should use Storm's mechanisms to achieve reliable data processing.
4.2 Understanding what it means for a message to be fully processed
Consider the following topology, which reads sentences from a Kestrel queue and counts the words in them:
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("sentences", new KestrelSpout("kestrel.backtype.com", 22133, "sentence_queue", new StringScheme()));
builder.setBolt("split", new SplitSentence(), 10).shuffleGrouping("sentences");
builder.setBolt("count", new WordCount(), 20).fieldsGrouping("split", new Fields("word"));
A single message (tuple) emitted by a spout may cause hundreds or thousands of downstream messages (tuples) to be created from it.
Let's consider the streaming word-count example: each time, the Storm task reads a complete English sentence from the data source (the Kestrel queue), splits the sentence into individual words, and then emits, in real time, each word together with the number of times it has been seen. In this case, every message emitted by the spout (every English sentence) triggers the creation of many new messages: the words split out of the sentence are the newly created messages.
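For completeness, here is a minimal sketch of the counting bolt referenced in the topology above. The original text does not show this class, so the in-memory HashMap and the BaseBasicBolt base class (introduced later in this chapter) are assumptions based on the standard word-count pattern; imports from the Storm API are omitted as in the rest of the chapter.

public class WordCount extends BaseBasicBolt {
    // in-memory running counts; a real deployment might persist these elsewhere
    private Map<String, Integer> counts = new HashMap<String, Integer>();

    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String word = tuple.getString(0);
        Integer count = counts.get(word);
        count = (count == null) ? 1 : count + 1;
        counts.put(word, count);
        collector.emit(new Values(word, count));
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word", "count"));
    }
}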
These messages form a tree structure that we call a "tuple tree", which looks like Figure 1:
Figure 1 Example of a tuple tree
Under what conditions does Storm consider a message emitted by a spout to be fully processed? The answer is when both of the following conditions are met:
- the tuple tree no longer grows;
- every message in the tree has been acknowledged as processed.
If the tuple tree derived from a message is not fully processed within a specified period of time, the message is considered not fully processed. This timeout can be configured with the topology-level parameter Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS and defaults to 30 seconds.
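As a minimal sketch (the topology name and the value of 60 seconds are assumptions chosen only for illustration), the timeout can be changed when building the topology configuration:

Config conf = new Config();
conf.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 60);   // default is 30 seconds
StormSubmitter.submitTopology("word-count", conf, builder.createTopology());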
public static final String TOPOLOGY_MESSAGE_TIMEOUT_SECS
(The maximum amount of time given to the topology to fully process a message emitted by a spout. If the message is not acked within this time frame, Storm will fail the message on the spout. Some spout implementations will then replay the message at a later time.)
4.3 Life cycle of messages
What does Storm do when a message is fully processed, or when it fails to be fully processed? To figure this out, let's look at the life cycle of a message emitted by a spout. Here is the interface that a spout has to implement:
public interface ISpout extends Serializable {
    void open(Map conf, TopologyContext context, SpoutOutputCollector collector);
    void close();
    void nextTuple();
    void ack(Object msgId);
    void fail(Object msgId);
}
First, Storm requests a message (tuple) from the spout by calling the spout instance's nextTuple() method. In response, the spout uses the SpoutOutputCollector provided in the open method to emit one or more messages to its output stream. For each message emitted, the spout provides a message ID that is used to identify that message. Assuming we read messages from a Kestrel queue, the spout would set the ID that Kestrel assigns to the message as the message ID of the tuple. The message is emitted to the SpoutOutputCollector as follows:
_collector.emit(new Values("field1", "field2", 3), msgId);
Next, the emitted tuple is routed to the bolts that consume it, and Storm tracks the tuple tree that results. When Storm detects that a tuple tree has been completely processed, it invokes the ack method on the spout that produced it, passing the original message ID as a parameter. Likewise, if processing of a message times out, the spout's fail method is called for that message, with the message ID passed in as a parameter.
Note: a message is acked or failed only by the spout task that emitted it. If a spout is executing as many tasks across the cluster, a message is acknowledged (acked or failed) only by the task that created it, never by a different spout task.
Let's continue with the example of reading messages from the Kestrel queue to illustrate what a spout needs to do to guarantee message processing (assume the spout is named KestrelSpout).
First, a brief description of the Kestrel message queue: when KestrelSpout reads a message from the Kestrel queue, it "opens" the message. This means the message is not actually taken off the queue yet; instead, it is placed in a "pending" state, waiting for an acknowledgement from the client, and only once it is acknowledged is the message actually removed from the queue. Messages in the "pending" state are not sent to other consumers of the queue, and if a client disconnects unexpectedly, all messages "opened" by that client are put back on the queue. When a message is "opened", Kestrel also provides the client with a unique identifier for the message.
KestrelSpout uses this unique identifier as the message ID of the tuple. Later, when ack or fail is called, KestrelSpout sends the ack or fail together with the message ID back to the Kestrel queue, and Kestrel either takes the message off the queue (on ack) or puts it back (on fail).
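The pattern looks roughly like the sketch below. This is not the real KestrelSpout: the queue client and its openNext/confirm/abort methods are hypothetical stand-ins for any queue that supports an open/pending/acknowledge protocol.

public class QueueSpout extends BaseRichSpout {
    private SpoutOutputCollector _collector;
    private PendingQueueClient _queue;   // hypothetical client for a Kestrel-like queue

    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        _collector = collector;
        _queue = new PendingQueueClient("kestrel.backtype.com", 22133, "sentence_queue");
    }

    public void nextTuple() {
        PendingMessage msg = _queue.openNext();   // message moves to the "pending" state
        if (msg != null) {
            // the queue-assigned identifier becomes the tuple's message ID
            _collector.emit(new Values(msg.body()), msg.id());
        }
    }

    public void ack(Object msgId) {
        _queue.confirm(msgId);   // the message is removed from the queue for good
    }

    public void fail(Object msgId) {
        _queue.abort(msgId);     // the message goes back on the queue to be replayed
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("sentence"));
    }
}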
4.4 Storm's Reliability API
To use the reliable processing features that Storm provides, we need to do two things:
- Whenever a new node is added to the tuple tree, we must tell Storm explicitly (that is, tell Storm whenever we create a new link in the tree of tuples);
- Whenever we finish processing an individual message, we must tell Storm, so that it can update the state of the tuple tree.
With these two steps, Storm can detect when a tuple tree is fully processed and invoke the corresponding ack or fail method on the spout. Storm's API provides a concise way of doing both.
Specifying the parent of a new node in the tuple tree is called anchoring. Anchoring is done at the same time the new message is emitted. To make this easier to explain, we use the following code as an example. The bolt in this example splits a message (tuple) containing a whole sentence into a series of child messages, one tuple per word.
public class SplitSentence extends BaseRichBolt {
    OutputCollector _collector;

    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        _collector = collector;
    }

    public void execute(Tuple tuple) {
        String sentence = tuple.getString(0);
        for (String word : sentence.split(" ")) {
            _collector.emit(tuple, new Values(word));
        }
        _collector.ack(tuple);
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}
Look at the execute method: the first argument to emit is the input tuple and the second is the new output tuple, so the new output tuple is anchored to the input tuple. Because each "word tuple" is anchored to the "sentence tuple", if any of the words fails to be processed, the whole sentence will be reprocessed. For comparison, let's see what happens if we emit a new tuple with the following line of code.
_collector.emit(new Values(word));
A message emitted this way is not anchored. If processing fails somewhere downstream, the root message of the tuple tree will not be resent. Depending on the fault-tolerance requirements of the task, it is sometimes appropriate to emit an unanchored message: the new tuple is detached from the original tuple tree (unanchoring), so even if it fails, the sentence will not be reprocessed. Whether to anchor or not is entirely up to your business needs.
An output message can be anchored to one or more input tuples, which is useful when doing streaming joins or aggregations. When a multi-anchored tuple fails to be processed, all of the spout messages it is anchored to are replayed. Multiple anchoring is done by passing a list of input tuples to the emit method:
List<Tuple> anchors = new ArrayList<Tuple>();
anchors.add(tuple1);
anchors.add(tuple2);
_collector.emit(anchors, new Values(1, 2, 3));
Multiple anchoring adds the output tuple to more than one tuple tree.
Note: multiple anchoring can break the strict tree structure and produce a DAG (directed acyclic graph), as shown in Figure 2:
Figure 2 Diamond-shaped structure formed by multiple anchoring
Storm's implementation handles DAGs in the same way as trees.
Anchoring is how we construct the tuple tree. The last thing to do is to tell Storm when you have finished processing an individual tuple, using the ack and fail methods of the OutputCollector class. If you look back at the SplitSentence example, you can see that the sentence tuple is acked after all the word tuples have been emitted.
You can also call the OutputCollector's fail method to immediately fail the spout tuple at the root of the tree. For example, if your bolt queries a database and encounters an error, you can fail the input tuple right away so that it is reprocessed quickly, instead of waiting for the timeout to expire.
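A minimal sketch of that pattern is shown below; the database lookup helper and the emitted field are assumptions introduced only for illustration.

public void execute(Tuple tuple) {
    try {
        String enriched = lookupInDatabase(tuple.getString(0));   // hypothetical query helper
        _collector.emit(tuple, new Values(enriched));             // anchored to the input tuple
        _collector.ack(tuple);                                    // processing succeeded
    } catch (Exception e) {
        // fail immediately so the spout can replay without waiting for the timeout
        _collector.fail(tuple);
    }
}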
Every tuple you process must be either acked or failed, because Storm uses memory to track each tuple. If you do not ack or fail every tuple, you will eventually run into OutOfMemory errors.
Many bolts follow a common pattern: read an input tuple, emit the tuples derived from it, and ack the input tuple at the end of execute. Filters and simple processing functions are applications of this type. Storm has an IBasicBolt interface that encapsulates this pattern. The SplitSentence example can be rewritten using BaseBasicBolt:
public class SplitSentence extends BaseBasicBolt {
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String sentence = tuple.getString(0);
        for (String word : sentence.split(" ")) {
            collector.emit(new Values(word));
        }
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}
This version is a little simpler than the one before, but the functionality is the same. Messages emitted to the BasicOutputCollector are automatically anchored to the input message, and when execute finishes, the input message is automatically acked.
In many cases, however, a message needs a deferred ack, such as in aggregations or joins: the earlier input messages can only be acked once a result has been computed from a whole batch of them, and aggregations and joins usually multi-anchor their output messages as well. These cases go beyond what IBasicBolt can handle. A sketch of this deferred-ack pattern follows.
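The sketch below illustrates the idea under stated assumptions: a simple fixed-size batch, an assumed BATCH_SIZE constant, and input tuples carrying a single long value. It is an illustration of the pattern, not code from the original text.

public class SumBatchBolt extends BaseRichBolt {
    private static final int BATCH_SIZE = 100;    // assumed batch size
    private OutputCollector _collector;
    private List<Tuple> _pending = new ArrayList<Tuple>();
    private long _sum = 0;

    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        _collector = collector;
    }

    public void execute(Tuple tuple) {
        _pending.add(tuple);                       // buffer the input, do not ack yet
        _sum += tuple.getLong(0);
        if (_pending.size() >= BATCH_SIZE) {
            _collector.emit(_pending, new Values(_sum));   // multi-anchored to every buffered input
            for (Tuple t : _pending) {
                _collector.ack(t);                 // deferred acks, only after the result is emitted
            }
            _pending.clear();
            _sum = 0;
        }
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("sum"));
    }
}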
4.5 How does Storm achieve reliability efficiently?
The Storm system has a set of special tasks called "ackers" that are responsible for tracking every message in the DAG (directed acyclic graph) derived from each spout message. Whenever an acker finds that a DAG has been fully processed, it sends a signal to the spout task that created the root message. The parallelism of acker tasks in a topology can be set with the configuration parameter Config.TOPOLOGY_ACKERS. The default is 1; when the volume of messages in the system is large, the acker parallelism should be raised accordingly.
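For example (a minimal sketch; the value of 4 ackers is chosen arbitrarily for illustration):

Config conf = new Config();
conf.setNumAckers(4);   // same effect as conf.put(Config.TOPOLOGY_ACKERS, 4); the default is 1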
To understand Storm's reliability mechanism, let's study the life cycle of a message and how a tuple tree is managed. When a tuple is created, whether in a spout or in a bolt, the system assigns it a random 64-bit ID. Ackers use these random IDs to track the tuple tree derived from each spout message.
Every tuple knows the IDs of the root (spout) messages of the tuple trees it belongs to. Whenever a bolt emits a new tuple, the root message IDs of the tuple tree are copied into the new tuple. When a tuple is acked, it sends a message to the acker tracking that tree describing how the tree has changed. In effect it says: "this tuple in the tree has finished processing, and here are the new tuples that it anchored."
For example, suppose messages D and E are derived from message C; the following shows how the tuple tree changes when message C is acked.
Because D and E are added to the tree at the same moment C is removed from it, the tree can never be considered fully processed prematurely.
Let's look more closely at how Storm tracks tuple trees. As mentioned earlier, there can be any number of acker tasks in the system. So whenever a tuple is created or acked, how does it know which acker to notify?
Storm uses a hash of the spout message's ID to decide which acker tracks the tree derived from that message. Because every tuple knows the root message IDs it belongs to, it knows which acker task to communicate with.
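Conceptually, this is just consistent task selection by hashing, roughly as in the toy fragment below (an illustration of the idea only, not Storm's actual internals):

int numAckers = 4;                   // value of Config.TOPOLOGY_ACKERS
long rootId = 0x5f3a9c11d2e4b807L;   // made-up spout tuple id
// every task holding this rootId computes the same acker index
int ackerIndex = (int) ((rootId % numAckers + numAckers) % numAckers);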
When a spout emits a new message, it notifies the corresponding acker that a new root message has been created, and the acker starts tracking a new tuple tree. When the acker finds that the tree has been fully processed, it notifies the spout task that produced the root message.
How are tuple trees tracked? There can be millions of messages flowing through the system; if an explicit tree were built for every message emitted by a spout, memory would be exhausted very quickly, so a different strategy is needed. Storm's tracking algorithm needs only a fixed amount of memory (about 20 bytes) per spout tuple, regardless of how large the tree grows. This algorithm is central to Storm's correct operation and is one of its major breakthroughs.
An acker task stores a map from spout tuple ID to a pair of values. The first value is the task ID of the spout task that created the tuple, used to notify that task when the tree completes. The second value is a 64-bit number called the "ack val". The ack val represents the state of the entire tuple tree, no matter how big the tree is: it is simply the XOR of all tuple IDs that have been created and/or acked in the tree.
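The following toy fragment illustrates the XOR bookkeeping for the word-count example (the 64-bit IDs are made up; this is an illustration of the idea, not Storm's acker code). Each tuple ID is XORed into the ack val twice, once when the tuple is created and once when it is acked, so a fully processed tree always brings the value back to 0.

long ackVal = 0;

long sentenceId = 0x5f3a9c11d2e4b807L;    // spout ("sentence") tuple id
long wordId1    = 0x19d7e6aa03c45f21L;    // ids of the two word tuples it anchors
long wordId2    = 0x7b02c9e8f1d63a54L;

ackVal ^= sentenceId;                      // spout emits the sentence tuple
ackVal ^= sentenceId ^ wordId1 ^ wordId2;  // split bolt acks the sentence and anchors two word tuples
ackVal ^= wordId1;                         // count bolt acks the first word tuple
ackVal ^= wordId2;                         // count bolt acks the second word tuple

System.out.println(ackVal == 0);           // prints true: the tuple tree is fully processed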
When an acker task sees that an ack val has become 0, it knows that the tuple tree is completely processed. Because tuple IDs are random 64-bit numbers, the chance of an ack val accidentally reaching 0 before all created tuples have actually completed is vanishingly small: the probability that a random 64-bit value is zero is about 5.4 × 10^-20, so even at 10,000 acks per second it would take on the order of 50 million years before a mistake is made. And even then, it would only cause data loss if the tuple in that tree also happened to fail. For a detailed analysis of the acker's workflow, see the article on the acker workflow in the Twitter Storm source code analysis series.
4.7 Fault tolerance at all levels of the cluster
So far, you have seen Storm's reliability mechanisms and how to choose different levels of reliability to meet your needs. Next, let's look at how Storm ensures that data is not lost under different kinds of failures.
4.7.1 Task level failure
- A tuple is not acked because the task processing it has died: Storm's timeout mechanism marks the tuple as failed after the timeout expires, so it can be reprocessed.
- Acker task failure: if an acker task itself fails, all the messages it was tracking fail due to the timeout, and the spout's fail method is called for them.
- Spout task failure: in this case, the external system the spout reads from (such as a message queue) is responsible for the integrity of the messages. For example, Kestrel and RabbitMQ put all "in-flight" (pending) messages back on the queue after a client disconnects.
4.7.2 Task slot (slot) failure
- Worker failure. Each worker contains several bolt (or spout) tasks. The supervisor monitors these workers, and when a worker fails, the supervisor attempts to restart it on the local machine.
- Supervisor failure. The supervisor is stateless, so its failure does not affect the currently running tasks, as long as it is restarted promptly. The supervisor does not restart itself and requires external monitoring to restart it in a timely manner.
- Nimbus failure. Nimbus is stateless, so its failure does not affect the currently running tasks (although new topologies cannot be submitted while Nimbus is down), as long as it is restarted promptly. Nimbus does not restart itself and requires external monitoring to restart it in a timely manner.
4.7.3 Cluster node (machine) failure
- Node failure in the Storm cluster: Nimbus reassigns all tasks that were running on that machine to other available machines.
- Node failure in the Zookeeper cluster: ZooKeeper remains operational as long as fewer than half of its machines are down, provided that failed machines are repaired in a timely manner.
4.8 Adjusting the reliability level
Acker tasks are lightweight, so a topology does not need many of them. You can observe the throughput of the acker tasks through the Storm UI; if it looks insufficient, add more ackers.
If you do not require every message to be processed (that is, you can tolerate losing some messages), you can turn off reliable message processing and gain better performance. Turning off reliable processing roughly halves the number of messages in the system (no ack message is needed for each tuple). It also reduces the size of each message (tuples no longer need to carry their root IDs), which saves bandwidth.
There are three ways to turn off reliable message processing (a combined sketch follows the list):
- Set the parameter Config.TOPOLOGY_ACKERS to 0. With this setting, Storm calls a spout's ack method immediately after the spout emits a message;
- The second method is to omit the message ID when the spout emits a message. Use this when you want to turn off reliability for specific messages;
- Finally, if you do not care about the reliability of the descendant messages derived from a message, do not anchor them when emitting them, that is, do not pass the input tuple to the emit method. Because these descendant messages are not anchored to any tuple tree, their failure will not cause any spout to resend a message.
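A minimal sketch of the three options described above (the variable names are illustrative):

// 1) Topology-wide: run no acker tasks at all
Config conf = new Config();
conf.setNumAckers(0);                      // equivalent to setting Config.TOPOLOGY_ACKERS to 0

// 2) Per spout tuple: emit without a message ID, so Storm does not track its tree
_collector.emit(new Values(sentence));     // in a spout; no msgId argument

// 3) Per downstream tuple: emit unanchored inside a bolt
_collector.emit(new Values(word));         // no input tuple passed as the anchor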
4.9 Summary
This chapter described how a Storm cluster processes data reliably. With its innovative tuple tree tracking technique, Storm uses an efficient acknowledgement mechanism to ensure that data is not lost.
Apart from Nimbus, there is no single point of failure in a Storm cluster: any node can fail without data being lost. Nimbus is designed to be stateless, so as long as it can be restarted in a timely manner, its failure will not affect running tasks.