Storm concurrency model and ACK mechanism


Viewed from the largest unit down to the smallest, Storm has five levels. The simplest is the cluster: a cluster is one level. The second level, with a fairly clear meaning, is the supervisor: a supervisor corresponds to a host, i.e. a node, a machine. Each machine then runs a number of workers, and a worker corresponds to a process; if four workers are configured, four processes run on that machine. Inside each worker run one or more executors, and an executor corresponds to a thread. Each thread in turn runs one or more tasks, usually just one, although sometimes several. The task is the lowest-level object: it is the object the thread actually operates on, normally in a one-to-one relationship with the executor.

Concurrency model
– Storm has several levels of parallelism. From the cluster's perspective: how many nodes (machines) the cluster has, and how many worker processes run on each node.
– From a topology's perspective: how many workers the topology runs, how many executors each worker runs, and how many tasks each executor runs (one or more). A task is the object that corresponds to a spout or bolt.




Parallelism in Storm essentially means the number of executors. So where are the number of workers, the number of executors, and the number of tasks set?
By default the number of tasks equals the parallelism, i.e. one task per executor, but you can use setNumTasks() to set a different task count. It can only be larger, never smaller; the green bolt in the figure above is an example of this.
In summary: the number of workers is whatever the topology configures; the number of executors is whatever the parallelism hint is set to; if no task count is set, each executor gets exactly one task; and if a task count is set, each executor runs the configured number of tasks divided by the number of executors.
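As a concrete illustration, here is a minimal sketch (not from the original text) of how these three knobs are set when building a topology. The class names MySpout and GreenBolt are placeholders for user-defined spout and bolt classes, and the package names follow Storm 1.x and later (older releases use backtype.storm instead of org.apache.storm):

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class ParallelismExample {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();

        // parallelism hint = 2, so the spout gets 2 executors
        builder.setSpout("blue-spout", new MySpout(), 2);

        // 2 executors but 4 tasks, so each executor runs 2 tasks
        builder.setBolt("green-bolt", new GreenBolt(), 2)
               .setNumTasks(4)
               .shuffleGrouping("blue-spout");

        Config conf = new Config();
        conf.setNumWorkers(2);  // 2 worker processes for the whole topology

        StormSubmitter.submitTopology("parallelism-demo", conf, builder.createTopology());
    }
}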


Data transfer between workers

1. An executor calls spout.nextTuple()/bolt.execute() and emits tuples into the executor's own transfer queue.
2. The executor's transfer thread moves the tuples from its own transfer queue into the worker's transfer queue.
3. The worker's transfer thread serializes the tuples in the worker transfer queue and sends them to the remote worker (a simplified sketch of steps 1-3 appears after this list).
4. The remote worker's receiver thread receives the data from the network, deserializes it into tuples, and puts them into the corresponding executor's receive queue.
5. The executor's receive thread takes tuples out of its own receive queue and calls bolt.execute().
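The following is not Storm source code, just a minimal Java sketch of the two-level hand-off described in steps 1-3: an executor-local queue drained by a transfer thread into a shared worker-level queue, which a second thread drains for sending. In real Storm these are high-performance Disruptor queues and the send goes over the network rather than to standard output.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueHandoffSketch {
    public static void main(String[] args) throws InterruptedException {
        // executor-level transfer queue (one per executor in this analogy)
        BlockingQueue<String> executorTransfer = new ArrayBlockingQueue<>(1024);
        // worker-level transfer queue (shared by all executors in the worker)
        BlockingQueue<String> workerTransfer = new ArrayBlockingQueue<>(1024);

        // step 2: the executor transfer thread moves tuples to the worker queue
        Thread executorTransferThread = new Thread(() -> {
            try {
                while (true) {
                    workerTransfer.put(executorTransfer.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // step 3: the worker transfer thread "serializes and sends" to a remote worker
        Thread workerTransferThread = new Thread(() -> {
            try {
                while (true) {
                    String tuple = workerTransfer.take();
                    System.out.println("send over network: " + tuple);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        executorTransferThread.setDaemon(true);
        workerTransferThread.setDaemon(true);
        executorTransferThread.start();
        workerTransferThread.start();

        // step 1: the executor "emits" a few tuples
        for (int i = 0; i < 3; i++) {
            executorTransfer.put("tuple-" + i);
        }
        Thread.sleep(200);  // give the daemon threads time to drain the queues
    }
}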


Why is data transfer so important? Because all tuple transfer in a topology, from spout to bolt and from bolt to bolt, is done by the workers.
In step 3 above, the worker transfer thread takes data from the worker transfer queue, makes a simple check, serializes it, and sends it over the network to the worker that hosts the target task. How does it know which worker the target task is in? The answer is ZooKeeper: every worker writes its state information to ZK, and that state records which executors and tasks the worker contains. The grouping determines which task a tuple should be sent to; from the task we can look up which worker hosts it, and from the worker which machine and port, so we know where to send the tuple.
In step 4, how does the receiving worker know which task the data should go to? A worker may contain multiple executors and multiple tasks, so two things are written before serialization: the task id, and the tuple data itself.
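Purely as an illustration of that framing (Storm's real messaging layer has its own message structure and uses Kryo serialization, so this is only a hedged sketch), a message addressed to a task is essentially a task id plus the serialized tuple bytes:

import java.nio.ByteBuffer;

// Illustrative only: a tuple on the wire is addressed by prefixing it with the target task id.
public class TaskAddressedMessage {
    final int taskId;
    final byte[] payload;

    TaskAddressedMessage(int taskId, byte[] payload) {
        this.taskId = taskId;
        this.payload = payload;
    }

    // frame as [taskId][payload] before sending
    byte[] toBytes() {
        return ByteBuffer.allocate(4 + payload.length)
                         .putInt(taskId)
                         .put(payload)
                         .array();
    }

    // on the receiving side, read the task id first to pick the right executor receive queue
    static TaskAddressedMessage fromBytes(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int taskId = buf.getInt();
        byte[] payload = new byte[buf.remaining()];
        buf.get(payload);
        return new TaskAddressedMessage(taskId, payload);
    }
}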

When tasks communicate within the same worker, there is no need to go through the network; this is an optimization. When the transfer thread discovers that the target task is local to the same worker, it puts the data directly into that executor's receive queue and is done, avoiding an unnecessary network transmission. In the extreme case, if your topology has only one worker, there is no network transmission at all: everything is done through in-memory queues.


Statelessness
Storm's daemons are stateless: the state of Nimbus/supervisors/workers lives in ZooKeeper, a little worker information lives in the local file system, and the topology jar packages distributed by Nimbus also live on the local file system. The advantage is that even if Nimbus dies it can be recovered quickly: recovery simply means starting a new instance and reading the corresponding data back from ZK. This is why Storm is so stable and robust. ZK effectively decouples the modules, so any module can keep doing its job after another one dies. For example, if a supervisor dies, its workers keep working as usual, because there is no direct dependency between the supervisor and the workers; all state is passed through ZooKeeper or local files.

Single Center
Storm also uses a master/slave architecture: the master is Nimbus and the slaves are the supervisors. The advantage of a single center is that scheduling the whole topology becomes very simple, because there is only one decision maker and no arbitration is needed.
A single master also brings the usual single-center problem: a single point of failure. In practice the impact is not that large; as long as no new topology is submitted, the topologies already running in the cluster are not affected. In the extreme case, though, if some supervisor machines die while Nimbus is down, no re-scheduling happens, and that does have some impact.

The other single-point problem is performance. Topology jar packages are distributed through Nimbus: the client submits the jar to Nimbus, and every supervisor then downloads it from Nimbus. When the file is relatively large and there are many supervisor machines, Nimbus becomes a bottleneck and its network card quickly saturates. Storm has ways to mitigate this, such as distributing the files peer-to-peer.


Good isolation
The real work is done by executors and workers; Nimbus and the supervisors only play a controlling role in the topology's lifecycle. All data flows through executors and workers, and all data transmission happens between workers, with no dependency on Nimbus or the supervisors, especially not on the supervisors. This design is very successful. In Hadoop (MapReduce v1), all shuffle data goes through the TaskTracker, so once a TaskTracker process dies, the map output on it has to be recomputed, because nobody is left to serve the data. Storm is different: all data transmission happens inside the workers, so even if a supervisor dies, data keeps flowing between workers and nothing is affected. Even more remarkably, if Nimbus and all the supervisors die, the workers can keep working. That is unthinkable in Hadoop: if all the master/slave daemons die, would the job still run?


The ACK mechanism of Storm
– Tuple tree
– In a Storm topology, the spout emits a tuple (the source tuple) as a message through SpoutOutputCollector.emit(). When the topology defines multiple bolts to process it, one or more new tuples may be generated. The source tuple and the newly created tuples form a tuple tree. A tuple counts as fully processed only when the entire tree has been processed; if processing fails or times out at any node of the tree, the whole tree fails.
– The default message timeout in Storm is 30 seconds; see topology.message.timeout.secs: 30 in defaults.yaml. You can also specify the timeout with Config.setMessageTimeoutSecs() when defining the topology.

– Storm's so-called message reliability means that Storm guarantees every tuple is fully processed by the topology, and the result of processing is either success or failure. There are two possible causes of failure: a node fails during processing, or processing times out.
– Storm bolts come in two flavors, BasicBolt and RichBolt. With a BasicBolt, the BasicOutputCollector automatically anchors emitted data to the input tuple, and at the end of the execute() method the input tuple is acked automatically (under certain conditions).
– To get acking with a RichBolt, you must explicitly anchor emitted data to the source tuple, i.e. collector.emit(oldTuple, newTuple), and you must ack the source tuple after execute() succeeds (manual ack: this._collector.ack(input)). A minimal sketch follows below.
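Here is a minimal RichBolt sketch showing the anchored emit and the manual ack/fail. The class name, the "word" field, and the uppercasing are illustrative only; the imports follow the Storm 1.x-style API (older releases use backtype.storm packages):

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

import java.util.Map;

public class AnchoredBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        try {
            String word = input.getStringByField("word");
            // anchored emit: the new tuple becomes a child of `input` in the tuple tree
            collector.emit(input, new Values(word.toUpperCase()));
            // manual ack of the source tuple after successful processing
            collector.ack(input);
        } catch (Exception e) {
            // mark the tuple as failed so the spout can replay it
            collector.fail(input);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("upper"));
    }
}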

– If reliability is not important to you, that is, you do not mind losing some data when failures occur, you can get better performance by not tracking these tuple trees.
There are three ways to remove reliability (a brief code sketch follows the list):
– The first is through the Config.TOPOLOGY_ACKER_EXECUTORS configuration:

conf.put(Config.TOPOLOGY_ACKER_EXECUTORS, 0);
Set it to 0. In this case, Storm calls the spout's ack() method immediately after the spout emits a tuple, which means the tuple tree is not tracked.
– The second method is to remove reliability at the tuple level: do not specify a messageId when emitting a tuple from the spout, so that particular spout tuple is not tracked.
– The last method: if you do not care whether a certain part of a tuple tree succeeds, you can emit those tuples unanchored. They are then not part of the tuple tree and will not be tracked.
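What those three options look like in code, as a hedged sketch (only option 1 is a complete runnable snippet; options 2 and 3 are shown as comments because they live inside a spout's nextTuple() and a bolt's execute() respectively):

import org.apache.storm.Config;

public class ReliabilityOptions {
    public static void main(String[] args) {
        // Option 1: turn off acker executors for the whole topology
        Config conf = new Config();
        conf.setNumAckers(0);  // equivalent to conf.put(Config.TOPOLOGY_ACKER_EXECUTORS, 0)
        System.out.println(conf.get(Config.TOPOLOGY_ACKER_EXECUTORS));

        // Option 2 (inside a spout's nextTuple()): emit without a messageId
        //   collector.emit(new Values(word));              // tuple is not tracked
        // instead of
        //   collector.emit(new Values(word), messageId);   // tuple is tracked

        // Option 3 (inside a bolt's execute(Tuple input)): emit without anchoring
        //   collector.emit(new Values(word));              // not part of the tuple tree
        // instead of
        //   collector.emit(input, new Values(word));       // anchored to the input tuple
    }
}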

The acker's tracking algorithm is one of Storm's major breakthroughs: for any tuple tree, however large, it needs only a constant 20 bytes or so to track it.
– The principle of the acker's tracking algorithm: the acker keeps an ack-val checksum for each spout tuple, initially 0. Every time a tuple is emitted or acked, that tuple's id is XORed into the checksum, and the result becomes the new ack-val. If every emitted tuple ends up being acked, the final ack-val must be 0. The acker judges whether a spout tuple has been fully processed by checking whether ack-val is 0; if it is, the tree is considered fully processed.
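A tiny sketch, purely to illustrate why the checksum returns to zero when every emitted tuple is also acked; the ids below are made up, while real Storm uses random 64-bit ids:

public class AckValSketch {
    public static void main(String[] args) {
        long ackVal = 0L;                                    // initial checksum for one spout tuple
        long[] tupleIds = {0x9f3a11L, 0x51c702L, 0x0be493L}; // made-up tuple ids

        // each tuple id is XORed in once when the tuple is emitted...
        for (long id : tupleIds) {
            ackVal ^= id;
        }
        // ...and once more when it is acked
        for (long id : tupleIds) {
            ackVal ^= id;
        }

        // every id appears an even number of times, so the XOR cancels out
        System.out.println(ackVal == 0);                     // prints: true
    }
}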
