1 Basic concepts of Storm parallelism
A machine in a Storm cluster can run one or more worker processes, belonging to one or more topologies. Each worker process belongs to exactly one topology and runs one or more executor threads. An executor is a single thread. Each executor runs one or more tasks of the same component (spout or bolt), and a task performs the actual data processing. A practical example:
What: #worker processes
Description: How many worker processes to create for the topology across machines in the cluster.
Configuration option: Config#TOPOLOGY_WORKERS
How to set in your code (examples): Config#setNumWorkers

What: #executors (threads)
Description: How many executors to spawn per component.
Configuration option: ?
How to set in your code (examples): TopologyBuilder#setSpout() and TopologyBuilder#setBolt(). Note that as of Storm 0.8 the parallelism_hint parameter now specifies the initial number of executors, not tasks, for that bolt.

What: #tasks
Description: How many tasks to create per component.
Configuration option: Config#TOPOLOGY_TASKS
How to set in your code (examples): ComponentConfigurationDeclarer#setNumTasks()
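As a minimal sketch of the configuration options above (assuming the classic backtype.storm API of the Storm 0.8/0.9 era; the class name is hypothetical), the worker and task counts can also be set through the raw config keys:

import backtype.storm.Config;

public class RawConfigKeys {
    public static void main(String[] args) {
        Config conf = new Config();
        conf.setNumWorkers(2);                // helper for Config.TOPOLOGY_WORKERS
        conf.put(Config.TOPOLOGY_WORKERS, 2); // the same setting via the raw key
        conf.put(Config.TOPOLOGY_TASKS, 4);   // topology-wide default #tasks per component
        // There is no config key for #executors (the "?" in the table);
        // the parallelism hint in setSpout()/setBolt() is the only way to set it.
    }
}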
Here's an example code snippet to show these settings in practice (configuring the parallelism of a Storm bolt):
topologyBuilder.setBolt("green-bolt", new GreenBolt(), 2)
               .setNumTasks(4)
               .shuffleGrouping("blue-spout");
In the above code we configured Storm to run the bolt GreenBolt with an initial number of two executors and four associated tasks, so Storm will run two tasks per executor (thread). If you don't explicitly configure the number of tasks, Storm will run by default one task per executor.
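For context, here is a fuller sketch of where such a snippet typically lives (a hedged example, not the referenced article's exact code; BlueSpout and GreenBolt are placeholder component classes, and the classic backtype.storm API is assumed):

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;

public class ParallelismTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();

        // 2 initial executors (parallelism hint) for the spout
        builder.setSpout("blue-spout", new BlueSpout(), 2);

        // 2 initial executors and 4 tasks => 2 tasks per executor
        builder.setBolt("green-bolt", new GreenBolt(), 2)
               .setNumTasks(4)
               .shuffleGrouping("blue-spout");

        Config conf = new Config();
        conf.setNumWorkers(2); // 2 worker processes for this topology

        StormSubmitter.submitTopology("mytopology", conf, builder.createTopology());
    }
}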
Detailed reference: http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/
2 Whether to increase the number of workers
(1) It is best to use only one worker per topology on a machine; the main reason is to reduce data transfer between workers.
How many Workers should I use?
The total number of workers is set by the supervisors: there is some number of JVM slots each supervisor will superintend. The thing you set on the topology is how many worker slots it will try to claim.
There's no great reason to use more than one worker per topology per machine.
With one topology running on three 8-core nodes and a parallelism hint of 24, each bolt gets 8 executors per machine, i.e. one for each core. There are three big benefits to running three workers (with 8 assigned executors each) compared to running, say, twenty-four workers (with one assigned executor each).
First, data that is repartitioned (shuffles or group-bys) to executors in the same worker does not have to hit the transfer buffer; tuples are deposited directly from the send buffer into the receive buffer. That's a big win. By contrast, if the destination executor were on the same machine but in a different worker, the tuple would have to go through the send buffer, the worker transfer buffer, a local socket, the worker receive buffer, and the executor receive buffer. It doesn't hit the network card, but it's not as big a win as when executors are in the same worker.
Second, you're typically better off with three aggregators having very large backing caches than with twenty-four aggregators having small backing caches. This reduces the effect of skew and improves LRU efficiency.
Lastly, fewer workers reduce control-flow chatter.
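A hedged sketch of that layout (EventSpout, AggregatorBolt, and the stream names are placeholders; 24 executors spread over 3 workers gives the 8-per-machine arrangement described above):

import backtype.storm.Config;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;

public class ThreeWorkerLayout {
    public static void main(String[] args) {
        Config conf = new Config();
        conf.setNumWorkers(3); // one worker per 8-core node

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("event-spout", new EventSpout(), 3);
        // parallelism hint 24 over 3 workers => 8 executors per machine
        builder.setBolt("agg-bolt", new AggregatorBolt(), 24)
               .fieldsGrouping("event-spout", new Fields("key"));
        // submit builder.createTopology() with conf as usual
    }
}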
(2) Increasing the number of workers (each responsible for running one or more executors for one or more components) might also give you a performance benefit, but this is also relative, as I found from a discussion where Nathan Marz says:
Having more workers might give better performance, depending on where your bottleneck is. Each worker has a single thread that passes tuples on to the 0MQ connections for transfer to other workers, so if you're bottlenecked on CPU and each worker is dealing with lots of tuples, more workers will probably net you better throughput.
So basically there is no definite answer to this; you should try different configurations based on your environment and design.
3 Number of executors
The number of executors is the true (de facto) degree of parallelism; the number of tasks is the degree of parallelism you intend to be able to reach.
Initial number of executors = number of spouts + number of bolts + number of ackers (these add up to the number of tasks).
The numbers of spouts, bolts, and ackers do not change at runtime, but the number of executors can vary.
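For example (illustrative numbers only, not from the original article): a topology with one spout at parallelism hint 2, one bolt at parallelism hint 3, and the default 1 acker starts out with 2 + 3 + 1 = 6 executors; a later storm rebalance can change those executor counts, while the task counts fixed at submission time stay the same.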
4 Whether you need to increase the number of tasks
Tasks exist only to give a topology the flexibility to scale later; they do not by themselves determine the degree of parallelism.
Disclaimer: I wrote the article you referenced in your question above.
However I'm a bit confused by the concept of "task". Is a task a running instance of the component (spout or bolt)? Does an executor having multiple tasks mean the same component is executed multiple times by the executor, am I correct?
Yes, and yes.
Moreover, in a general parallelism sense, Storm will spawn a dedicated thread (executor) for a spout or bolt, so what can an executor (thread) having multiple tasks contribute to the parallelism?
Running more than one task per executor does not increase the level of parallelism: an executor always has one thread that it uses for all of its tasks, which means that tasks run serially on an executor.
As I wrote in the article, please note that: the number of executor threads can be changed after the topology has been started (see the storm rebalance command), whereas the number of tasks of a topology is static.
And by definition there is the invariant #executors <= #tasks.
So one reason for having two or more tasks per executor thread is to give you the flexibility to expand/scale up the topology through the storm rebalance command in the future without taking the topology offline. For instance, imagine you start out with a small Storm cluster but already know that next week another ten boxes will be added. Here you could opt for running the topology at the anticipated (higher) parallelism level already on the initial boxes, which is of course slower than running on the full cluster. Once the additional boxes are integrated you can then storm rebalance the topology to make full use of all boxes without any downtime.
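For example, the rebalance command documented by Storm lets you raise both the worker count and the per-component executor counts of a live topology (topology and component names here are just placeholders):

storm rebalance mytopology -n 5 -e blue-spout=3 -e yellow-bolt=10

This would grow the topology to 5 worker processes, 3 executors for blue-spout, and 10 executors for yellow-bolt.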
Another reason to run multiple tasks per executor is for (primarily functional) testing. For instance, if your dev machine or CI server is only powerful enough to run, say, 2 executors alongside all the other stuff running on the machine, you can still run 30 tasks (here: 15 per executor) to see whether code such as your custom Storm grouping is working as expected.
In practice we normally run 1 task per executor.
PS: Note that Storm will actually spawn a few more threads behind the scenes. For instance, each executor has its own "send thread" that is responsible for handling outgoing tuples. There are also "system-level" background threads, e.g. for acking tuples, that run alongside "your" threads. IIRC the Storm UI counts those acking threads in addition to "your" threads.
Transferred from: http://stackoverflow.com/questions/17257448/what-is-the-task-in-twitter-storm-parallelism
What is a running topology made of: worker processes, executors, and tasks
Storm distinguishes between the following 3 main entities that are used to actually run a topology in a Storm cluster: worker processes, executors (threads), and tasks.
The following diagram simply shows their relationship:
The 3 entities in the figure relate as follows: a machine in the Storm cluster may run multiple worker processes (possibly just 1) belonging to multiple topologies (possibly just 1). Each worker process runs executors of one particular topology. One or more executors may run inside a single worker process, and each executor is a thread spawned by its worker process. Each executor runs one or more tasks of the same component (spout or bolt), and a task performs the actual data processing.
A worker process executes a subset of a topology. A worker process belongs to one specific topology and runs one or more executors of one or more components (spouts or bolts) of that topology. A running topology consists of many such processes on many machines in the cluster.
An executor is a thread spawned by a worker process. It may run one or more tasks of the same component (spout or bolt).
A task performs the actual data processing; each spout or bolt that you implement in code corresponds to a number of tasks distributed across the cluster. Over the life cycle of a topology, the number of tasks of a component is always the same, but the number of executors (threads) of a component can change over time. This means the following condition always holds: #threads <= #tasks. By default the number of tasks equals the number of executors, i.e. Storm runs 1 task on every thread.
Configuring the parallelism of a topology
Note that Storm's own term "parallelism" is specifically used to describe the so-called parallelism hint, which sets the initial number of executors (threads) of a component. In this document we use "parallelism" in a more general sense: you can configure not only the number of executors but also the number of worker processes and the number of tasks of a topology. We will point it out explicitly wherever the narrow definition of parallelism is meant.
The following subsections give an overview of the different configuration options and how to set them in your code. There are several ways of setting these options, and the table lists some of them. Storm currently applies configuration settings with the following order of precedence: defaults.yaml < storm.yaml < topology-specific configuration < internal component-specific configuration < external component-specific configuration.
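A hedged sketch of this precedence in code (classic backtype.storm API; GreenBolt is a placeholder class), where the component-specific value overrides the topology-wide one:

import backtype.storm.Config;
import backtype.storm.topology.TopologyBuilder;

public class PrecedenceExample {
    public static void main(String[] args) {
        // topology-specific configuration (overrides storm.yaml / defaults.yaml)
        Config conf = new Config();
        conf.setNumWorkers(2);
        conf.setDebug(true);

        TopologyBuilder builder = new TopologyBuilder();
        // component-specific configuration overrides the topology-level value
        builder.setBolt("green-bolt", new GreenBolt(), 2)
               .addConfiguration(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 10)
               .shuffleGrouping("blue-spout");
    }
}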
Number of worker processes
Description: how many worker processes to create for the topology across the machines in the cluster.
Configuration option: Config#TOPOLOGY_WORKERS
How to set in code (example): Config#setNumWorkers