Storm concurrency in detail

Source: Internet
Author: User

Worker processes (Worker)

A worker is a process that runs the processing logic of spouts and bolts. A topology executes across one or more worker processes. Each worker process is a physical JVM and executes a subset of all the tasks of the topology. For example, if the combined parallelism of the topology is 300 and 50 workers are allocated, each worker executes 6 tasks. Storm tries to spread the tasks evenly across all workers.
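The arithmetic in this example can be sketched in plain Java. This is only an illustration of an even spread, not Storm's actual scheduler; the class and method names (`WorkerSpread`, `tasksPerWorker`) are hypothetical and do not exist in Storm.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Storm: models an even spread of tasks over workers.
class WorkerSpread {

    // Returns how many tasks each worker receives when tasks are dealt out
    // one at a time in round-robin order.
    static int[] tasksPerWorker(int totalTasks, int numWorkers) {
        if (numWorkers <= 0 || totalTasks < 0) {
            throw new IllegalArgumentException("need numWorkers > 0 and totalTasks >= 0");
        }
        int[] counts = new int[numWorkers];
        for (int task = 0; task < totalTasks; task++) {
            counts[task % numWorkers]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 300 tasks over 50 workers, the figures used in the text.
        int[] counts = tasksPerWorker(300, 50);
        System.out.println(counts.length + " workers, " + counts[0] + " tasks each");
        // An uneven case: the first worker gets one extra task.
        System.out.println(Arrays.toString(tasksPerWorker(7, 3)));
    }
}
```

With the figures from the text, every one of the 50 workers ends up with 6 tasks. When the task count is not a multiple of the worker count, this sketch hands the remainder to the first workers.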

Executors (Executor)

An executor is a physical thread spawned by a worker process. Each worker can contain more than one executor.

Tasks (Task)

A task is the object that performs the actual processing logic. By default, executors and tasks correspond one to one, that is, each executor runs one task.

The relationships between worker processes, executors, and tasks are as follows:

A node in a Storm cluster may run one or more worker processes for one or more topologies, and each worker process executes a subset of a topology. A worker process belongs to a specific topology and may run one or more executors for one or more components (spouts or bolts) of that topology. A running topology consists of many such processes running on many nodes within a Storm cluster.

One or more executors may run within a worker process. An executor is a thread spawned by the worker process, and it may run one or more tasks for the same component (spout or bolt).

Tasks perform the actual data processing, and each spout or bolt implemented in the code is executed as many tasks across the cluster. The number of tasks for a component stays the same throughout the lifetime of the topology, but the number of executors (threads) for a component can change over time. By default, the number of tasks equals the number of executors, that is, Storm runs one task per thread.
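The rule that the task count is fixed while the thread count may vary can be sketched as a small model. This is not Storm's API: `ComponentModel` and its methods are hypothetical names, and the sketch assumes that tasks are spread evenly over executors and that every executor runs at least one task.

```java
// Hypothetical model, not Storm's API: one component (spout or bolt) of a topology.
class ComponentModel {
    final String id;
    final int numTasks;        // fixed for the lifetime of the topology
    private int numExecutors;  // may change while the topology runs

    // Default case: the task count equals the executor count (one task per thread).
    ComponentModel(String id, int numExecutors) {
        this(id, numExecutors, numExecutors);
    }

    ComponentModel(String id, int numExecutors, int numTasks) {
        checkExecutors(numExecutors, numTasks);
        this.id = id;
        this.numExecutors = numExecutors;
        this.numTasks = numTasks;
    }

    // Assumption of this sketch: every executor runs at least one task.
    private static void checkExecutors(int numExecutors, int numTasks) {
        if (numExecutors < 1 || numExecutors > numTasks) {
            throw new IllegalArgumentException("need 1 <= executors <= tasks");
        }
    }

    int getNumExecutors() {
        return numExecutors;
    }

    // Changes the thread count; the task count is deliberately left untouched.
    void setNumExecutors(int numExecutors) {
        checkExecutors(numExecutors, numTasks);
        this.numExecutors = numExecutors;
    }

    // Tasks run by the executor at the given index, assuming an even spread.
    int tasksOnExecutor(int executorIndex) {
        if (executorIndex < 0 || executorIndex >= numExecutors) {
            throw new IndexOutOfBoundsException("no executor " + executorIndex);
        }
        int base = numTasks / numExecutors;
        int extra = numTasks % numExecutors;
        return executorIndex < extra ? base + 1 : base;
    }

    public static void main(String[] args) {
        ComponentModel bolt = new ComponentModel("some-bolt", 2, 4);
        System.out.println(bolt.tasksOnExecutor(0) + " tasks per executor");
        bolt.setNumExecutors(4); // more threads, same 4 tasks
        System.out.println(bolt.numTasks + " tasks, " + bolt.tasksOnExecutor(0) + " per executor");
    }
}
```

Changing the executor count only changes how the same tasks are shared among threads, which is why `numTasks` is declared `final` in this model.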

Configuring the degree of parallelism for a topology

1. Number of worker processes

The number of worker processes is how many worker processes are created for the topology across the nodes of the cluster.

Configuration option: TOPOLOGY_WORKERS

It can also be set through the Java API:

    Config#setNumWorkers

2. Number of executors (threads)

The number of executors is how many threads are spawned per component.

There is currently no configuration option for this. It can only be set through the Java API, by passing a parallelism hint to:

    TopologyBuilder#setSpout()
    TopologyBuilder#setBolt()

3. Number of tasks

The number of tasks is how many tasks are created per component.

Configuration option: TOPOLOGY_TASKS

It can also be configured through the Java API:

    ComponentConfigurationDeclarer#setNumTasks()
    T setNumTasks(java.lang.Number val)



Topology Example

Below we define a topology called mytopology that consists of three components: one spout (BlueSpout) and two bolts (GreenBolt and YellowBolt). The code is as follows:

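    Config conf = new Config();
    conf.setNumWorkers(2); // use two worker processes

    // BlueSpout, GreenBolt and YellowBolt are the user's own component classes.
    TopologyBuilder topologyBuilder = new TopologyBuilder();

    // Spout: parallelism hint 2, so 2 executors and (by default) 2 tasks.
    topologyBuilder.setSpout("blue-spout", new BlueSpout(), 2);

    // First bolt: 2 executors sharing 4 tasks.
    topologyBuilder.setBolt("green-bolt", new GreenBolt(), 2)
                   .setNumTasks(4)
                   .shuffleGrouping("blue-spout");

    // Second bolt: 6 executors and 6 tasks.
    topologyBuilder.setBolt("yellow-bolt", new YellowBolt(), 6)
                   .shuffleGrouping("green-bolt");

    StormSubmitter.submitTopology(
            "mytopology",
            conf,
            topologyBuilder.createTopology()
    );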

The mytopology topology is described as follows:

1. The topology uses two worker processes.

2. The spout is a BlueSpout instance with the ID "blue-spout" and a parallelism hint of 2 (producing two executors and two tasks).

3. The first bolt is a GreenBolt instance with the ID "green-bolt", a parallelism hint of 2, and a task count of 4 (producing two executors and four tasks). It uses shuffle grouping to receive the tuples emitted by "blue-spout".

4. The second bolt is a YellowBolt instance with the ID "yellow-bolt" and a parallelism hint of 6 (producing six executors and six tasks). It uses shuffle grouping to receive the tuples emitted by "green-bolt".


In summary, the topology has two worker processes, 2+2+6=10 executors, and 2+4+6=12 tasks. Each worker process is therefore assigned 10/2=5 executors and 12/2=6 tasks. By default an executor runs one task, but when the number of tasks is specified, the tasks are divided evenly among the executors, so each executor of the "green-bolt" component runs 4/2=2 tasks.
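These totals can be re-derived from the parallelism hints and task counts given in the topology definition. The following is a plain-Java sketch of that arithmetic only; `MyTopologyTotals` is a hypothetical helper and does not call Storm.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm: re-derives the totals for mytopology.
class MyTopologyTotals {
    static final int NUM_WORKERS = 2;

    // Component ID -> {executors, tasks}, taken from the example topology.
    static Map<String, int[]> components() {
        Map<String, int[]> c = new LinkedHashMap<>();
        c.put("blue-spout", new int[] {2, 2});   // tasks default to the executor count
        c.put("green-bolt", new int[] {2, 4});   // setNumTasks(4)
        c.put("yellow-bolt", new int[] {6, 6});  // tasks default to the executor count
        return c;
    }

    // Sums column 0 (executors) or column 1 (tasks) over all components.
    static int total(Map<String, int[]> components, int column) {
        int sum = 0;
        for (int[] row : components.values()) {
            sum += row[column];
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, int[]> c = components();
        int executors = total(c, 0);
        int tasks = total(c, 1);
        System.out.println("executors=" + executors + ", tasks=" + tasks);
        System.out.println("per worker: " + executors / NUM_WORKERS + " executors, "
                + tasks / NUM_WORKERS + " tasks");
        int[] green = c.get("green-bolt");
        System.out.println("tasks per green-bolt executor: " + green[1] / green[0]);
    }
}
```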

[Figure: the mytopology topology and its corresponding resource allocation]

Dynamically set the concurrency of a topology

Storm supports changing (increasing or decreasing) the number of worker processes and the number of executors at run time, without restarting the topology. This is called rebalancing. There are two ways to rebalance a topology:

1. Use the Storm web UI

2. Use the storm rebalance command (recommended)

Here's how to use the command line:

    # Reconfigure the topology "mytopology" to use 5 worker processes,
    # the spout "blue-spout" to use 3 executors, and
    # the bolt "yellow-bolt" to use 10 executors.
    storm rebalance mytopology -n 5 -e blue-spout=3 -e yellow-bolt=10

Note: "mytopology" is the name of the topology, and "blue-spout" and "yellow-bolt" are the IDs of the components.
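When the rebalance is triggered from a program rather than typed by hand, the command line can be assembled from the desired settings. The sketch below only builds the argument list and does not run anything; `RebalanceCommand` is a hypothetical helper, and it uses only the `-n` and `-e` options shown above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Storm: assembles a "storm rebalance" command line.
class RebalanceCommand {

    // topology: name of the running topology
    // numWorkers: new number of worker processes (-n)
    // executors: component ID -> new number of executors (one -e option each)
    static List<String> build(String topology, int numWorkers, Map<String, Integer> executors) {
        List<String> cmd = new ArrayList<>();
        cmd.add("storm");
        cmd.add("rebalance");
        cmd.add(topology);
        cmd.add("-n");
        cmd.add(String.valueOf(numWorkers));
        for (Map.Entry<String, Integer> e : executors.entrySet()) {
            cmd.add("-e");
            cmd.add(e.getKey() + "=" + e.getValue());
        }
        return cmd;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the components in insertion order.
        Map<String, Integer> executors = new LinkedHashMap<>();
        executors.put("blue-spout", 3);
        executors.put("yellow-bolt", 10);
        System.out.println(String.join(" ", build("mytopology", 5, executors)));
    }
}
```

The returned list could be handed to `java.lang.ProcessBuilder` to actually run the command, assuming the `storm` client is on the PATH of the machine where the program runs.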
