The components running in a Storm cluster and their parallelism

Source: Internet
Author: User
First, the components running in Storm

The power of Storm is that it scales computing power horizontally across a cluster with little effort: it divides the whole computation into separate tasks that run in parallel on the cluster. In Storm, a task is a spout or bolt instance running in the cluster. To make it easier to understand how Storm runs the tasks we give it in parallel, here are the four components involved when a topology runs in a cluster:

Nodes (machines): the machines in the cluster that work together to execute a topology.
Workers (JVMs): a worker is a separate JVM process. Each node can be configured to run one or more workers, and a topology can specify how many workers it should run on.
Executors (threads): an executor is a thread running inside a worker JVM. A worker process can run one or more executor threads. An executor can run multiple tasks; by default, Storm assigns one task to each executor.
Tasks (bolt/spout instances): tasks are instances of spouts and bolts, and they are run by executor threads.
Second, parallelism in Storm (taking WordCountTopology as an example)

We can adjust the degree of parallelism through configuration; if we set nothing, Storm defaults each parallelism setting to 1. Assuming we run WordCountTopology without any parallelism configuration, the topology executes as shown in the following figure: one node assigns a single worker to our topology, and that worker starts one executor thread for each task.
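The default layout described above can be sketched in plain Java. This is only an illustrative model and does not use the Storm API: the class DefaultLayout, the method defaultWorker, and the "#task-0" naming are hypothetical, and the component list simply reuses the three WordCountTopology components named in this article.

```java
import java.util.ArrayList;
import java.util.List;

public class DefaultLayout {
    /**
     * Returns the executors of the single default worker. Each inner list
     * stands for one executor thread and holds the names of the tasks it runs.
     * Hypothetical model, not part of Storm.
     */
    static List<List<String>> defaultWorker(List<String> components) {
        List<List<String>> executors = new ArrayList<>();
        for (String component : components) {
            List<String> tasks = new ArrayList<>();
            tasks.add(component + "#task-0"); // default: one task per component
            executors.add(tasks);             // default: one executor per task
        }
        return executors;
    }

    public static void main(String[] args) {
        List<String> components = new ArrayList<>();
        components.add("SentenceSpout");
        components.add("SplitSentenceBolt");
        components.add("WordCountBolt");
        // One worker, three executors, one task each.
        for (List<String> executor : defaultWorker(components)) {
            System.out.println("executor runs " + executor);
        }
    }
}
```

With nothing configured, the model yields one executor per component, each holding a single task, all inside one worker.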
2.1 Adding workers to a topology

One of the simplest ways to increase a topology's computing power is to add workers to it. Storm gives us two ways to do so: through the configuration file or in code.

Description: the number of worker processes to create for the topology across the cluster
Configuration option: TOPOLOGY_WORKERS
Set in code: Config#setNumWorkers

Setting the number of workers with the Config object:

    Config config = new Config();
    config.setNumWorkers(2); // run the topology on two worker processes
Note: in local mode, no matter how many workers are configured, there is only ever one worker JVM process.

2.2 Configuring executors and tasks

As mentioned earlier, by default Storm creates one task for each topology component, and each executor handles exactly one task. A task is an instance of a spout or bolt, and one executor thread can run more than one task. Tasks are what actually process the data: the spouts and bolts we write in code are executed as tasks distributed across the cluster. The number of tasks of a component generally stays constant for the whole run of a topology, but the number of executors of a component can change. This also means that threads <= tasks.

2.2.1 Setting the number of executors (threads)

Specify the number of executors of a component by setting its parallelism hint.

Description: the number of executors created per component
Set in code: TopologyBuilder#setSpout(), TopologyBuilder#setBolt()

Note that as of Storm 0.8 the parallelism_hint parameter specifies the initial number of executors (not tasks!) for that bolt.

Below we set the parallelism of SentenceSpout to 2. The spout component then has two executors, each assigned one task, and the topology runs as shown in the following figure:

    builder.setSpout(SENTENCE_SPOUT_ID, spout, 2);
2.2.2 Setting the number of tasks

The setNumTasks() method specifies the number of tasks for a component.

Description: the number of tasks to create per component
Configuration option: TOPOLOGY_TASKS
Set in code: ComponentConfigurationDeclarer#setNumTasks()

Below we give SplitSentenceBolt 4 tasks and 2 executors, so each executor thread runs 4 / 2 = 2 tasks. We then give WordCountBolt 4 executors and therefore, by default, 4 tasks, so each task runs in its own executor. The topology is shown in the following figure:
    // 2 executors sharing 4 tasks
    builder.setBolt(SPLIT_BOLT_ID, splitBolt, 2)
           .setNumTasks(4)
           .shuffleGrouping(SENTENCE_SPOUT_ID);
    // 4 executors, one task each
    builder.setBolt(COUNT_BOLT_ID, countBolt, 4)
           .fieldsGrouping(SPLIT_BOLT_ID, new Fields("word"));
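The 4 / 2 = 2 arithmetic above can be made concrete with a small plain-Java sketch. It does not use the Storm API: the class TaskAssignment and the method assign are hypothetical, and the round-robin order in which tasks are handed to executors is an assumption of this sketch, not a statement about how Storm's scheduler orders them. Rejecting more executors than tasks is likewise just this sketch's way of expressing threads <= tasks.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskAssignment {
    /**
     * Splits task ids 0..numTasks-1 across numExecutors executors as evenly
     * as possible. Round-robin order is an assumption of this sketch.
     */
    static List<List<Integer>> assign(int numTasks, int numExecutors) {
        if (numExecutors < 1 || numExecutors > numTasks) {
            // Mirrors the rule threads <= tasks from the text.
            throw new IllegalArgumentException("need 1 <= executors <= tasks");
        }
        List<List<Integer>> executors = new ArrayList<>();
        for (int e = 0; e < numExecutors; e++) {
            executors.add(new ArrayList<>());
        }
        for (int task = 0; task < numTasks; task++) {
            executors.get(task % numExecutors).add(task);
        }
        return executors;
    }

    public static void main(String[] args) {
        // SplitSentenceBolt: 4 tasks on 2 executors, two tasks per executor.
        System.out.println("SplitSentenceBolt: " + assign(4, 2)); // [[0, 2], [1, 3]]
        // WordCountBolt: 4 tasks on 4 executors, one task per executor.
        System.out.println("WordCountBolt:     " + assign(4, 4)); // [[0], [1], [2], [3]]
    }
}
```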
If we initially allocate 2 workers, the topology runs as shown in the following figure:

Third, an example of a running topology

The following illustration gives an overview of a running topology. The topology consists of three components: one spout, BlueSpout, and two bolts, GreenBolt and YellowBolt.
As shown above, we configured two worker processes, two spout executors, two GreenBolt executors, and six YellowBolt executors. That makes 2 + 2 + 6 = 10 executor threads in total, so when they are distributed across the cluster each worker process runs 10 / 2 = 5 executor threads. Here is the corresponding code:
    Config conf = new Config();
    conf.setNumWorkers(2); // use two worker processes

    topologyBuilder.setSpout("blue-spout", new BlueSpout(), 2); // set parallelism hint to 2

    topologyBuilder.setBolt("green-bolt", new GreenBolt(), 2)
                   .setNumTasks(4)
                   .shuffleGrouping("blue-spout");

    topologyBuilder.setBolt("yellow-bolt", new YellowBolt(), 6)
                   .shuffleGrouping("green-bolt");

    StormSubmitter.submitTopology("mytopology", conf, topologyBuilder.createTopology());
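The figure of five executors per worker can be checked with a short plain-Java calculation. This sketch does not call Storm: the class ExecutorsPerWorker and both methods are hypothetical, and the even split across workers (rounded up when the division is not exact) is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExecutorsPerWorker {
    /** Sums the parallelism hints, i.e. the total number of executor threads. */
    static int totalExecutors(Map<String, Integer> parallelismHints) {
        int total = 0;
        for (int hint : parallelismHints.values()) {
            total += hint;
        }
        return total;
    }

    /**
     * Executors per worker, assuming an even split. When the split is not
     * exact this returns the ceiling, i.e. the largest share a worker gets.
     */
    static int executorsPerWorker(int totalExecutors, int numWorkers) {
        return (totalExecutors + numWorkers - 1) / numWorkers;
    }

    public static void main(String[] args) {
        // Parallelism hints taken from the example topology.
        Map<String, Integer> hints = new LinkedHashMap<>();
        hints.put("blue-spout", 2);
        hints.put("green-bolt", 2);
        hints.put("yellow-bolt", 6);

        int total = totalExecutors(hints);
        // prints "10 executors / 2 workers = 5 per worker"
        System.out.println(total + " executors / 2 workers = "
                + executorsPerWorker(total, 2) + " per worker");
    }
}
```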
Storm also has a parameter that caps the parallelism of a topology: TOPOLOGY_MAX_TASK_PARALLELISM. It limits the maximum number of executors that can be created for a single component and is commonly used to limit the number of threads when testing a topology in local mode. It can also be set in code with Config#setMaxTaskParallelism().

Four, how to change the parallelism of a running topology

A nice feature of Storm is that the number of worker processes or executor threads can be adjusted while the topology is running, without restarting it. This mechanism is called rebalancing.

There are two ways to rebalance a topology:

Through the Storm web UI
Through the CLI tool storm

The following is an example of using the CLI tool:

    # Reconfigure the topology "mytopology" to use 5 worker processes,
    # the spout "blue-spout" to use 3 executors and
    # the bolt "yellow-bolt" to use 10 executors.
    $ storm rebalance mytopology -n 5 -e blue-spout=3 -e yellow-bolt=10
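If the rebalance command needs to be assembled from a program, for example in a deployment script, a sketch along the following lines may help. The class RebalanceCommand and its build method are hypothetical helpers, not part of Storm; the only command-line flags used are the -n and -e flags shown in the command above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RebalanceCommand {
    /**
     * Builds the argument list for "storm rebalance" with a new worker count
     * (-n) and new executor counts per component (-e component=count).
     */
    static List<String> build(String topology, int numWorkers, Map<String, Integer> executors) {
        List<String> cmd = new ArrayList<>();
        cmd.add("storm");
        cmd.add("rebalance");
        cmd.add(topology);
        cmd.add("-n");
        cmd.add(String.valueOf(numWorkers));
        for (Map.Entry<String, Integer> entry : executors.entrySet()) {
            cmd.add("-e");
            cmd.add(entry.getKey() + "=" + entry.getValue());
        }
        return cmd;
    }

    public static void main(String[] args) {
        Map<String, Integer> executors = new LinkedHashMap<>();
        executors.put("blue-spout", 3);
        executors.put("yellow-bolt", 10);
        // prints "storm rebalance mytopology -n 5 -e blue-spout=3 -e yellow-bolt=10"
        System.out.println(String.join(" ", build("mytopology", 5, executors)));
    }
}
```

The returned list could be handed to java.lang.ProcessBuilder to actually run the command on a machine where the storm client is installed; the sketch only prints it.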












