Storm 8: The degree of parallelism

Source: Internet
Author: User


(a) The degree of parallelism of a Storm topology can be set along the following 4 dimensions:
1. Node (server): the number of supervisor servers in the Storm cluster.
2. Worker (JVM process): the total number of worker processes for the whole topology; they are distributed evenly across the nodes.
3. Executor (thread): the total number of threads for a spout or bolt; they are distributed across the workers.
4. Task (spout/bolt instance): a task is an instance of a spout or bolt, and its nextTuple() or execute() method is called by an executor thread. Unless explicitly specified, Storm assigns one task to each executor. If more tasks than executors are set, a single thread holds more than one spout/bolt instance.
Note: The settings above are totals, which are distributed evenly across the hosts; they are not per-host process or thread counts. See the example below.
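
To make the "totals, not per-host counts" point concrete, here is a minimal sketch of an even split. It is not Storm's scheduler; the class and method names (EvenSplit, evenSplit) are hypothetical, and the numbers are illustrative only.

```java
import java.util.Arrays;

class EvenSplit {
    // Splits `total` units across `slots` as evenly as possible:
    // the first (total % slots) slots receive one extra unit.
    // This only illustrates the arithmetic; it is not Storm's scheduler.
    public static int[] evenSplit(int total, int slots) {
        if (slots <= 0) {
            throw new IllegalArgumentException("slots must be positive");
        }
        int[] counts = new int[slots];
        for (int i = 0; i < slots; i++) {
            counts[i] = total / slots + (i < total % slots ? 1 : 0);
        }
        return counts;
    }

    public static void main(String[] args) {
        // 7 workers requested for the whole topology, 3 supervisor nodes.
        System.out.println(Arrays.toString(evenSplit(7, 3)));  // [3, 2, 2]
        // 13 executors in total, 3 workers.
        System.out.println(Arrays.toString(evenSplit(13, 3))); // [5, 4, 4]
    }
}
```

The value you configure is always the first argument (the total); the per-host figure is a consequence of how many hosts or workers exist.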


(b) How to set the degree of parallelism
1. Node: buy machines and add them to the cluster ...
2. Worker: Config#setNumWorkers() or the configuration item TOPOLOGY_WORKERS
3. Executor: the parallelism hint passed to TopologyBuilder.setSpout() / .setBolt()
4. Task: ComponentConfigurationDeclarer#setNumTasks()

(c) Example:
        // 3. Create the topology
        TopologyBuilder builder = new TopologyBuilder();
        // Spout: 5 executors
        builder.setSpout("kafka-reader", new KafkaSpout(spoutConf), 5);
        // filter-bolt: 3 executors
        builder.setBolt("filter-bolt", new FilterBolt(), 3)
                .shuffleGrouping("kafka-reader");
        // log-splitter: 3 executors
        builder.setBolt("log-splitter", new LogSplitterBolt(), 3)
                .shuffleGrouping("filter-bolt");
        // hdfs-bolt: 2 executors
        builder.setBolt("hdfs-bolt", hdfsBolt, 2)
                .shuffleGrouping("log-splitter");

        // 4. Start the topology
        Config conf = new Config();
        conf.put(Config.NIMBUS_HOST, nimbusHost);
        conf.setNumWorkers(3); // total number of workers for the topology
        StormSubmitter.submitTopologyWithProgressBar(topologyName, conf,
                builder.createTopology());



1. The number of worker processes is set to 3 through conf.setNumWorkers(3). Assuming there are 3 nodes in the cluster, each node runs one worker.
2. The executor counts are:
spout: 5
filter-bolt: 3
log-splitter: 3
hdfs-bolt: 2
That is 13 executors in total, which are distributed across the workers.
Note: This code reads its messages from Kafka, and the number of Kafka partitions is set to 5, so the spout is given 5 threads.
3. This example does not set the number of tasks separately, so it uses the default of one task per executor. To set it explicitly:
        builder.setBolt("log-splitter", new LogSplitterBolt(), 3)
                .shuffleGrouping("filter-bolt").setNumTasks(5);
With this setting, the 5 tasks are assigned to the 3 executors.
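
One way to picture 5 tasks shared by 3 executors is the sketch below. It assumes contiguous, near-equal chunks; the actual assignment is internal to Storm and may differ in detail. The names TaskAssignment and assign are hypothetical, and task ids starting at 0 are an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class TaskAssignment {
    // Assigns task ids 0..numTasks-1 to executors in contiguous chunks,
    // giving the first (numTasks % numExecutors) executors one extra task.
    // Illustrative only; not taken from Storm's source.
    public static List<List<Integer>> assign(int numTasks, int numExecutors) {
        if (numExecutors <= 0) {
            throw new IllegalArgumentException("numExecutors must be positive");
        }
        List<List<Integer>> result = new ArrayList<>();
        int next = 0;
        for (int e = 0; e < numExecutors; e++) {
            int size = numTasks / numExecutors + (e < numTasks % numExecutors ? 1 : 0);
            List<Integer> tasks = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                tasks.add(next++);
            }
            result.add(tasks);
        }
        return result;
    }

    public static void main(String[] args) {
        // 5 tasks on 3 executors: two threads run 2 instances each, one runs 1.
        System.out.println(assign(5, 3)); // [[0, 1], [2, 3], [4]]
    }
}
```

Each inner list is the set of spout/bolt instances whose execute() calls run on one thread, so tasks in the same list are never executed concurrently with each other.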

(d) Dynamic adjustment of the degree of parallelism
There are 2 ways to adjust the degree of parallelism of a Storm topology:
1. Kill the topology, modify the code, recompile, and resubmit the topology
2. Dynamic adjustment
The first way is inconvenient: sometimes a topology cannot simply be killed on demand, and if you add a few machines, do you really want to kill every topology and change its code?
For this reason, Storm provides dynamic adjustment, which can be done in 2 ways:
1. UI: open the topology's page and click Rebalance; the topology status then shows as rebalancing. However, this only redistributes the existing processes and threads across the machines. It is suitable after adding or removing machines, but it cannot change the number of workers or executors.
2. CLI: storm rebalance
For example,
storm rebalance toponame -n 7 -e filter-bolt=6 -e hdfs-bolt=8
sets the number of workers of the topology to 7 and the executor counts of filter-bolt and hdfs-bolt to 6 and 8, respectively. Note that a component's task count is fixed for the lifetime of the topology and its executor count cannot exceed it, so such an increase only takes effect if enough tasks were configured with setNumTasks().
While the adjustment runs, the topology status shows as rebalancing. After it completes, the 3 machines run 3, 2, and 2 workers, respectively.
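
If rebalancing is scripted, the command line can be assembled from the desired settings. The sketch below only builds the string, using the -n and -e options shown above; the class and method names (RebalanceCommand, build) are hypothetical, and running the command is left to the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RebalanceCommand {
    // Builds a "storm rebalance" command line from a worker count and a map of
    // component name -> executor count. A LinkedHashMap keeps the option order stable.
    public static String build(String topology, int numWorkers, Map<String, Integer> executors) {
        StringBuilder sb = new StringBuilder("storm rebalance ")
                .append(topology)
                .append(" -n ").append(numWorkers);
        for (Map.Entry<String, Integer> e : executors.entrySet()) {
            sb.append(" -e ").append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> executors = new LinkedHashMap<>();
        executors.put("filter-bolt", 6);
        executors.put("hdfs-bolt", 8);
        // prints: storm rebalance toponame -n 7 -e filter-bolt=6 -e hdfs-bolt=8
        System.out.println(build("toponame", 7, executors));
    }
}
```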
