Twitter Storm series: Storm environment configuration and throughput test tuning (personal understanding)

Source: Internet
Author: User
1. Hardware configuration information

6 servers, each with 2 CPUs (6 cores per CPU), 24 threads, and 96 GB of RAM

2. Cluster information

Storm cluster: 1 Nimbus, 6 Supervisors

Nimbus: 192.168.7.127

Supervisors:

192.168.7.128

192.168.7.129

192.168.7.130

192.168.7.131

192.168.7.132

192.168.7.133

Zookeeper cluster:

3 nodes

192.168.7.127:2181, 192.168.7.128:2181, 192.168.7.129:2181

Kafka cluster:

7 nodes

192.168.7.127:9092

192.168.7.128:9092

192.168.7.129:9092

192.168.7.130:9092

192.168.7.131:9092

192.168.7.132:9092

192.168.7.133:9092

3. Configuration relationships

The following can be worked out from the hardware configuration of the servers:

1. Workers and slots correspond one to one: each worker occupies one slot. The number of workers and slots in a cluster is generally derived from the number of CPU threads per server.

For the environment above:

Workers, slots: 144 (6 supervisors, each with 24 CPU threads, so 24*6=144)
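
The slot count above can be reproduced with a short calculation. The sketch below is only illustrative: the class and method names are hypothetical, and it assumes that each supervisor exposes one slot per CPU thread by listing one port per slot under supervisor.slots.ports in storm.yaml, starting at port 6700.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: derive the cluster's slot budget from the hardware listed above.
class SlotPlan {
    // One slot per CPU thread on each supervisor (the rule of thumb used in this article).
    static int totalSlots(int supervisors, int threadsPerSupervisor) {
        return supervisors * threadsPerSupervisor;
    }

    // Ports one supervisor would list under supervisor.slots.ports.
    // The base port is an assumption; any free port range would do.
    static List<Integer> slotPorts(int basePort, int slots) {
        List<Integer> ports = new ArrayList<>();
        for (int i = 0; i < slots; i++) {
            ports.add(basePort + i);
        }
        return ports;
    }

    public static void main(String[] args) {
        System.out.println("Total slots: " + totalSlots(6, 24));
        List<Integer> ports = slotPorts(6700, 24);
        System.out.println("Ports per supervisor: " + ports.get(0)
                + " to " + ports.get(ports.size() - 1));
    }
}
```

With 6 supervisors and 24 threads each, this gives the 144 slots quoted above, with each supervisor listing 24 ports.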

2. Spout parallelism, that is, the last argument to setSpout: builder.setSpout("words", new KafkaSpout(kafkaConfig), 10);

In my tests, Kafka feeds the data into Storm. Kafka topics are divided into partitions, and the number of spout threads is chosen according to the number of partitions in the Kafka topic, typically in a 1:1 ratio: if the topic has 18 partitions, spout parallelism can be set to 18. It can be slightly higher than this, but not by much.

How many Kafka partitions you need is something to determine by testing against your own requirements.
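
To see why a 1:1 ratio is the usual starting point, it helps to model how partitions are spread over spout tasks. The sketch below is a simplified model, not the Kafka spout's actual code: it assumes a round-robin assignment in which task i reads partitions i, i + tasks, and so on, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: how topic partitions might be spread over spout tasks.
// Assumption: round-robin assignment (task i reads partitions i, i + tasks, ...).
class PartitionAssignment {
    // Returns, for each spout task, the list of partitions it would read.
    static List<List<Integer>> assign(int partitions, int spoutTasks) {
        List<List<Integer>> byTask = new ArrayList<>();
        for (int task = 0; task < spoutTasks; task++) {
            List<Integer> mine = new ArrayList<>();
            for (int p = task; p < partitions; p += spoutTasks) {
                mine.add(p);
            }
            byTask.add(mine);
        }
        return byTask;
    }

    // Number of spout tasks that end up with no partition to read.
    static int idleTasks(int partitions, int spoutTasks) {
        int idle = 0;
        for (List<Integer> mine : assign(partitions, spoutTasks)) {
            if (mine.isEmpty()) {
                idle++;
            }
        }
        return idle;
    }

    public static void main(String[] args) {
        System.out.println("18 partitions, 18 tasks, idle tasks: " + idleTasks(18, 18));
        System.out.println("18 partitions, 20 tasks, idle tasks: " + idleTasks(18, 20));
        System.out.println("18 partitions, 9 tasks, task 0 reads: " + assign(18, 9).get(0));
    }
}
```

Under this assumed model, every task has exactly one partition at 18:18, tasks beyond the partition count have nothing to read, and fewer tasks than partitions means each task reads several partitions.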

3. Bolt parallelism, the last argument to setBolt: builder.setBolt("words", new KafkaBolt(), 10);

Bolt parallelism determines processing efficiency. With a parallelism of 1, processing may be very slow under large data volumes; setting it too high is not good either, because it may waste resources.

The specific value has to be determined by testing.
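
Before testing, a rough first guess for bolt parallelism can be made from the expected input rate and the average time a bolt spends on one tuple. The sketch below is a back-of-the-envelope estimate only: it assumes each bolt executor handles one tuple at a time at a steady average latency, the class and method names are hypothetical, and the numbers in main are made up rather than taken from the tests in this article.

```java
// Sketch: a first guess at bolt parallelism, to be refined by testing.
class BoltParallelism {
    // Executors needed so that their combined capacity covers the input rate.
    // Integer arithmetic (microseconds) avoids floating-point rounding surprises.
    static int estimate(long tuplesPerSecond, long avgMicrosPerTuple) {
        long busyMicrosPerSecond = tuplesPerSecond * avgMicrosPerTuple;
        long executors = (busyMicrosPerSecond + 999_999L) / 1_000_000L; // round up
        return (int) Math.max(1L, executors);
    }

    public static void main(String[] args) {
        // Hypothetical inputs: 50,000 tuples/s at 100, 250 and 10 microseconds per tuple.
        System.out.println(estimate(50_000L, 100L));
        System.out.println(estimate(50_000L, 250L));
        System.out.println(estimate(50_000L, 10L));
    }
}
```

An estimate like this only narrows the range to try; the value that works in practice still has to come from measurement.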

4. Throughput test (only some of the scenarios are listed below; see the attachment for all test data)

Test Scenario 1:

Partition:20

Worker:10

Spout:20

Bolt:1

Test results:

Test Scenario 2:

Partition:20

Worker:20

Spout:20

Bolt:1

Test results:


Test Scenario 3: (the data generators run on hosts 192.168.7.128-132, each program 100 out-of-the-box, and use some resources themselves, so actual results may be better than the test results)

Topic:5

Partition:20

Spout:20

Worker:20

Bolt:1

Test results:


Summary results:

5 topics, 20 partitions each, 20*5 workers, 20*5 spouts, 1*5 bolts

Total throughput = 5.04 + 4.02 + 5.76 + 6.31 + 4.99 = 26.12 (in units of 10,000 per second)

That is a throughput of 261,200 per second

Daily throughput of nearly 22.6 billion
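
The summary figures can be checked with a few lines of arithmetic. The sketch below re-derives them from the five per-topic values; it assumes those values are in units of 10,000 per second (which is what the step from 26.12 to 261,200 implies), and the class name is hypothetical.

```java
import java.math.BigDecimal;

// Sketch: re-derive the summary figures from the five per-topic results.
class ThroughputSummary {
    // Per-topic throughput as reported, assumed to be in units of 10,000 per second.
    static final String[] PER_TOPIC = {"5.04", "4.02", "5.76", "6.31", "4.99"};

    // Sum of the per-topic values, using BigDecimal to keep the decimals exact.
    static BigDecimal total() {
        BigDecimal sum = BigDecimal.ZERO;
        for (String value : PER_TOPIC) {
            sum = sum.add(new BigDecimal(value));
        }
        return sum;
    }

    // Total converted to a plain per-second count.
    static long perSecond() {
        return total().multiply(BigDecimal.valueOf(10_000L)).longValueExact();
    }

    // Per-second count extended over a full day (86,400 seconds).
    static long perDay() {
        return perSecond() * 86_400L;
    }

    public static void main(String[] args) {
        System.out.println("Total (x10,000/s): " + total());
        System.out.println("Per second: " + perSecond());
        System.out.println("Per day: " + perDay());
    }
}
```

The daily figure comes out at 22,567,680,000, which matches the "nearly 22.6 billion" quoted above.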

Summary:

Several factors affect Storm throughput: spout parallelism, the number of workers (which is tied to slots), and the number of Kafka partitions.

In fact, spout parallelism and the number of Kafka partitions are linked.

Note that increasing the number of workers can increase throughput, but the number of workers is tied to the number of machines in the cluster and is therefore limited.

So you need to test to find a value you consider reasonable: the more workers one job is given, the fewer workers are left for other jobs, and the fewer jobs the cluster can run. The best value is simply the one that meets the business requirement.

See the attachment for the detailed test results.
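
The trade-off between workers per job and the number of jobs the cluster can host is simple to quantify. The sketch below uses the 144 slots of this cluster and the worker counts from the test scenarios; the class and method names are hypothetical, and it assumes every job asks for the same number of workers.

```java
// Sketch: how many equally sized jobs fit into a fixed slot budget.
class WorkerBudget {
    // Number of jobs that fit when each one claims workersPerJob slots.
    static int jobsThatFit(int totalSlots, int workersPerJob) {
        return totalSlots / workersPerJob;
    }

    // Slots left unused after placing that many jobs.
    static int slotsLeft(int totalSlots, int workersPerJob) {
        return totalSlots % workersPerJob;
    }

    public static void main(String[] args) {
        int totalSlots = 144; // 6 supervisors * 24 slots, as computed earlier in the article
        for (int workers : new int[] {10, 20}) {
            System.out.println(workers + " workers per job: "
                    + jobsThatFit(totalSlots, workers) + " jobs, "
                    + slotsLeft(totalSlots, workers) + " slots left");
        }
    }
}
```

Doubling the workers per job from 10 to 20 halves the number of jobs that fit, from 14 to 7, which is the limit the summary warns about.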

When reprinting, please cite the source address:

http://blog.csdn.net/weijonathan/article/details/38536671

http://www.51studyit.com/html/notes/20140813/1054.html
