Storm: Real-life Experience

Source: Internet
Author: User
1. Use component parallelism instead of thread pools
Storm itself is a distributed, multi-threaded framework: we can set the parallelism of each spout and bolt, and the rebalance command lets us adjust parallelism dynamically to spread load across multiple workers.
If you start a thread pool inside a component to run computation-intensive work such as JSON parsing, some components may consume far more resources than others, leaving resource usage uneven across workers; the effect is most visible when component parallelism is low. For example, if a bolt has a parallelism of 1 but starts a thread pool internally, the worker process hosting that bolt can exhaust the machine's resources and disturb other topologies running on the same machine. For computation-intensive work, set the component's parallelism to a suitably large value instead, so the computation is distributed across multiple nodes.
Besides avoiding thread pools inside components, you can use cgroups to cap each worker's resource usage and prevent one topology's components from exhausting an entire machine.
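The effect of parallelism on worker load can be sketched with some simple arithmetic. This is an illustrative model, not Storm API; the CPU figures and round-robin task assignment are assumptions.

```python
# Illustrative sketch (not Storm API): raising a bolt's parallelism spreads
# a fixed CPU load across workers, whereas one task with an internal thread
# pool concentrates the whole load on a single worker.

def load_per_worker(total_cpu_seconds, parallelism, num_workers):
    """Assume tasks are assigned round-robin across workers and the
    load splits evenly across the bolt's tasks."""
    per_task = total_cpu_seconds / parallelism
    loads = [0.0] * num_workers
    for task in range(parallelism):
        loads[task % num_workers] += per_task
    return loads

# One task doing all the JSON parsing: even with an internal thread pool,
# the whole load lands on whichever worker hosts that task.
print(load_per_worker(100.0, parallelism=1, num_workers=4))  # [100.0, 0.0, 0.0, 0.0]

# Parallelism 4 spreads the same load evenly:
print(load_per_worker(100.0, parallelism=4, num_workers=4))  # [25.0, 25.0, 25.0, 25.0]
```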

2. Do not use DRPC to batch-process big data
DRPC provides an interface between applications and a Storm topology: other applications call it directly, Storm processes the data with its usual concurrency, and the result is returned to the calling client. This usually works fine for small amounts of data; with large batches, two problems become obvious:
(1) The topology may not return the computed result before the request times out.
(2) Batch processing can push the cluster load up temporarily and drop it back afterwards, giving poor load balance.
Batch processing of big data was never Storm's design goal: in the trade-off between timeliness and batching, Storm leans toward the former. To process large volumes of data, consider a micro-batch framework such as Spark Streaming.
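The timeout constraint can be made concrete with a back-of-the-envelope bound. The per-item cost below is hypothetical; the 600 s figure is Storm's usual default for drpc.request.timeout.secs, so check your own configuration.

```python
# Illustrative only: if a DRPC request must finish within the request
# timeout, the feasible batch size is bounded by the per-item cost.
import math

def max_batch_size(timeout_secs, secs_per_item):
    """Largest batch that can finish inside the DRPC request timeout."""
    return math.floor(timeout_secs / secs_per_item)

# With a 600 s request timeout and a hypothetical 0.5 s per item,
# batches much beyond ~1200 items risk timing out before returning:
print(max_batch_size(600, 0.5))  # 1200
```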
3. Do not perform time-consuming operations in the spout
The spout's nextTuple method emits the data stream; when acking is enabled, the ack and fail methods are triggered as well.
It is important to understand that a Storm spout is single-threaded (JStorm splits the spout across 3 threads, running nextTuple, fail, and ack respectively). If nextTuple is very time-consuming, then when a message succeeds and the acker sends an ack back, the spout may not consume it in time; the ack message is discarded after the timeout, and the spout wrongly treats the message as failed, which is a logic error. Conversely, if the fail or ack method is time-consuming, it reduces the rate at which the spout can emit data, lowering the topology's throughput.
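The single-threaded failure mode can be simulated with a toy event loop. This is an illustrative sketch, not Storm internals; the costs, arrival times, and timeout are all hypothetical.

```python
# Illustrative single-threaded loop (not Storm internals): when nextTuple()
# is slow, pending acks wait in the queue and can exceed the ack timeout.

def count_timed_out_acks(next_tuple_cost, ack_arrival_times, ack_timeout, num_iterations):
    """One thread alternates nextTuple() and draining acks; an ack that
    waits longer than ack_timeout before being handled counts as timed out."""
    clock = 0.0
    timed_out = 0
    pending = sorted(ack_arrival_times)
    for _ in range(num_iterations):
        clock += next_tuple_cost          # spout busy emitting
        while pending and pending[0] <= clock:
            arrived = pending.pop(0)
            if clock - arrived > ack_timeout:
                timed_out += 1            # handled too late: treated as failed
    return timed_out

# Acks arriving once per second, with a 5 s ack timeout:
acks = [float(t) for t in range(1, 11)]
print(count_timed_out_acks(0.1, acks, 5.0, 200))   # fast nextTuple: 0
print(count_timed_out_acks(10.0, acks, 5.0, 2))    # slow nextTuple: 4
```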
4. Watch for data skew with fieldsGrouping
fieldsGrouping partitions the stream by one or more fields: different target tasks receive different field values, and tuples with the same value always go to the same task.
Suppose a bolt uses fieldsGrouping on the user ID. If some users have far more data than others, the downstream bolt tasks receive unbalanced loads, and overall performance is constrained by the few tasks handling the high-volume users. Adding more fields to the grouping, or changing the grouping strategy, can balance the data.
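The skew, and the effect of adding a grouping field, can be shown with a small routing simulation. This is a sketch with a deterministic stand-in hash, not Storm's actual partitioner; the stream and user names are hypothetical.

```python
# Illustrative sketch of fieldsGrouping skew (not Storm's hashing): tuples
# are routed by hash(field) % num_tasks, so one hot key overloads a single
# task. Grouping on an extra field spreads the hot key's tuples out.
import zlib
from collections import Counter

def bucket(key, num_tasks):
    # Deterministic stand-in for Storm's field hashing.
    return zlib.crc32(repr(key).encode()) % num_tasks

def route(keys, num_tasks):
    counts = Counter()
    for key in keys:
        counts[bucket(key, num_tasks)] += 1
    return counts

# Hypothetical skewed stream: user "u1" produces 90 of 100 tuples.
stream = [("u1", i) for i in range(90)] + [("u%d" % i, 0) for i in range(2, 12)]

by_user = route([user for user, _ in stream], num_tasks=4)
by_user_and_msg = route(stream, num_tasks=4)  # group on (user, message id)

print(max(by_user.values()))          # >= 90: one task holds all "u1" tuples
print(max(by_user_and_msg.values()))  # far lower: load is spread out
```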
5. Prefer localOrShuffleGrouping
localOrShuffleGrouping means that if one or more tasks of the target bolt live in the same worker process as the task producing the data, the tuple is delivered by in-process, inter-thread communication directly to a target task inside the current worker; otherwise it behaves the same as shuffleGrouping.
localOrShuffleGrouping transfers data faster than shuffleGrouping because in-worker transfers pass only through the Disruptor queue, with no network or serialization overhead. So when the processing itself is not complex and network plus serialization overhead dominates, prefer localOrShuffleGrouping over shuffleGrouping.
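The routing preference can be sketched in a few lines. This is an illustrative model of the behavior described above, not Storm's implementation; the worker assignment is hypothetical.

```python
# Illustrative sketch (not Storm's implementation) of localOrShuffleGrouping:
# if any target task lives in the sender's worker process, pick among those;
# otherwise fall back to shuffling across all target tasks.
import random

def local_or_shuffle(sender_worker, target_tasks, task_to_worker, rng=random):
    local = [t for t in target_tasks if task_to_worker[t] == sender_worker]
    pool = local if local else target_tasks   # prefer in-process targets
    return rng.choice(pool)

# Hypothetical assignment: tasks 1,2 on worker "A", tasks 3,4 on worker "B".
assignment = {1: "A", 2: "A", 3: "B", 4: "B"}
print(local_or_shuffle("A", [1, 2, 3, 4], assignment))  # always 1 or 2
print(local_or_shuffle("C", [1, 2, 3, 4], assignment))  # no local task: any of 1-4
```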
6. Set a reasonable maxSpoutPending value
With acking enabled, the spout uses a RotatingMap to hold messages it has emitted but not yet received ack results for. The RotatingMap's capacity is bounded at p * num_tasks, where p is the topology.max.spout.pending value, i.e. maxSpoutPending (it can also be set via the setMaxSpoutPending method on the declarer returned by TopologyBuilder.setSpout), and num_tasks is the number of spout tasks. Leaving maxSpoutPending unset, or setting it too large, can consume enough memory to cause an overflow; setting it too small throttles the spout's emit rate.
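The bound above is simple arithmetic, worth making explicit. The per-tuple memory figure below is a hypothetical illustration, not a measured value.

```python
# Illustrative arithmetic: the RotatingMap holds at most p * num_tasks
# in-flight tuples, where p is topology.max.spout.pending.
# The bytes-per-tuple figure is hypothetical.

def max_pending_tuples(max_spout_pending, num_spout_tasks):
    return max_spout_pending * num_spout_tasks

def pending_memory_mb(max_spout_pending, num_spout_tasks, bytes_per_tuple):
    return max_pending_tuples(max_spout_pending, num_spout_tasks) * bytes_per_tuple / 1e6

# p = 10000, 4 spout tasks, ~1 KB retained per pending tuple:
print(max_pending_tuples(10000, 4))       # 40000
print(pending_memory_mb(10000, 4, 1000))  # 40.0
```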
7. Set a reasonable number of workers
More workers does not automatically mean better performance. Consider a curve of worker count versus throughput (from the JStorm documentation: https://github.com/alibaba/jstorm/tree/master/docs, "0.9.4.1 jstorm performance test.docx").
As can be seen from the figure, throughput peaks at 12 workers, where overall performance is best. One reason is that each additional worker process turns what was in-memory communication between threads into inter-process network communication, which also requires serialization and deserialization, reducing throughput.
The other reason is that each additional worker process brings extra threads (Netty send and receive threads, heartbeat threads, the SystemBolt thread, and other system components), which adds considerable CPU load: sys (kernel) CPU consumption rises, and with total CPU capacity fixed, the business threads run less efficiently.
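The first effect can be quantified with a simple model: under a uniform shuffle with tasks spread evenly across workers, roughly 1/W of tuples stay inside the sender's process. This is an idealized assumption, not a measurement.

```python
# Illustrative model: with tasks spread evenly across W workers and a
# uniform shuffle, about 1/W of tuples stay in-process; the rest cross
# the network and pay serialization overhead.

def local_traffic_fraction(num_workers):
    return 1.0 / num_workers

for w in (1, 2, 4, 12, 24):
    print(w, local_traffic_fraction(w))
# With 1 worker all traffic is in-process; at 24 workers ~96% of it
# crosses the network, which is why more workers can hurt throughput.
```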

8. Balance throughput and timeliness
Storm uses Netty for data transfer by default. The following parameters can be tuned for transfer performance:
(1) storm.messaging.netty.server_worker_threads: the number of threads that receive messages;
(2) storm.messaging.netty.client_worker_threads: the number of threads that send messages;
(3) netty.transfer.batch.size: the size of the data the Netty client sends to the Netty server in each batch. If a tuple message to be sent is larger than netty.transfer.batch.size, it is sliced into pieces of that size and sent in multiple transfers.
(4) storm.messaging.netty.buffer_size: the size of the message buffer a task sends in bulk after serializing tuples.
(5) storm.messaging.netty.flush.check.interval.ms: how frequently the Netty client checks for data that can be sent when a TaskMessage is pending.
Lowering storm.messaging.netty.flush.check.interval.ms improves timeliness. Raising netty.transfer.batch.size and storm.messaging.netty.buffer_size increases the throughput the network can sustain and improves its payload efficiency (fewer TCP packets, each carrying more effective data), usually at some cost to timeliness. Strike a balance between throughput and timeliness according to your own workload.
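The slicing behavior described in item (3) reduces to simple ceiling division. The message and batch sizes below are hypothetical; the slicing model is a simplification of what the text describes.

```python
# Illustrative only: how a large serialized tuple is sliced by
# netty.transfer.batch.size before sending.
import math

def num_slices(message_bytes, transfer_batch_size):
    """Number of transfers needed for one serialized tuple message."""
    return math.ceil(message_bytes / transfer_batch_size)

# A hypothetical 1 MB serialized tuple with the batch size at 262144 bytes
# is split into 4 slices; a larger batch size means fewer, bigger sends:
print(num_slices(1_000_000, 262_144))   # 4
print(num_slices(1_000_000, 1_048_576)) # 1
```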
Excerpt from: http://www.xiapistudio.com/taste-page
