Summary of Spout usage instructions in Storm

Source: Internet
Author: User

In Storm, a Spout reads data from a source and emits it into the computing topology. While debugging a topology recently, the system's QPS turned out to be low and the processing speed could not meet requirements. Troubleshooting showed that the cause was threads waiting on synchronization because the Spout was used improperly. Here are a few points that need special attention when writing Spout code:

1. The most common pattern is to use a thread-safe queue, such as a BlockingQueue. One or more worker threads read data from the data source (such as message-oriented middleware or a database) and put it into the queue; the spout's main thread reads from the queue.

2. Do not enable the ack mechanism if you do not care whether data is lost (for example, in a typical statistical analysis scenario).

3. The nextTuple and ack methods of a Spout are executed in the same thread. (This may not have been seen as a bottleneck at first, and a single thread keeps the implementation simple; JStorm has probably changed this to multiple threads.) Therefore, you must not block the current thread in nextTuple or ack, because doing so directly reduces the spout's processing speed.

4. When nextTuple emits data, it must not block the current thread (see the previous point). For example, when retrieving data from the queue, use poll rather than take. Prefer the poll variant without a fixed timeout parameter, so that the call returns immediately if the queue is empty. If several items are ready to be sent, a single call to nextTuple can iterate over and emit all of them.

5. Since Storm 0.8.1, if a call to nextTuple emits no tuple, the spout sleeps for 1 ms by default. This policy is configurable, so you can tune it for your own scenario to make reasonable use of CPU resources.
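
The queue pattern from points 1, 3 and 4 can be sketched without any Storm dependency. The class below is only an illustration under stated assumptions: the class name, the enqueue and emitted helpers, the queue capacity and the MAX_BATCH limit are all hypothetical, and the emitted list merely stands in for the collector's emit call in a real spout. It shows a reader thread filling a BlockingQueue while a nextTuple-style method drains it with the non-blocking poll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Storm-free sketch of a queue-backed spout; every name here is hypothetical. */
final class QueueBackedSpoutSketch {
    // Thread-safe buffer between the reader thread(s) and the spout thread.
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(1000);
    // Stands in for the collector's emit call; touched only by the spout thread.
    private final List<String> emitted = new ArrayList<>();
    // Assumed cap so that one call cannot monopolize the spout thread.
    private static final int MAX_BATCH = 100;

    /** Called from a reader thread; offer() returns false when the queue is full. */
    boolean enqueue(String message) {
        return queue.offer(message);
    }

    /** Mirrors nextTuple(): drains what is ready and never waits on the queue. */
    int nextTuple() {
        int count = 0;
        String message;
        // poll() without a timeout returns null at once when the queue is empty.
        while (count < MAX_BATCH && (message = queue.poll()) != null) {
            emitted.add(message);
            count++;
        }
        return count;
    }

    List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) throws InterruptedException {
        QueueBackedSpoutSketch spout = new QueueBackedSpoutSketch();
        // The reader thread plays the role of the data-source consumer.
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 5; i++) {
                spout.enqueue("msg-" + i);
            }
        });
        reader.start();
        reader.join();
        System.out.println(spout.nextTuple()); // prints 5
        System.out.println(spout.nextTuple()); // prints 0 (queue is empty, no blocking)
    }
}
```

In this sketch the reader uses offer, which drops a message when the queue is full. A reader thread could instead call put and block, since blocking there does not stall the spout thread; which is preferable depends on whether losing data is acceptable.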
