Notes on Using Spouts in Storm
In Storm, a Spout reads data from a data source and emits it into the processing topology. A recent debugging session turned up a topology whose QPS was low and whose processing speed could not meet requirements. Troubleshooting showed that improper use of the Spout was causing threads to wait on each other for synchronization. Here are a few points that deserve special attention when writing Spout code:
1. The most common pattern is to use a thread-safe queue, such as BlockingQueue. One or more background threads read data from the data source (for example, message-oriented middleware or a database) and put it into the queue; the spout's main thread takes data from the queue and emits it.
2. Do not enable the ack mechanism if you do not care whether data is lost (statistical analysis is a typical example).
3. A Spout's nextTuple and ack methods are executed in the same thread. (This was probably not expected to be a bottleneck at first, and a single thread keeps the implementation simple; JStorm has probably changed this to multiple threads.) Therefore, you must not block the current thread in nextTuple or ack. Doing so directly reduces the spout's processing speed.
4. When nextTuple emits data, it must not block the current thread (see the previous point). For example, when retrieving data from the queue, use poll instead of take, and try not to pass poll a timeout argument that makes it block for a fixed time. With no argument, poll returns immediately if the queue is empty. If several items are waiting to be sent, iterate over them and emit them all in a single nextTuple call.
5. Since Storm 0.8.1, if a call to nextTuple emits no tuple, the spout sleeps for 1 ms by default. This policy is configurable, so you can tune it for your own scenario to make reasonable use of CPU resources.
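
Points 1 and 4 can be sketched roughly as follows. This is a minimal illustration written against plain JDK types rather than the Storm API, so that it runs on its own. The class name QueueBackedSpoutSketch, the emitter callback, and the maxBatch cap are all assumptions made for the example; in a real topology the same logic would sit inside a spout class, and the emitter would wrap the spout's output collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/**
 * Hypothetical sketch of a queue-backed spout. It does not depend on Storm:
 * `emitter` stands in for whatever actually emits the tuple.
 */
class QueueBackedSpoutSketch {
    private final BlockingQueue<String> queue;
    private final Consumer<String> emitter;
    private final int maxBatch; // assumption: cap per call so one call cannot run unbounded

    QueueBackedSpoutSketch(int capacity, int maxBatch, Consumer<String> emitter) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.maxBatch = maxBatch;
        this.emitter = emitter;
    }

    /** Called from reader threads. offer() returns false instead of blocking when the queue is full. */
    boolean enqueue(String message) {
        return queue.offer(message);
    }

    /**
     * Stand-in for nextTuple(). Uses the no-argument poll(), which returns null
     * immediately when the queue is empty, so the calling thread never blocks.
     * Returns the number of messages emitted in this call.
     */
    int nextTuple() {
        int emitted = 0;
        String message;
        while (emitted < maxBatch && (message = queue.poll()) != null) {
            emitter.accept(message);
            emitted++;
        }
        return emitted;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> out = new ArrayList<>();
        QueueBackedSpoutSketch spout = new QueueBackedSpoutSketch(1000, 100, out::add);

        // A background thread plays the role of the data-source reader.
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 5; i++) {
                spout.enqueue("msg-" + i);
            }
        });
        reader.start();
        reader.join();

        System.out.println("emitted " + spout.nextTuple() + ": " + out);
        System.out.println("emitted " + spout.nextTuple() + " on an empty queue");
    }
}
```

The per-call cap is a design choice of this sketch, not something the points above require. Without it, a fast producer could keep a single nextTuple call busy for a long time, and since ack runs on the same thread (point 3), acks would be delayed as well.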