How to reasonably estimate the thread pool size?
Although this problem seems small, it is not so easy to answer (if you have a better method, you are welcome to share it). Let's start with a naïve estimation method: assume the system's required TPS (transactions per second, or tasks per second) is at least 20, assume each transaction is handled by a single thread, and further assume that the average time for a thread to process one transaction is 4s. The problem then becomes:
How do we size the thread pool so that 20 transactions can be processed within 1s?
The calculation is simple: each thread has a processing capacity of 0.25 TPS, so to reach 20 TPS we obviously need 20 / 0.25 = 80 threads.
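This arithmetic can be expressed as a small helper (a sketch; the class and method names are my own, not from the original article):

```java
public class NaiveEstimate {
    /**
     * Required threads = target TPS / per-thread TPS,
     * where per-thread TPS = 1 / average task time in seconds.
     */
    static int requiredThreads(int targetTps, double taskTimeSeconds) {
        double perThreadTps = 1.0 / taskTimeSeconds; // e.g. 1 / 4s = 0.25 TPS per thread
        return (int) Math.ceil(targetTps / perThreadTps);
    }

    public static void main(String[] args) {
        // 20 TPS target, 4s per transaction -> 20 / 0.25 = 80 threads
        System.out.println(requiredThreads(20, 4.0));
    }
}
```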
Obviously this estimate is naïve, because it ignores the number of CPUs. A typical server has 16 or 32 CPU cores; with 80 threads there would be far too much unnecessary thread context-switching overhead.
A second method is a simple rule of thumb (N is the total number of CPU cores):
If it is a CPU-intensive application, set the thread pool size to N + 1
If it is an IO-intensive application, set the thread pool size to 2N + 1
If only one application is deployed on the server and it uses only one thread pool, this estimate may be reasonable; verify it with your own tests.
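The rule of thumb above is easy to sketch in code (the class and method names are my own; the core count comes from the standard `Runtime.getRuntime().availableProcessors()`):

```java
public class RuleOfThumb {
    // CPU-intensive: N + 1 threads
    static int cpuIntensive(int cpus) {
        return cpus + 1;
    }

    // IO-intensive: 2N + 1 threads
    static int ioIntensive(int cpus) {
        return 2 * cpus + 1;
    }

    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU-intensive pool size: " + cpuIntensive(n));
        System.out.println("IO-intensive pool size:  " + ioIntensive(n));
    }
}
```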
Next, another estimation formula can be found in the document Server Performance IO Optimization:
1. Optimal number of threads = ((thread wait time + thread CPU time) / thread CPU time) * number of CPUs
For example, if the average CPU time per thread is 0.5s, the thread wait time (non-CPU time, such as IO) is 1.5s, and there are 8 CPU cores, the formula estimates ((0.5 + 1.5) / 0.5) * 8 = 32 threads. The formula can be further rewritten as:
2. Optimal number of threads = (ratio of thread wait time to thread CPU time + 1) * number of CPUs
A conclusion can be drawn: the higher the proportion of thread wait time, the more threads are needed; the higher the proportion of thread CPU time, the fewer threads are needed.
The previous estimation method is consistent with this conclusion.
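Formula 1 above translates directly into code (a sketch; the names are my own):

```java
public class OptimalThreads {
    /**
     * Formula 1: ((wait time + CPU time) / CPU time) * number of CPUs.
     * Times can be in any unit as long as both use the same one.
     */
    static int optimalThreads(double waitTime, double cpuTime, int numCpus) {
        return (int) Math.round(((waitTime + cpuTime) / cpuTime) * numCpus);
    }

    public static void main(String[] args) {
        // The example from the text: 0.5s CPU time, 1.5s wait time, 8 cores
        System.out.println(optimalThreads(1.5, 0.5, 8)); // ((0.5 + 1.5) / 0.5) * 8 = 32
    }
}
```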
The fastest component of a system is the CPU, so the CPU sets the upper limit on system throughput: increasing CPU processing power raises that limit. However, by the shortest-plank (bucket) effect, real system throughput cannot be calculated from the CPU alone. To increase system throughput, start with the system's short planks (such as network latency and IO):
Maximize the parallelism of short-plank operations, e.g. multi-threaded download
Improve the short plank's capability, e.g. use NIO instead of blocking IO
The first approach relates to Amdahl's law, which gives the formula for the speedup of a serial system after parallelization:
1. Speedup = time before optimization / time after optimization
The greater the speedup, the better the effect of parallelizing the system. Amdahl's law also relates the speedup to the serialization ratio F (the proportion of code executed serially) and the number of CPUs N:
Speedup <= 1 / (F + (1 - F) / N)
When N is large enough, the smaller the serialization ratio F, the greater the speedup.
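The bound is easy to compute and illustrates the point numerically (a sketch; the names are my own):

```java
public class Amdahl {
    /** Upper bound on speedup from Amdahl's law: 1 / (F + (1 - F) / N). */
    static double speedupBound(double f, int n) {
        return 1.0 / (f + (1.0 - f) / n);
    }

    public static void main(String[] args) {
        // With F = 0.5, even many CPUs cannot push the speedup past 1 / F = 2
        System.out.println(speedupBound(0.5, 2));    // 1.33...
        System.out.println(speedupBound(0.5, 1000)); // approaches 2
        // With F = 0 (fully parallel), the bound is simply N
        System.out.println(speedupBound(0.0, 8));    // 8.0
    }
}
```

Note how the serial fraction F dominates: with F = 0.5 the speedup can never exceed 2, no matter how many CPUs are added.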
Writing this, a question suddenly occurred to me.
Is it more efficient to use a thread pool than to use a single thread?
The answer is no. Redis, for example, is single-threaded yet very efficient: basic operations can reach the level of 100,000 per second. From the threading perspective, part of the reason is that a single thread avoids the overhead of thread context switching.
Of course, the more essential reason Redis is fast is that it operates almost entirely in memory, and in that scenario a single thread can use the CPU very efficiently. Multi-threading, by contrast, generally suits scenarios with a considerable proportion of IO and network operations.
So even the simple estimates above, however reasonable they seem, may not fit your case. You need to combine the real characteristics of the system (IO-intensive, CPU-intensive, or purely in-memory) with the hardware environment (CPU, memory, disk read/write speed, network conditions, etc.) and keep testing and tuning to arrive at a reasonable value.