In the previous article, we briefly discussed the role of the thread pool and some features of the CLR thread pool. But that did not cover everything about the basic concept of the thread pool; this time we will add some information that is necessary for choosing the right approach in our programs.
Independent Thread Pools
As we discussed last time, every .NET application has a CLR thread pool, which we use through the static methods of the ThreadPool class. As long as we submit tasks with the QueueUserWorkItem method, the thread pool is responsible for executing them at the appropriate time. We also covered some advanced behaviors of the CLR thread pool, such as the maximum and minimum thread counts, and how it throttles thread creation so that a sudden burst of heavy tasks does not consume too many resources.
So what are the drawbacks of the thread pool provided by .NET? Some friends say an important drawback is that it is too simple: there is only one queue, it cannot poll multiple queues, cannot cancel tasks, cannot set task priorities, cannot limit the rate of task execution, and so on. In fact, these missing features can all be achieved by adding a layer on top of the CLR thread pool (in other words, by encapsulating it). For example, each work item placed in the CLR thread pool can, at execution time, pick a task from one of several custom task queues, achieving the effect of polling multiple queues. So, in my opinion, the main disadvantage of the CLR thread pool is not here.
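The "layer on top" idea can be sketched in a few lines. The class and member names below are mine, not a standard API; this is a deliberately minimal illustration, not a production dispatcher. Each task enqueued to a custom queue also queues one work item to the CLR pool, and that work item decides at execution time which custom queue to serve, round-robin:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// A minimal sketch: several custom queues layered on the single CLR thread
// pool. One CLR pool work item is queued per task; which task it actually
// runs is decided only when it executes, by polling the queues round-robin.
public class MultiQueueDispatcher
{
    private readonly Queue<Action>[] _queues;
    private int _next; // round-robin cursor
    private readonly object _sync = new object();

    public MultiQueueDispatcher(int queueCount)
    {
        _queues = new Queue<Action>[queueCount];
        for (int i = 0; i < queueCount; i++) _queues[i] = new Queue<Action>();
    }

    public void Enqueue(int queueIndex, Action task)
    {
        lock (_sync) _queues[queueIndex].Enqueue(task);
        ThreadPool.QueueUserWorkItem(_ => DequeueAndRun());
    }

    private void DequeueAndRun()
    {
        Action task = null;
        lock (_sync)
        {
            // Poll the queues round-robin until a non-empty one is found.
            for (int i = 0; i < _queues.Length; i++)
            {
                var q = _queues[(_next + i) % _queues.Length];
                if (q.Count > 0)
                {
                    task = q.Dequeue();
                    _next = (_next + i + 1) % _queues.Length;
                    break;
                }
            }
        }
        if (task != null) task();
    }
}
```

Cancellation, priorities, and rate limiting could be added the same way: the CLR pool only supplies the threads, while all scheduling policy lives in the layer above.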
I think the main problem with the CLR thread pool is its "unity": almost all tasks within the process depend on this one thread pool. As mentioned in the previous article, features such as Timer and WaitForSingleObject, as well as asynchronous invocation of delegates, all rely on it; the .NET Framework uses this pool for many of its own functions. In most cases this is appropriate, but because developers cannot control this unified thread pool precisely, some special needs cannot be satisfied. The most common one: controlling the computing power given to a set of tasks. What is computing power? Let's start with threads (note 1).
When we create a thread in a program and assign it a task, we hand it to the operating system to schedule. The operating system manages all the threads in the system and schedules them in a certain way. What is "scheduling"? Scheduling means controlling the state of threads: running, waiting, and so on. We all know that a machine has a limited number of processing units (for example, a machine with two dual-core CPUs has 4 of them), which means the operating system can only truly do that many things at once. However, the number of threads is usually far greater than the number of processing units, so to ensure every thread gets executed, the operating system must at certain moments take a running thread off a processor and "swap in" another thread to run; this is called a "context switch". A context switch can happen for several reasons. It may be the thread's own logic, such as waiting on a lock or voluntarily sleeping (calling Thread.Sleep), but more often it is the operating system deciding that the thread has run long enough. The operating system defines a "time slice" (note 2), and when a thread has been running longer than its time slice, it is taken off the processor and replaced with another. This makes it look as though multiple threads, that is, multiple tasks, are running at the same time.
It is worth mentioning that for the Windows operating system, the unit of scheduling is the thread, regardless of which process the thread belongs to. For example, suppose there are only two processes in the system: process A has 5 threads and process B has 10 threads. Other factors being equal, process B will occupy the processing units for twice as long as process A. Of course, reality is not that simple. For example, different processes have different priorities, a thread's priority is relative to the process it belongs to, and if a thread has not been executed for a long time, or has just come out of a "lock" wait, the operating system will temporarily raise its priority. All of this affects how a program runs and performs; we may have a chance to expand on it another time.
Now do you see what the number of threads implies? Yes, it is the "computing power" we just mentioned. In many cases we can simply assume that, in the same environment, the more threads a set of tasks uses relative to other tasks, the more computing power it gets. Computing power naturally affects how fast tasks execute. Imagine a production task and a consumption task connected by a queue for temporary storage. Ideally, production and consumption proceed at the same speed, giving the best throughput. If production runs faster, the queue keeps accumulating; if consumption runs faster, the consumer ends up waiting; either way, throughput decreases. Therefore, in an implementation we tend to assign separate thread pools to the production tasks and the consumption tasks, and balance the pace of production and consumption by increasing or decreasing the number of threads in each pool.
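As a small concrete sketch of this idea (the class name and thread counts are illustrative, and I use plain dedicated threads to stand in for two independent pools): producers and consumers share one bounded queue, and the relative computing power of each side is simply the number of threads we give it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// A minimal producer/consumer sketch. Each side gets its own dedicated
// threads, so tuning producerThreads vs. consumerThreads is exactly the
// "balance the pace by adjusting thread counts" idea from the text.
public static class ProducerConsumerDemo
{
    public static int Run(int producerThreads, int consumerThreads, int itemsPerProducer)
    {
        var queue = new BlockingCollection<int>(100); // bounded buffer
        int consumed = 0;

        var consumers = new Thread[consumerThreads];
        for (int i = 0; i < consumerThreads; i++)
        {
            consumers[i] = new Thread(() =>
            {
                foreach (var item in queue.GetConsumingEnumerable())
                    Interlocked.Increment(ref consumed); // stand-in for real work
            });
            consumers[i].Start();
        }

        var producers = new Thread[producerThreads];
        for (int i = 0; i < producerThreads; i++)
        {
            producers[i] = new Thread(() =>
            {
                for (int n = 0; n < itemsPerProducer; n++) queue.Add(n);
            });
            producers[i].Start();
        }

        foreach (var p in producers) p.Join();
        queue.CompleteAdding();             // let consumers drain and exit
        foreach (var c in consumers) c.Join();
        return consumed;
    }
}
```

The bounded capacity of the queue also provides back-pressure: when consumers fall behind, producers block on Add instead of letting the backlog grow without limit.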
Using separate thread pools to control computing power is common. A typical case is the SEDA architecture: the whole system is connected as a series of stages, each stage consisting of a queue and an independent thread pool, with a controller adjusting the number of threads in the pool according to the number of tasks in the queue. The result is excellent concurrency capability for the application.
In the Windows operating system, the APIs of Server 2003 and earlier versions also provide only a single thread pool per process, but the Vista and Server 2008 APIs, in addition to improving thread pool performance, provide interfaces for creating multiple thread pools within the same process. It is a pity that .NET, up to today's version 4.0, still offers no way to build an independent thread pool. Building an excellent thread pool is very difficult. Fortunately, if we need this capability we can use the well-known SmartThreadPool; after so many years of use and testing, I believe it is mature enough. If necessary, we can also modify it, since different situations place different requirements on a thread pool.
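SmartThreadPool is a full-featured implementation; to show what "an independent thread pool" means at its core, here is a deliberately minimal sketch of my own (no dynamic sizing, no cancellation, names are illustrative). The point is that each instance owns its queue and its threads, so its computing power is isolated from the CLR thread pool and from other instances:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// A bare-bones independent thread pool: a private queue plus a fixed set of
// worker threads. Nowhere near production quality, but each instance's
// computing power is exactly its own thread count.
public class SimpleThreadPool : IDisposable
{
    private readonly Queue<Action> _queue = new Queue<Action>();
    private readonly Thread[] _threads;
    private bool _disposed;

    public SimpleThreadPool(int threadCount)
    {
        _threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            _threads[i] = new Thread(Worker) { IsBackground = true };
            _threads[i].Start();
        }
    }

    public void QueueTask(Action task)
    {
        lock (_queue)
        {
            _queue.Enqueue(task);
            Monitor.Pulse(_queue); // wake one sleeping worker
        }
    }

    private void Worker()
    {
        while (true)
        {
            Action task;
            lock (_queue)
            {
                while (_queue.Count == 0 && !_disposed) Monitor.Wait(_queue);
                if (_disposed && _queue.Count == 0) return; // drain, then exit
                task = _queue.Dequeue();
            }
            task();
        }
    }

    public void Dispose()
    {
        lock (_queue) { _disposed = true; Monitor.PulseAll(_queue); }
        foreach (var t in _threads) t.Join();
    }
}
```

Everything a real pool adds on top of this skeleton (dynamic thread counts, idle timeouts, work-item results, priorities) is exactly what makes "building an excellent thread pool" so difficult.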
IO Thread Pool
The IO thread pool is a thread pool that serves asynchronous IO.
The simplest way to access IO, such as reading a file, is to block: the code waits until the IO operation succeeds (or fails) before proceeding, and everything happens in order. However, blocking IO has many drawbacks. It can make the UI stop responding, and it causes context switches, during which the thread's data in the CPU cache may be evicted or its memory even swapped out to disk, both of which hurt performance significantly. In addition, each blocking IO operation occupies one thread, which easily leads to a large number of threads in the system and ultimately limits the scalability of the application. For these reasons we use "asynchronous IO".
With asynchronous IO, the thread that initiates the IO is not blocked and can continue with other logic. The operating system is responsible for notifying us of the result, generally via a "callback function". Because asynchronous IO does not occupy application threads while in flight, we can launch a large number of IO operations with only a small number of threads, improving the responsiveness of the application. Launching many IO operations at once can also bring extra performance benefits: for example, the disk and the network can work simultaneously without conflict, and the disk can read the data closest to the current head position rather than reading strictly in the order the requests arrived, which effectively reduces head movement.
There are many asynchronous IO mechanisms in the Windows operating system, but the one with the best performance and scalability is the legendary "I/O completion port" (IOCP), which is also the only asynchronous IO mechanism that .NET encapsulates. About a year and a half ago, Lao Zhao wrote an article, "Correct use of asynchronous operations", which, in addition to describing the differences and effects of compute-intensive versus IO-intensive operations, briefly describes how IOCP interacts with the CLR. An excerpt:
When we initiate an asynchronous IO-bound operation, the CLR emits an IRP (I/O Request Packet) via the Windows API. When the device is ready, it picks the IRP it "most wants to process" (for example, a request to read the data closest to the current head position) and processes it; when the work is done, the device returns (via Windows) an IRP indicating completion. The CLR creates an IOCP (I/O Completion Port) for each process and maintains it together with the Windows operating system. Once a completed IRP is placed in the IOCP (through the internal ThreadPool.BindHandle), the CLR allocates an available thread as soon as possible to continue the task.
In fact, writing IOCP code directly against the Windows API is very complex. In .NET, due to the need to conform to the standard APM (Asynchronous Programming Model), we gain convenience but give up some control. As a result, when very high throughput is truly required, such as when writing a server, many developers choose to write the code in native code instead. But in the vast majority of cases, .NET's IOCP-based asynchronous IO is enough to obtain very good performance. Using asynchronous IO via the APM in .NET is straightforward, for example:
```csharp
static void Main(string[] args)
{
    WebRequest request = HttpWebRequest.Create("http://www.cnblogs.com");
    request.BeginGetResponse(HandleAsyncCallback, request);
}

static void HandleAsyncCallback(IAsyncResult ar)
{
    WebRequest request = (WebRequest)ar.AsyncState;
    WebResponse response = request.EndGetResponse(ar);
    // more operations...
}
```
BeginGetResponse initiates an asynchronous IO operation through IOCP and invokes the HandleAsyncCallback callback function at the end. So, on which thread does this callback execute? Yes, on a thread of the legendary IO thread pool. .NET prepares two thread pools in each process: besides the CLR thread pool mentioned in the previous article, it also prepares an IO thread pool for the callbacks of asynchronous IO operations. The IO thread pool is similar in nature to the CLR thread pool: it creates and destroys threads dynamically, and it also has maximum and minimum sizes (you can refer to the APIs listed in the previous article).
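The two pools are in fact exposed through the same ThreadPool class: in each of the sizing methods, the first out-parameter refers to the worker (CLR) pool and the second to the IO (completion-port) pool. A short sketch:

```csharp
using System;
using System.Threading;

// Querying both pools through the ThreadPool class. The second out-parameter
// of each method is the IO (completion-port) thread pool.
public static class PoolSizes
{
    public static void Print()
    {
        int worker, io;

        ThreadPool.GetMaxThreads(out worker, out io);
        Console.WriteLine("Max:       {0} worker, {1} IO", worker, io);

        ThreadPool.GetMinThreads(out worker, out io);
        Console.WriteLine("Min:       {0} worker, {1} IO", worker, io);

        ThreadPool.GetAvailableThreads(out worker, out io);
        Console.WriteLine("Available: {0} worker, {1} IO", worker, io);
    }
}
```

SetMaxThreads and SetMinThreads take the two counts the same way, so the IO pool can be tuned (but, as the text notes, not split into multiple independent pools).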
Unfortunately, the IO thread pool is also "one per process", so the drawbacks of the CLR thread pool apply to the IO thread pool as well. For example, after reading a piece of text with asynchronous IO, the next step is often to parse it, which is a compute-intensive operation; but if that computation runs on the shared IO thread pool, we cannot effectively control the computing power given to any particular task. So in some cases we hand the computation back to an independent thread pool from within the callback function. In theory this adds thread-scheduling overhead, but the actual impact depends on measurement. If it really becomes one of the key factors limiting performance, we may need to call the IOCP-related APIs from native code and hand the callback tasks directly to an independent thread pool for execution.
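The hand-off pattern can be sketched as follows. To keep the example self-contained I read from a file rather than a WebRequest, and I use a single dedicated thread to stand in for the independent pool; the class and method names are illustrative:

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

// Sketch: the IO callback does as little as possible and hands the
// compute-intensive part (here, a trivial "analysis") to a thread under our
// own control, releasing the IO pool thread immediately.
public static class ReadThenAnalyze
{
    public static int Run(string path)
    {
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                    FileShare.Read, 4096, true /* async */);
        var buffer = new byte[4096];
        int result = 0;
        var done = new ManualResetEvent(false);

        stream.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            int bytesRead = stream.EndRead(ar);   // runs on an IO pool thread
            stream.Close();

            // Hand the compute-intensive part off; in a real system this
            // would be an independent thread pool, not a raw thread.
            var analyzer = new Thread(() =>
            {
                string text = Encoding.UTF8.GetString(buffer, 0, bytesRead);
                result = text.Length;             // stand-in for real parsing
                done.Set();
            });
            analyzer.Start();
        }, null);

        done.WaitOne();
        return result;
    }
}
```

The IO pool thread here only calls EndRead and queues the follow-up work; how much computing power the analysis gets is then decided entirely by the pool we hand it to.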
We can also manipulate the IO thread pool in code; for example, the following method submits a task directly to the IO thread pool:
```csharp
public static class ThreadPool
{
    public static bool UnsafeQueueNativeOverlapped(NativeOverlapped* overlapped);
}
```
The NativeOverlapped structure carries an IOCompletionCallback callback function and a buffer object, and can be created from an Overlapped object. The NativeOverlapped occupies "pinned" memory, where "pinned" means its address will not be changed by the GC, and it will not even be paged out to swap space on disk. The purpose of this is to satisfy IOCP's requirements, but it clearly also degrades program performance. Therefore, we will hardly ever use this method in actual programming (note 3).
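For completeness, here is a hedged sketch of what using this method looks like (the class name is mine, and the code must be compiled with /unsafe): Overlapped.Pack produces the pinned NativeOverlapped*, UnsafeQueueNativeOverlapped posts it to the IOCP, and the callback runs on an IO pool thread.

```csharp
using System;
using System.Threading;

// Sketch of queuing work to the IO thread pool via NativeOverlapped.
// Requires an unsafe compilation context; Overlapped.Free must be called
// to release the pinned memory, or it leaks.
public static class NativeOverlappedDemo
{
    private static readonly ManualResetEvent Done = new ManualResetEvent(false);

    public static unsafe bool Run()
    {
        var overlapped = new Overlapped();
        // Pack pins the NativeOverlapped so the GC will not move it.
        NativeOverlapped* pOverlapped = overlapped.Pack(Callback, null);
        ThreadPool.UnsafeQueueNativeOverlapped(pOverlapped);
        return Done.WaitOne(5000); // true if the callback ran
    }

    private static unsafe void Callback(uint errorCode, uint numBytes,
                                        NativeOverlapped* pOv)
    {
        // This runs on an IO thread pool thread.
        try { }
        finally
        {
            Overlapped.Free(pOv); // always release the pinned memory
            Done.Set();
        }
    }
}
```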
Related articles
- Discussion on Thread Pool (Part 1): The Role of the Thread Pool and the CLR Thread Pool
- Discussion on Thread Pool (Part 2): Independent Thread Pools and the IO Thread Pool
- Discussion on Thread Pool (Part 3): Related Tests and Precautions
Note 1: Unless otherwise stated, the operating systems discussed here are Windows XP and later versions.
Note 2: The time slice is also known as the "quantum", and its value differs between operating systems. In the Windows client operating systems (XP, Vista) the time slice defaults to 2 clock intervals, while in the server operating systems (2003, 2008) it defaults to 12 clock intervals (on mainstream systems one clock interval is about 10 to 15 milliseconds). Server operating systems use a longer time slice because a typical server runs fewer programs than a client machine and cares more about performance and throughput, while a client system cares more about responsiveness. If you really need to, the length of the time slice can be adjusted.
Note 3: However, if a single NativeOverlapped object is reused many times in a program, the performance of this method is slightly better than QueueUserWorkItem, and it is said this technique is used inside WCF. There are always some tricks inside Microsoft that outsiders do not get to use. For example, Lao Zhao remembers that while reading the ASP.NET AJAX source code, he accidentally found an interface whose MSDN description was "Reserved method, please do not use externally." What can we do about that?