There are two important concepts in an operating system: the process and the thread. Let's start with the process. Broadly speaking, a process represents data resources and corresponds to memory: the OS creates the illusion that the entire computer contains only the OS and the current application.
So memory is virtualized. What about the CPU? After all, the number of CPU cores in a machine is limited; most machines have two or four. But you may want to run a lot of programs. What then? The OS uses a scheduling algorithm to let your program run for a while, then lets another program run for a while. As long as the switching is fast enough, it feels as if all programs are running at the same time. This "you run for a while, then someone else runs for a while" behavior needs an abstraction; otherwise you would have to split your code into segments yourself. That abstraction is the thread. Your code is executed by a thread, which makes it feel as if your code owns the entire CPU, although in reality the CPU is switching frequently between different threads.
We know that to build responsive desktop software, time-consuming operations such as accessing a remote database or reading large files must run on a thread other than the main UI thread. Otherwise the main thread is blocked, the UI hangs, and the user thinks the program has frozen. Seen this way, Thread looks like a cure-all that also makes full use of CPU resources. Example:
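To make the time-slicing idea concrete, here is a minimal sketch that deliberately starts more threads than the machine has logical cores and waits for all of them to finish. The class name TimeSlicingDemo and the method RunCounters are hypothetical names chosen for illustration; they are not from any library.

```csharp
using System;
using System.Threading;

public static class TimeSlicingDemo
{
    // Hypothetical helper: starts threadCount threads that each count to
    // 'iterations' and returns the per-thread counters once all have finished.
    public static long[] RunCounters(int threadCount, long iterations)
    {
        var counters = new long[threadCount];
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            int slot = i; // copy the loop variable so each lambda captures its own index
            threads[i] = new Thread(() =>
            {
                long local = 0;
                for (long n = 0; n < iterations; n++) local++;
                counters[slot] = local;
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join(); // wait for every thread to finish
        return counters;
    }

    public static void Main()
    {
        int cores = Environment.ProcessorCount;
        int threadCount = cores * 4; // deliberately more threads than cores
        long[] counters = RunCounters(threadCount, 1_000_000);
        Console.WriteLine($"{cores} logical cores ran {counters.Length} threads to completion.");
    }
}
```

Every thread finishes even though only a few can run at any instant, because the scheduler keeps handing each of them a turn on a core.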
    using System;
    using System.Threading;

    public void btn_Click(object sender, EventArgs e)
    {
        // Run the slow database call on its own thread so the UI thread is not blocked.
        ThreadStart start = new ThreadStart(GetBlogsFromDB);
        Thread thread = new Thread(start);
        thread.Start();
    }

    public void GetBlogsFromDB()
    {
        // ... access remote database
    }
Okay, so we have boarded the multithreading express, and the UI responds quickly. Even while we access a remote database, the UI stays responsive and the mouse cursor does not turn into a spinning wait cursor. But before we celebrate, let's take a look at the overhead of creating a thread:
1. Thread kernel object. The operating system allocates one of these data structures for each thread. It also holds the thread context, which contains the values of the CPU registers. The thread context is about 700 bytes on x86, 1240 bytes on x64, and 2500 bytes on IA64.
2. Thread environment block (TEB). The TEB occupies one page of memory (4 KB on x86 and x64, 8 KB on IA64) and is initialized in user mode. It contains the head of the thread's exception-handling chain: every time the thread enters a try block, a node is inserted at the head of the chain, and the node is removed when the thread leaves the try block. The TEB also contains thread-local storage and some data structures used by GDI.
3. User-mode stack, also called the thread stack. It holds the arguments passed to methods, the local variables of those methods, and their return addresses. By default, Windows allocates a 1 MB stack to each thread. (This is why a badly written recursive method causes a stack overflow: with each level of recursion, the memory used by method arguments, local variables, and return addresses keeps growing, and once it exceeds 1 MB the stack overflows.) Managed and unmanaged code differ here. In unmanaged code such as C/C++, when a thread is created Windows only reserves 1 MB of address space and commits physical memory as the thread actually needs it. In managed code, the memory is committed as soon as the thread is created, so the full 1 MB of physical memory is allocated up front.
4. Kernel-mode stack. Application code often needs to call kernel-mode functions of the operating system. For security reasons, the OS copies the call's arguments from the user-mode stack to the kernel-mode stack and then validates them. In addition, kernel-mode functions call one another, and they rely on this stack to hold their local variables, arguments, and return addresses. This stack occupies 12 KB on 32-bit systems and 24 KB on 64-bit systems.
5. DLL thread-attach and thread-detach notifications. When a new thread is created in a process, the DllMain function of every DLL loaded in the process is called with the DLL_THREAD_ATTACH flag. The same happens when a thread dies, this time with DLL_THREAD_DETACH. However, DLLs written in C# and most other managed languages do not receive this notification, because they have no DllMain function.
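The thread stack in item 3 is the one cost in this list that managed code can tune directly. The sketch below uses the Thread constructor overload that takes a maxStackSize argument to request a 256 KB stack instead of the default. The class StackSizeDemo and its methods are hypothetical names for illustration, and the OS may round or clamp the requested size, so treat the number as a request rather than a guarantee.

```csharp
using System;
using System.Threading;

public static class StackSizeDemo
{
    // Recursive on purpose: every call pushes another frame
    // (argument and return address) onto the thread stack.
    public static int Depth(int n)
    {
        return n == 0 ? 0 : 1 + Depth(n - 1);
    }

    // Hypothetical helper: runs Depth(n) on a new thread created with the
    // requested maximum stack size in bytes and returns the result.
    public static int DepthOnThread(int n, int maxStackSizeBytes)
    {
        int result = -1;
        var thread = new Thread(() => result = Depth(n), maxStackSizeBytes);
        thread.Start();
        thread.Join(); // the result is only read after the thread has finished
        return result;
    }

    public static void Main()
    {
        // Request 256 KB instead of the default; a shallow recursion still fits.
        Console.WriteLine(DepthOnThread(500, 256 * 1024));
    }
}
```

A smaller stack lowers the per-thread memory cost, but it also lowers the recursion depth at which the thread overflows, so this is a trade-off rather than a free saving.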
The list above covers the overhead of creating and destroying a thread, and it shows that thread creation is expensive. But that is not the worst of it. As mentioned earlier, because CPUs are limited, the OS only creates the illusion that all programs run at the same time by switching threads frequently. This thread switch is the famous context switch. At any given moment, Windows assigns one thread to a CPU and lets it run for a time slice. When the time slice expires, the next thread runs. So that a thread can later resume exactly where it stopped, each switch requires some extra work:
1. Save the values of the CPU registers into the context structure inside the kernel object of the currently running thread.
2. Select the next thread to run. If that thread belongs to another process, Windows must first switch the virtual address space.
3. Load the context structure of the selected thread into the CPU registers.
In fact, these three steps are not the worst part. Once the CPU switches to another thread, the code and data of the previous thread sitting in the CPU cache are of little use to the new thread, which has to fetch its own code and data from main memory.
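One rough way to get a feel for this cost is to bounce control back and forth between two threads, so that every hand-off blocks one thread and wakes the other. The sketch below does that with two AutoResetEvent objects and a Stopwatch. PingPongDemo and Run are hypothetical names for illustration. The timing also includes event signaling overhead, and on a multi-core machine the two threads may sit on different cores, so read the number as an indication, not as a precise context-switch measurement.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class PingPongDemo
{
    // Hypothetical helper: bounces control between two threads roundTrips times
    // and returns how many hand-offs the partner thread observed.
    public static int Run(int roundTrips, out TimeSpan elapsed)
    {
        using (var ping = new AutoResetEvent(false))
        using (var pong = new AutoResetEvent(false))
        {
            int handOffs = 0;
            var partner = new Thread(() =>
            {
                for (int i = 0; i < roundTrips; i++)
                {
                    ping.WaitOne(); // block until the main thread signals
                    handOffs++;
                    pong.Set();     // wake the main thread again
                }
            });
            partner.Start();

            var watch = Stopwatch.StartNew();
            for (int i = 0; i < roundTrips; i++)
            {
                ping.Set();     // hand control to the partner thread
                pong.WaitOne(); // block until the partner hands it back
            }
            watch.Stop();
            partner.Join();

            elapsed = watch.Elapsed;
            return handOffs;
        }
    }

    public static void Main()
    {
        int trips = Run(100_000, out TimeSpan elapsed);
        Console.WriteLine($"{trips} round trips took {elapsed.TotalMilliseconds} ms");
    }
}
```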
This creates a dilemma. On the one hand, we need threads to build a more robust and more responsive interface. On the other hand, creating a thread is expensive, and once many threads exist, switching between them adds the overhead of context switching.
We met the same dilemma with database access:
A database connection is an expensive resource to hold, so we open a connection, access the data, and close it immediately. But creating a connection is also very expensive. That is how the database connection pool came about. Here, likewise, we have the thread pool to rescue us. Well, that's the end.
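As a minimal sketch of the thread pool idea, the code below queues several small work items with ThreadPool.QueueUserWorkItem instead of creating one Thread per job, and uses a CountdownEvent to wait for them. ThreadPoolDemo and RunJobs are hypothetical names for illustration, and the work item body is only a placeholder for something like the database call in GetBlogsFromDB.

```csharp
using System;
using System.Threading;

public static class ThreadPoolDemo
{
    // Hypothetical helper: queues jobCount small work items to the shared
    // thread pool and returns how many of them completed.
    public static int RunJobs(int jobCount)
    {
        int completed = 0;
        using (var done = new CountdownEvent(jobCount))
        {
            for (int i = 0; i < jobCount; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    // Placeholder for real work, such as a remote database call.
                    Interlocked.Increment(ref completed);
                    done.Signal();
                });
            }
            done.Wait(); // block until every queued item has signaled
        }
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine($"{RunJobs(100)} work items completed on pool threads.");
    }
}
```

The pool reuses a small set of existing threads for all the work items, so the creation and destruction costs described earlier are paid only occasionally instead of once per job.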
Postscript
Recently I have become very interested in parallel and asynchronous programming. Unfortunately, I have not mastered the underlying computer science well enough, so I am reviewing it here. If you spot any mistakes, please do not hesitate to correct me, or share some papers on parallel and asynchronous programming for me to read. Thank you ~~