Multithreaded programming under Windows (i)


Objective

Mastering multithreaded programming under Windows lets us write better-structured multithreaded code and avoid subtle errors. Multithreaded programming on Windows is complex, but understanding a few common features is enough to meet the performance and other requirements of typical multithreaded programs.

Processes and threads

1. The concept of a process

A process is a running program. It consists mainly of two parts:

• One is the kernel object that the operating system uses to manage the process. The kernel object is also where the system keeps statistical information about the process.

• The other is the address space, which contains the code and data of all executable and DLL modules, as well as dynamically allocated memory such as thread stacks and heap allocations.

2. Threading Concepts

A thread describes an execution path within a process, that is, a path of execution through the process's code. A process has at least one main thread and can have many threads. Threads share all the resources of their process. A thread consists mainly of two parts:

• One is the thread's kernel object, which the operating system uses to manage the thread. The kernel object is also where the system keeps the thread's statistical information.

• The other is the thread stack, which holds all the function parameters and local variables the thread needs while it executes code.

3. The pros and cons of processes and threads

Processes use more system resources because each process requires its own address space, whereas a thread needs only a kernel object and a stack. If address-space usage and runtime efficiency are the main concerns, multithreading is usually preferred. However, because each process has its own independent address space, processes are isolated from one another, while all threads in a process share that process's address space, so a problem in one thread can affect every other thread. A single-process, multi-tab browser, for example, can become completely unusable when one tab hangs. That is why browsers such as 360 Browser run each tab in its own process: a problem in one tab page does not affect the other tabs.

4. How many threads can a process create

On 32-bit Windows, a process has a 4 GB linear address space. The lower 2 GB is the application (user) space, which is private to each process, and the upper 2 GB is the system kernel space, which is fully shared. So the maximum memory available to a process is 2 GB. The default stack size of each thread is 1 MB, so in theory a process can create at most 2048 threads; in practice other allocations also occupy the address space, so the total number of threads that can be created is roughly 2000. Of course, if you want to create more threads, you can reduce the thread stack size.

Thread-related functions

1. Creating and terminating threads

Thread creation API

HANDLE CreateThread(
    LPSECURITY_ATTRIBUTES  lpThreadAttributes,
    SIZE_T                 dwStackSize,
    LPTHREAD_START_ROUTINE lpStartAddress,
    LPVOID                 lpParameter,
    DWORD                  dwCreationFlags,
    LPDWORD                lpThreadId);

• lpThreadAttributes: a pointer to a SECURITY_ATTRIBUTES structure describing the thread's security attributes; pass NULL for the default.

• dwStackSize: the stack size; pass 0 for the default of 1 MB.

• lpStartAddress: the entry address of the thread function.

• lpParameter: the parameter passed to the thread function.

• dwCreationFlags: the state in which the thread is created. 0 means the thread starts running immediately after creation; CREATE_SUSPENDED means the thread is created suspended and does not run until ResumeThread is called.

• lpThreadId: points to a variable that receives the thread ID; it can be NULL (see the usage sketch after this list).
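As a quick illustration of these parameters, here is a minimal sketch of creating a worker thread with CreateThread and waiting for it to finish; the function and variable names are only examples, not part of any particular codebase.

#include <windows.h>
#include <stdio.h>

// Example worker: the name and body are illustrative only.
DWORD WINAPI WorkerProc(LPVOID lpParameter)
{
    int* pValue = (int*)lpParameter;
    printf("worker received %d\n", *pValue);
    return 0;
}

int main()
{
    int arg = 42;
    DWORD threadId = 0;

    // Default security attributes, default stack size, run immediately.
    HANDLE hThread = CreateThread(NULL, 0, WorkerProc, &arg, 0, &threadId);
    if (hThread == NULL)
        return 1;

    // Wait for the worker to finish, then release the handle.
    WaitForSingleObject(hThread, INFINITE);
    CloseHandle(hThread);
    return 0;
}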

Thread Termination API

VOID ExitThread(DWORD dwExitCode);

This function forcibly terminates the calling thread and causes the system to clean up all the operating system resources the thread was using. However, C++ objects may not be released correctly because their destructors are not called. The thread's exit code can be retrieved later with the GetExitCodeThread() function. Terminating a thread with this function is not recommended, because resources may not be released correctly; it is better to let the thread function return normally. If you do need to end a thread from within, and the thread was created with _beginthreadex, use _endthreadex (not _endthread), because it handles the C runtime's per-thread resources correctly.

BOOL TerminateThread(HANDLE hThread, DWORD dwExitCode);

This function also forcibly terminates a thread, but it is asynchronous: it tells the system to terminate the specified thread, yet there is no guarantee that the thread has terminated by the time the function returns. The caller must therefore use WaitForSingleObject to determine whether the thread has actually terminated. In addition, the stack of a thread terminated through this function is not freed. Using this function is generally not recommended.

2. Thread Safety

There is no single formal definition of thread safety; informally, it means that operations performed from multiple threads behave correctly. The main objects considered here are variables, functions, and class objects.

Thread-Safe variables

The variables discussed here are global or static variables of non-class types, or variables passed in through the thread parameter.

• If all threads only read the variable, the variable is thread-safe.

• If one thread writes the variable and other threads read it, volatile needs to be considered. When a thread's code reads a variable several times, the compiler may optimize the code so that the value is read from memory only the first time and from a register afterwards. If another thread then updates the variable, the reading thread may still see the stale value in the register. The volatile keyword tells the compiler not to apply this optimization and to always read the variable from memory. This may be slightly less efficient, but it keeps reads of the variable correct across threads (see the sketch after this list).

• When multiple threads write the variable at the same time, synchronization such as a critical section or a reader-writer lock must be used.
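A minimal sketch of the one-writer/one-reader case above, assuming an illustrative stop flag g_bStop and a polling worker:

#include <windows.h>

// Shared stop flag: written by one thread, read by another.
volatile BOOL g_bStop = FALSE;

DWORD WINAPI PollingWorker(LPVOID lpParameter)
{
    // Because g_bStop is volatile, each iteration re-reads it from memory,
    // so the write from the controlling thread is eventually observed.
    while (!g_bStop)
    {
        Sleep(10); // do some periodic work here
    }
    return 0;
}

void StopWorker()
{
    g_bStop = TRUE; // single writer; the worker only reads
}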

Thread-Safe functions

Before multithreading existed, the C/C++ runtime library was not necessarily thread-safe. For example, GetLastError() returns a value that, if it were kept in a single global variable, would be wrong once multiple threads are involved. To address this, Microsoft provides a multithreaded version of the C/C++ runtime library, which must be used together with the corresponding thread-creation functions.

• _beginthreadex

_beginthread is not recommended because it is an early, immature function: it closes the thread handle as soon as the thread finishes, so the thread cannot be controlled effectively. The C/C++ runtime function _beginthreadex is a wrapper around the operating system function CreateThread; it uses thread-local storage (TLS) so that each thread gets its own copy of state that would otherwise be shared, such as the value behind GetLastError() or errno. This helps ensure that the runtime library functions each thread calls are thread-safe.
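A minimal sketch of creating a thread with _beginthreadex; the worker name is an assumption, and note that its signature (unsigned __stdcall, void*) differs slightly from CreateThread's thread function:

#include <windows.h>
#include <process.h>   // _beginthreadex, _endthreadex

// Worker for _beginthreadex: must use __stdcall and return unsigned.
unsigned __stdcall CrtWorker(void* pArg)
{
    // ... work that may safely call C runtime functions ...
    return 0;
}

int StartCrtWorker()
{
    unsigned threadId = 0;
    HANDLE hThread = (HANDLE)_beginthreadex(
        NULL,        // default security attributes
        0,           // default stack size
        CrtWorker,   // thread function
        NULL,        // parameter
        0,           // run immediately (or CREATE_SUSPENDED)
        &threadId);

    if (hThread == NULL)
        return -1;

    WaitForSingleObject(hThread, INFINITE);
    CloseHandle(hThread);  // unlike _beginthread, the handle is not closed automatically
    return 0;
}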

• AfxBeginThread

If the current code is based on the MFC library, threads should be created with the MFC function AfxBeginThread. This is because the MFC library is a further wrapper around the C/C++ runtime library and also has some thread-unsafe state of its own. AfxBeginThread is essentially a wrapper around _beginthreadex that performs some MFC module-state setup before calling _beginthreadex, so that calls into MFC library functions from the new thread are safe.
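A minimal sketch of the MFC variant, assuming an MFC project where afxwin.h is available; the worker name is an assumption, and MFC worker threads use the UINT (LPVOID) signature:

#include <afxwin.h>   // MFC core (assumes an MFC project)

// MFC worker thread: UINT return, LPVOID parameter.
UINT MfcWorker(LPVOID pParam)
{
    // ... work that may use MFC objects created in this thread ...
    return 0;
}

void StartMfcWorker()
{
    // Creates and starts the thread with default priority and stack size;
    // MFC owns the returned CWinThread object.
    CWinThread* pThread = AfxBeginThread(MfcWorker, NULL);
    if (pThread == NULL)
    {
        // creation failed
    }
}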

Thread-Safe classes

Apart from the C++ runtime library and the MFC library, which already handle thread safety internally, other third-party libraries, and even the STL, are not thread-safe in general. For these, and for your own classes, thread safety must be handled by the user, for example with locks and other synchronization kernel objects, or with TLS.

3. Pausing and resuming a thread

Inside a thread's kernel object there is a value that records the thread's suspend count. When CreateThread is called, the thread's kernel object is created and its suspend count is initialized to 1, so the operating system does not yet allocate time slices to the thread. If the thread is created with the CREATE_SUSPENDED flag, it remains suspended after creation; at this point you can perform additional initialization on it, such as setting its priority. Once initialization is complete, call ResumeThread to let it run. A thread can be suspended multiple times: if it has been suspended 3 times, ResumeThread must be called 3 times before the thread can receive time slices again.

Besides creating a thread with CREATE_SUSPENDED, you can also call SuspendThread to suspend a running thread. When calling SuspendThread you do not know what the target thread is currently doing; if it is in the middle of a memory allocation or holding a lock, suspending it may deadlock other threads. Therefore, when using SuspendThread, take extra precautions to avoid these problems.
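A minimal sketch of the create-suspended pattern described above, assuming a worker such as the WorkerProc from the earlier sketch:

#include <windows.h>

// Assumes a worker such as: DWORD WINAPI WorkerProc(LPVOID);
extern DWORD WINAPI WorkerProc(LPVOID lpParameter);

HANDLE StartSuspendedWorker()
{
    // Create the thread suspended so it does not run yet.
    HANDLE hThread = CreateThread(NULL, 0, WorkerProc, NULL,
                                  CREATE_SUSPENDED, NULL);
    if (hThread == NULL)
        return NULL;

    // Extra initialization while the thread is suspended,
    // e.g. lowering its priority.
    SetThreadPriority(hThread, THREAD_PRIORITY_BELOW_NORMAL);

    // Each ResumeThread decrements the suspend count;
    // the thread runs once the count reaches 0.
    ResumeThread(hThread);
    return hThread;
}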

User mode and kernel mode

The processor in a computer running Windows has two different modes: user mode and kernel mode. The processor switches between the two modes depending on the type of code it is running. Applications run in user mode, and the core operating system components run in kernel mode. Many drivers run in kernel mode, but some drivers run in user mode.

1. User mode

When you start a user-mode application, Windows creates a process for it. The process provides the application with a private virtual address space and a private handle table. Because the application's virtual address space is private, one application cannot change data that belongs to another application. Each application runs in isolation, so if an application is corrupted, the damage is limited to that application; other applications and the operating system are not affected.

Besides being private, the virtual address space of a user-mode application is also limited: a processor running in user mode cannot access the virtual addresses reserved for the operating system. Limiting the virtual address space of user-mode applications prevents them from changing, and possibly damaging, critical operating system data.

2. Kernel mode

Kernel mode implements the low-level services of the operating system, such as thread scheduling, multiprocessor synchronization, and interrupt/exception handling.

3. Kernel objects

As the name implies, kernel objects are objects created by the kernel. Because a kernel object's data structure can only be accessed by the kernel, an application cannot locate or modify its contents in memory directly. Since the kernel is needed to create such an object, the call has to switch from user mode to kernel mode, and that switch takes on the order of hundreds of clock cycles. The system can create and manipulate several types of kernel objects, such as access token objects, event objects, file objects, file-mapping objects, I/O completion port objects, job objects, mailslot objects, mutex objects, pipe objects, process objects, semaphore objects, thread objects, and waitable timer objects. Kernel objects can be shared across processes, so they can be used for inter-process communication.
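As a minimal sketch of the cross-process point, assuming two cooperating programs that agree on an event name (the name used here is only an example): one process creates a named event kernel object, the other opens it by name and waits on it.

#include <windows.h>

// Process A: create a named, manual-reset event and signal it later.
HANDLE CreateSharedEvent()
{
    HANDLE hEvent = CreateEventW(NULL, TRUE, FALSE, L"Local\\ExampleReadyEvent");
    return hEvent;   // call SetEvent(hEvent) when ready
}

// Process B: open the same event by name and wait for the signal.
BOOL WaitForSharedEvent()
{
    HANDLE hEvent = OpenEventW(SYNCHRONIZE, FALSE, L"Local\\ExampleReadyEvent");
    if (hEvent == NULL)
        return FALSE;

    DWORD result = WaitForSingleObject(hEvent, INFINITE);
    CloseHandle(hEvent);
    return result == WAIT_OBJECT_0;
}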

Time slices and atomic operations

1. Time slices

Early CPUs were single-core, so true parallel multithreading was impossible. A time slice is one of the roughly equal-length intervals into which the operating system divides CPU time. Multithreading works mainly by the operating system constantly switching the CPU between threads at the end of each slice, so that the threads run alternately; because each slice is very short, to the user it looks as if several threads are running at the same time. Today's CPUs are multi-core and multi-threaded and can run threads truly in parallel. You can use SetThreadAffinityMask to make threads run on different CPUs.

When one thread performs a large amount of computation, it can easily drive CPU usage high while other threads get few or no time slices. Calling Sleep(0) tells the operating system to reschedule: a ready thread of the same priority may then be given the rest of the time slice, which keeps the compute-heavy thread from monopolizing CPU time.
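A minimal sketch of both ideas, with an illustrative loop and an affinity mask chosen only as an example: a compute-heavy worker yields occasionally with Sleep(0) and is pinned to CPU 0 with SetThreadAffinityMask.

#include <windows.h>

DWORD WINAPI ComputeWorker(LPVOID lpParameter)
{
    for (int i = 0; i < 1000000; ++i)
    {
        // ... heavy computation for this iteration ...

        // Periodically give up the rest of the time slice so that
        // other ready threads of the same priority can run.
        if (i % 1000 == 0)
            Sleep(0);
    }
    return 0;
}

void StartPinnedWorker()
{
    HANDLE hThread = CreateThread(NULL, 0, ComputeWorker, NULL, 0, NULL);
    if (hThread != NULL)
    {
        // Bit 0 set: allow this thread to run only on CPU 0.
        SetThreadAffinityMask(hThread, 1);
        CloseHandle(hThread);   // the thread keeps running; the handle is no longer needed
    }
}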

2. Atomic operation

Thread synchronization problems are largely about atomic access: a thread must be able to access a resource with the guarantee that no other thread accesses the same resource at the same time.

For example:

int g_nVal = 0;

DWORD WINAPI ThreadFun1(LPVOID pParam)
{
    g_nVal++;
    return 0;
}

DWORD WINAPI ThreadFun2(LPVOID pParam)
{
    g_nVal++;
    return 0;
}

Because g_nVal++ first loads the value from memory into a register, performs the addition, and then writes it back, and because thread scheduling is not controllable, both threads may read 0 from memory, so the result of the two increments is 1. This is not the 2 we actually wanted. To avoid this, an atomic operation such as InterlockedExchangeAdd(&g_nVal, 1) is required. While an Interlocked function operates on a memory address, it prevents other CPUs from accessing that address at the same time.
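A minimal corrected sketch of the example above, with the counter declared as a LONG so it can be passed to the Interlocked functions:

#include <windows.h>

volatile LONG g_nVal = 0;

DWORD WINAPI ThreadFun1(LPVOID pParam)
{
    // Atomically adds 1 and returns the previous value.
    InterlockedExchangeAdd(&g_nVal, 1);
    return 0;
}

DWORD WINAPI ThreadFun2(LPVOID pParam)
{
    // InterlockedIncrement is an equivalent shorthand for adding 1.
    InterlockedIncrement(&g_nVal);
    return 0;
}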

InterlockedExchange / InterlockedExchangePointer: the former exchanges a 32-bit value, the latter a pointer-sized value. Both atomically store the specified value and return the original value, so they can be used as follows.

// Shared flag used as a simple spin lock (FALSE = unlocked).
volatile LONG g_bVal = FALSE;

void Fun()
{
    // Spin until the flag is atomically changed from FALSE to TRUE.
    while (InterlockedExchange(&g_bVal, TRUE) == TRUE)
        Sleep(0);

    // Do something (the protected section)

    InterlockedExchange(&g_bVal, FALSE);   // release the "lock"
}

The code above achieves the effect of a lock. Atomic operations do not require switching to kernel mode, so they are fast, but this code still has to loop until the flag becomes available. A critical section, like atomic operations, works mostly in user mode, and a thread waiting on a critical section is simply not allocated CPU time slices while it waits, instead of spinning. So a critical section is usually more efficient.
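A minimal sketch of the same pattern using a critical section; the variable names are illustrative:

#include <windows.h>

CRITICAL_SECTION g_cs;   // must be initialized once before use

void InitLock()
{
    InitializeCriticalSection(&g_cs);
}

void Fun()
{
    EnterCriticalSection(&g_cs);   // waits without busy-looping if contended

    // Do something (the protected section)

    LeaveCriticalSection(&g_cs);
}

void DestroyLock()
{
    DeleteCriticalSection(&g_cs);
}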

Thread pool

When threads are created frequently, creating and destroying large numbers of threads consumes a lot of resources and hurts efficiency. In that case, consider using a thread pool. The main idea of a thread pool is that a thread which has finished its work is not destroyed immediately but is added to a list of idle threads. When a new task needs a thread, the pool first checks the idle list; if an idle thread is available it is used directly, and only otherwise is a new thread created. This greatly reduces how often threads are created and destroyed.
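Windows also exposes a built-in thread pool; a minimal sketch that hands a task to it with QueueUserWorkItem (the work function and task type are assumptions):

#include <windows.h>

// Work item executed on a thread-pool thread.
DWORD WINAPI PoolWork(LPVOID lpContext)
{
    int* pTask = (int*)lpContext;
    // ... process the task ...
    (void)pTask;
    return 0;
}

void SubmitTask(int* pTask)
{
    // The system picks (or creates) a pool thread to run PoolWork;
    // there is no thread handle to manage or close.
    if (!QueueUserWorkItem(PoolWork, pTask, WT_EXECUTEDEFAULT))
    {
        // submission failed; handle the error
    }
}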

Coroutines

Languages such as Python and Lua provide coroutines. Lua in particular relies heavily on coroutines because it has no native multithreading, and they are part of what makes it a good scripting language. For other languages, coroutine libraries are available from third parties. Windows multithreading is provided by the kernel, so creating a thread requires switching to kernel mode, and switching from user mode to kernel mode takes hundreds of clock cycles. A lightweight kind of "multithreading" provided directly in user mode is, in fact, the coroutine. Concretely: function A calls coroutine function B; B runs to, say, its 5th line and yields back to A, which continues and calls some other function C; the next time B is invoked, it resumes from that 5th line. It looks as though B ran partway, was interrupted so that C could run, and then continued from where it left off. This is a primitive form of multithreading that actually uses synchronous code to achieve an asynchronous effect. The main way to implement this in C++ is to save the function's register context and stack when it yields; the next time the function runs, the register context and stack are restored first and execution resumes from the previous point. If you need large-scale concurrency and do not want to create many threads, consider using coroutines.
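On Windows, the fiber API (ConvertThreadToFiber, CreateFiber, SwitchToFiber) provides this kind of user-mode cooperative switching in C/C++. The text above does not mention fibers by name, so this is only an illustrative sketch of the idea:

#include <windows.h>
#include <stdio.h>

LPVOID g_mainFiber = NULL;   // fiber context of the caller ("function A")

// "Function B": runs partway, yields back to the caller, then resumes.
VOID CALLBACK FiberB(LPVOID lpParameter)
{
    printf("B: part 1\n");
    SwitchToFiber(g_mainFiber);   // yield back to A
    printf("B: part 2 (resumed where it left off)\n");
    SwitchToFiber(g_mainFiber);   // a fiber routine must not simply return
}

int main()
{
    g_mainFiber = ConvertThreadToFiber(NULL);          // A becomes a fiber
    LPVOID fiberB = CreateFiber(0, FiberB, NULL);      // default stack size

    SwitchToFiber(fiberB);   // run B until it yields
    printf("A: doing other work (like calling C)\n");
    SwitchToFiber(fiberB);   // resume B from where it stopped

    DeleteFiber(fiberB);
    return 0;
}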
