Runtime Library and Multithreading

Source: Internet
Author: User
Tags exception handling mathematical functions memory usage mixed strtok versions wrapper
11.3 Runtime Library and Multithreading

11.3.1 Multithreading problems of the CRT

Thread access permissions

A thread has very broad access rights: it can access all data in the process's memory, even the stacks of other threads (if it knows their addresses, which is rare). In practice, however, each thread also has its own private storage, including:

• The stack (not completely inaccessible to other threads, but it can generally be regarded as private data).

• Thread-local storage (TLS). Thread-local storage is private space that some operating systems provide separately for each thread, usually with a very limited size.

• Registers (including the PC register). Registers are the basic state of an execution flow and are therefore private to the thread.

From a C programmer's point of view, whether data is private to a thread is shown in Table 11-3.

Table 11-3

Thread private:

• Local variables

• Function parameters

• TLS data

Shared between threads (process-owned):

• Global variables

• Data on the heap

• Static variables in a function

• Program code: any thread has the right to read and execute any code

• Open files: a file opened by thread A can be read and written by thread B
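The split in Table 11-3 can be checked with a small experiment. The following is only a sketch, assuming a POSIX system with pthreads; the names shared_global, writer and shared_data_demo are made up for this illustration and do not come from the original text.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

int shared_global = 0;              /* process-wide: visible to every thread */

static void *writer(void *heap_cell)
{
    int local = 100;                /* lives on the worker's own stack */
    local += 1;
    shared_global = local;          /* a global write is seen by other threads */
    *(int *)heap_cell = local + 1;  /* the heap block is shared via the pointer */
    return NULL;
}

/* Returns 1 if the calling thread observes both writes made by the worker
   while its own local variable stays untouched; returns 0 on any failure. */
int shared_data_demo(void)
{
    int local = 7;                  /* same name, different stack */
    int *cell = malloc(sizeof *cell);
    pthread_t tid;
    int ok;

    if (cell == NULL)
        return 0;
    *cell = 0;
    if (pthread_create(&tid, NULL, writer, cell) != 0) {
        free(cell);
        return 0;
    }
    pthread_join(tid, NULL);        /* join orders the worker's writes before our reads */

    ok = (shared_global == 101) && (*cell == 102) && (local == 7);
    free(cell);
    return ok;
}
```

The worker never sees the caller's local variable, yet both the global and the heap cell carry its writes back, which matches the two columns of the table.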

Multithreaded runtime library

The existing versions of the C/C++ standards (specifically C++03, C89, and C99) say nothing about multithreading, so the standard C/C++ runtime library offers no help with threads: functions for creating, ending, or synchronizing threads cannot be found in it. Thread-related facilities are therefore not part of the C/C++ standard library; like networking and graphics, they belong to system-specific libraries outside the standard. Because multithreading plays a very important role in modern program design, mainstream C runtime libraries take it into account in their design. "Multithreading-related" here covers two aspects: first, providing interfaces for multithreaded operation, such as functions to create threads, exit threads, and set thread priority; second, the C runtime library itself must run correctly in a multithreaded environment.

For the first aspect, mainstream CRTs provide the corresponding functions. For example, on Windows the MSVC CRT provides _beginthread() and _endthread() for thread creation and exit, and on Linux glibc provides an optional thread library, pthread (POSIX threads), with functions such as pthread_create() and pthread_exit(). Obviously, these functions are not part of the standard runtime library; they are all platform-dependent.
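As a rough illustration of the pthread interface just mentioned, the sketch below creates one thread, lets it exit through pthread_exit(), and collects its result. It assumes a POSIX system; double_it and run_worker are hypothetical names chosen for this example.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Thread entry point: doubles the integer passed through the void* argument. */
static void *double_it(void *arg)
{
    intptr_t n = (intptr_t)arg;
    pthread_exit((void *)(n * 2));  /* same effect as returning the value */
}

/* Creates one worker, waits for it, and returns its result (-1 on error). */
long run_worker(long input)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, double_it, (void *)(intptr_t)input) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return (long)(intptr_t)result;
}
```

When thread creation succeeds, run_worker(21) returns 42. The Windows counterpart would use _beginthreadex() and a wait on the returned handle instead.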

For the second aspect, what does it mean that the C runtime library must support a multithreaded environment? The CRT was originally designed without multithreading in mind, because the concept did not exist at the time. As multithreaded programs became more and more common, the C/C++ runtime library ran into a lot of trouble in multithreaded environments. For example:

(1) errno: In the C standard library, most functions store an error code in a global variable named errno before returning. Under concurrency, thread A's errno value may be overwritten by thread B before A reads it, so A gets the wrong error information.

(2) strtok() and similar functions use a local static variable inside the function to remember the position in the string; calls from different threads corrupt that static variable.

(3) malloc/new and free/delete: heap allocation and deallocation functions or operators are not thread-safe without locking. Because they are called very frequently, making them thread-safe is awkward.

(4) Exception handling: In early C++ runtime libraries, exceptions thrown by different threads conflicted with one another, causing information loss.

(5) printf/fprintf and other I/O functions: stream output functions are also not thread-safe, because they share the same console or file. Concurrent output gets mixed together.

(6) Other thread-unsafe functions, including some signal-related functions.

In the C standard library, the functions that are naturally thread-safe without any special protection are (ignoring errno):

(1) Character handling (ctype.h), including isdigit, toupper, etc.; these functions are also reentrant.

(2) String handling (string.h), including strlen, strcmp, etc.; functions that write to an array passed as an argument (such as strcpy) can run concurrently only when the destination arrays are different.

(3) Mathematical functions (math.h), including sin, pow, etc.; these functions are also reentrant.

(4) String-to-integer/floating-point conversion (stdlib.h), including atof, atoi, atol, strtod, strtol, strtoul.

(5) Getting environment variables (stdlib.h), including getenv; this function is also reentrant.

(6) Variable-length argument list helpers (stdarg.h).

(7) Nonlocal jump functions (setjmp.h), including setjmp and longjmp, provided that longjmp jumps only to a jmp_buf set by the same thread.

To address the plight of the C standard library in multithreaded environments, many compilers ship a multithreaded version of the runtime library. In MSVC, the multithreaded runtime can be selected with options such as /MT or /MTd.

11.3.2 CRT improvements

Using TLS

What improvements does the multithreaded runtime library make? First, errno must become private to each thread. In glibc, errno is defined as a macro, as follows:

#define errno (*__errno_location ())
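One way to observe the effect of such a macro is to take the address of errno in two different threads and compare the results. The sketch below assumes a POSIX system with a multithreaded C library; set_errno and errno_is_per_thread are hypothetical names, and the address is stored as an integer so that it can still be compared after the worker has exited.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

static uintptr_t worker_errno_addr;     /* where the worker's errno lived */

static void *set_errno(void *unused)
{
    (void)unused;
    errno = ERANGE;                     /* writes this thread's errno only */
    worker_errno_addr = (uintptr_t)&errno;  /* &errno expands to a per-thread location */
    return NULL;
}

/* Returns 1 if the worker's errno was a different object from the caller's. */
int errno_is_per_thread(void)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, set_errno, NULL) != 0)
        return 0;
    pthread_join(tid, NULL);
    return worker_errno_addr != (uintptr_t)&errno;
}
```

Because the two addresses differ, the assignment in the worker cannot overwrite the value that the calling thread reads through its own errno.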

The function __errno_location is defined differently in different versions of the library. In the single-threaded version, it simply returns the address of the global variable errno; in the multithreaded version, calls to __errno_location from different threads return different addresses. In MSVC, errno is also a macro, implemented in a similar way to glibc.

Locking
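To make the idea of a library function that locks internally concrete, here is a toy fixed-size allocator whose entry point takes its own mutex. It is only a sketch under simplifying assumptions (no alignment handling, no freeing) and is not how malloc is actually implemented in any CRT; pool_alloc, hammer and pool_stress are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE 65536

static unsigned char pool[POOL_SIZE];
static size_t pool_used = 0;
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hands out `size` bytes from a fixed pool, or NULL when it is exhausted.
   The lock is taken inside, so callers never need one of their own. */
void *pool_alloc(size_t size)
{
    void *p = NULL;

    pthread_mutex_lock(&pool_lock);
    if (size <= POOL_SIZE - pool_used) {
        p = pool + pool_used;
        pool_used += size;
    }
    pthread_mutex_unlock(&pool_lock);
    return p;
}

static void *hammer(void *unused)
{
    (void)unused;
    for (int i = 0; i < 1000; i++)
        pool_alloc(16);             /* no locking at the call site */
    return NULL;
}

/* Runs up to 4 threads allocating concurrently; returns total bytes handed out. */
size_t pool_stress(void)
{
    pthread_t t[4];
    int n = 0;

    while (n < 4 && pthread_create(&t[n], NULL, hammer, NULL) == 0)
        n++;
    for (int i = 0; i < n; i++)
        pthread_join(t[i], NULL);
    return pool_used;
}
```

With all four threads started from a fresh pool, the total is exactly 4 x 1000 x 16 = 64000 bytes; without the mutex, concurrent updates of pool_used could be lost.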

In the multithreaded version of the runtime library, thread-unsafe functions such as malloc and printf lock automatically inside, and the exception-handling errors were fixed long ago. Therefore, with the multithreaded runtime, concurrent calls do not conflict even if the caller takes no lock around malloc/new.

Improved function invocation

To support multithreading, the C runtime library has to make some improvements. One approach is to change the parameter lists of the thread-unsafe functions, turning them into thread-safe versions. For example, the MSVC CRT provides a thread-safe version of strtok(), called strtok_s. Their prototypes are as follows:

char *strtok(char *strToken, const char *strDelimit);

char *strtok_s(char *strToken, const char *strDelimit, char **context);

The improved strtok_s adds a context parameter: the caller supplies a char* pointer, and strtok_s saves the string position in it after each call. The original strtok saves this position in a static local variable inside the function, so simultaneous calls from multiple threads may conflict. Like the MSVC CRT, glibc also provides a thread-safe version of strtok(), called strtok_r().
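The caller-supplied context can be seen at work in the sketch below, which uses the POSIX strtok_r() and runs two tokenizations at once, each with its own context pointer. The function name count_pairs and the "key=value;..." input format are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Counts "key=value" pairs in a string such as "a=1;b=2;c=3".
   The outer loop splits on ';' and the inner calls split on '=';
   each keeps its position in its own context pointer.
   The input buffer is modified in place. */
int count_pairs(char *text)
{
    char *outer_ctx = NULL;
    int pairs = 0;

    for (char *item = strtok_r(text, ";", &outer_ctx);
         item != NULL;
         item = strtok_r(NULL, ";", &outer_ctx)) {
        char *inner_ctx = NULL;
        char *key = strtok_r(item, "=", &inner_ctx);
        char *value = strtok_r(NULL, "=", &inner_ctx);

        if (key != NULL && value != NULL)
            pairs++;
    }
    return pairs;
}
```

Two interleaved tokenizations in one thread hit the same hidden-state problem as two threads do: written with plain strtok(), the inner calls would overwrite the single saved position and the outer loop would typically stop after the first item.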

But in many cases changing standard library functions is not feasible. The standard library is called "standard" precisely because it carries authority and stability and cannot be changed arbitrarily. If it were changed at will, every program that follows the standard would have to be modified, and whether such a "standard" is worth following would be open to question. A better approach is therefore to leave the prototypes of standard library functions untouched and improve the implementation, so that it also runs correctly in a multithreaded environment while staying backward compatible.

11.3.3 Thread-local storage implementation

When writing multithreaded programs, developers often want to store some thread-private data. We know that each thread's private data includes its stack and its current registers, but neither is a reliable place for such data: the stack changes every time a function is entered or exited, and registers are so scarce that it is impractical to set one aside for storing data. Suppose we want to use a global variable in a thread but want it to be private to that thread rather than shared by all threads. This calls for thread-local storage (TLS). Using TLS is simple: to make a global variable thread-local, just add the appropriate keyword to its definition. For GCC the keyword is __thread; for example, a thread-local global integer variable is defined like this:

__thread int number;

For MSVC, the corresponding keyword is __declspec(thread):

__declspec(thread) int number;

On Windows operating systems before Vista and Server 2008, if a TLS global variable is defined in a DLL and that DLL is loaded explicitly with LoadLibrary(), the variable is not usable, and accessing it causes a protection fault. The main reason is that on operating systems before Windows Vista, the TLS variables a DLL defines with __declspec(thread) cannot be properly initialized when the DLL is loaded with LoadLibrary(); see MSDN for details.
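A minimal sketch of how a variable declared with the GCC keyword behaves, assuming GCC or Clang on a POSIX system: every thread starts with its own freshly initialized copy. The initial value 5 and the names bump and tls_copy_demo are invented for this illustration.

```c
#include <pthread.h>
#include <stddef.h>

__thread int number = 5;            /* GCC/Clang extension: one copy per thread */

static void *bump(void *out)
{
    number += 10;                   /* touches only the worker's copy: 5 -> 15 */
    *(int *)out = number;
    return NULL;
}

/* Returns 1 if the worker saw its own initial copy while the caller's
   copy kept the value the caller had assigned; returns 0 otherwise. */
int tls_copy_demo(void)
{
    pthread_t tid;
    int seen_by_worker = 0;

    number = 6;                     /* changes only the caller's copy */
    if (pthread_create(&tid, NULL, bump, &seen_by_worker) != 0)
        return 0;
    pthread_join(tid, NULL);
    return seen_by_worker == 15 && number == 6;
}
```

The worker starts from the initializer 5, not from the 6 written by the caller, and its addition is invisible to the caller.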

Once a global variable is defined as thread-local, each thread has its own copy of it, and a modification made by one thread does not affect the copies in other threads.

Implementation of Windows TLS

On Windows, a global or static variable is normally placed in the ".data" or ".bss" section, but when we use __declspec(thread) to define a thread-private variable, the compiler places it in the ".tls" section of the PE file. When the system starts a new thread, it allocates a sufficiently large block from the process heap and copies the contents of the ".tls" section into it, so that each thread has its own independent copy of ".tls". Consequently, the same __declspec(thread) variable has a different address in each thread.

A TLS variable may also be a C++ global object, so copying ".tls" at thread start-up is not enough: the TLS objects must be initialized by calling their global constructors one by one, and destroyed one by one when the thread exits, just as ordinary global objects are constructed and destroyed when the process starts and exits.

The Windows PE file format contains a structure called the data directory, which we covered in Part 2. It has 16 elements, one of which is indexed by IMAGE_DIRECTORY_ENTRY_TLS; the address and length stored in this element are those of the TLS table (an IMAGE_TLS_DIRECTORY structure). The TLS table holds the addresses of the constructors and destructors of all TLS variables, and Windows uses its contents to construct and destroy the TLS variables each time a thread starts or exits. The TLS table itself is usually located in the ".rdata" section of the PE file.

Another question: since the same TLS variable has a different address in each thread, how does a thread find its own copy? For each Windows thread, the system creates a structure holding thread information called the Thread Environment Block (TEB). It stores the thread's stack address, thread ID, and other information; one of its fields is the TLS array, at offset 0x2C in the TEB. For each thread, the segment selected by the x86 FS segment register is that thread's TEB, so the address of the thread's TLS array can be read from fs:[0x2C].

The TEB structure is not public and may vary between Windows versions; the layout described here is that of the x86 version of Windows XP.

The TLS array has a fixed size for each thread, typically 64 elements. Its first element points to the thread's copy of ".tls". So obtaining the address of a TLS variable takes three steps: read the address of the TLS array from fs:[0x2C], read the address of the ".tls" copy from the TLS array, and add the variable's offset within the ".tls" section to get its address in the current thread. Let's look at a simple example:

__declspec(thread) int t = 1;

int main()
{
    t = 2;
    return 0;
}

After compilation, the assembly code is as follows:

_main:
00000000: 55                 push        ebp
00000001: 8B EC              mov         ebp,esp
00000003: A1 00 00 00 00     mov         eax,dword ptr [__tls_index]
00000008: 64 8B 0D 00 00 00  mov         ecx,dword ptr fs:[__tls_array]
          00
0000000F: 8B 14 81           mov         edx,dword ptr [ecx+eax*4]
00000012: C7 82 00 00 00 00  mov         dword ptr _t[edx],2
          02 00 00 00
0000001C: 33 C0              xor         eax,eax
0000001E: 5D                 pop         ebp
0000001F: C3                 ret
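The three-step lookup performed by the mov instructions above can be modelled in portable C. This is a simulation only: struct fake_thread, struct tls_template and the helper functions are hypothetical stand-ins for the TEB, the ".tls" section image and thread start-up, and the per-thread block is embedded in the structure rather than taken from the process heap.

```c
#include <stddef.h>
#include <string.h>

enum { TLS_SLOTS = 64 };

/* Image of the ".tls" section: initial values of all TLS variables. */
struct tls_template { int t; };
static const struct tls_template tls_image = { 1 };  /* __declspec(thread) int t = 1; */
static const size_t tls_index = 0;                   /* slot used by this module */

/* Hypothetical stand-in for the per-thread state described in the text. */
struct fake_thread {
    void *slots[TLS_SLOTS];      /* the thread's TLS array */
    void **tls_array;            /* models the pointer read from fs:[0x2C] */
    struct tls_template block;   /* this thread's private copy of ".tls" */
};

/* Models thread start-up: copy ".tls" and record the copy in slot 0. */
void fake_thread_start(struct fake_thread *th)
{
    memcpy(&th->block, &tls_image, sizeof tls_image);
    th->tls_array = th->slots;
    th->slots[tls_index] = &th->block;
}

/* Models the address computation done before "t = 2". */
int *address_of_t(struct fake_thread *th)
{
    void **array = th->tls_array;        /* mov ecx, fs:[__tls_array]  */
    char *block = array[tls_index];      /* mov edx, [ecx+eax*4]       */
    return (int *)(block + offsetof(struct tls_template, t)); /* _t[edx] */
}
```

Starting two simulated threads and writing 2 through address_of_t() for one of them leaves the other thread's t at its initial value 1, because each lookup ends in a different block.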

The code contains two symbols, __tls_index and __tls_array, which are defined in the MSVC CRT. For MSVC 2008 their values are 0 and 0x2C respectively, representing the index of the first element of the TLS array and the offset of the TLS array in the TEB. Since these values may change with the Windows version, they are kept in the CRT. If the program links the CRT dynamically (as a DLL), running on a different version of Windows is not a problem; if it links the CRT statically, then when a new version of Windows changes the TEB layout and moves the TLS array, the program may fail. Of course, given Windows' good track record over the years, such arbitrary changes to a core data structure are unlikely.

Explicit TLS

The method described above, defining a global variable as a TLS variable with the __thread or __declspec(thread) keyword, is usually called implicit TLS: the programmer does not need to care about requesting, allocating, assigning, or releasing TLS variables, because the compiler, the runtime library, and the operating system handle all of that quietly. From the programmer's point of view, a TLS global variable is simply a thread-private global variable. In contrast to implicit TLS there is explicit TLS, in which the programmer must request a TLS variable manually, call a function to obtain the variable's address on every access, and release the variable once it is no longer needed. On Windows, the system provides four API functions, TlsAlloc(), TlsGetValue(), TlsSetValue(), and TlsFree(), for requesting, reading, writing, and releasing explicit TLS variables. The corresponding functions on Linux are pthread_key_create(), pthread_getspecific(), pthread_setspecific(), and pthread_key_delete() in the pthread library.
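The request/write/read/release cycle of explicit TLS can be sketched with the four pthread functions named above. The example assumes a POSIX system; slot, store_and_load and explicit_tls_demo are hypothetical names, and the values 111 and 222 are arbitrary.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

static pthread_key_t slot;          /* the explicit TLS "variable" */

static void *store_and_load(void *arg)
{
    /* Each thread stores its own value under the same key... */
    if (pthread_setspecific(slot, arg) != 0)
        return NULL;
    /* ...and reads back exactly what it stored. */
    return pthread_getspecific(slot);
}

/* Returns 1 if two threads sharing one key each read back their own value. */
int explicit_tls_demo(void)
{
    pthread_t t1, t2;
    void *r1 = NULL, *r2 = NULL;
    int ok;

    if (pthread_key_create(&slot, NULL) != 0)   /* request the slot */
        return 0;
    if (pthread_create(&t1, NULL, store_and_load, (void *)(intptr_t)111) == 0) {
        if (pthread_create(&t2, NULL, store_and_load, (void *)(intptr_t)222) == 0)
            pthread_join(t2, &r2);
        pthread_join(t1, &r1);
    }
    ok = ((intptr_t)r1 == 111) && ((intptr_t)r2 == 222)
         && pthread_getspecific(slot) == NULL;  /* the caller never set it */
    pthread_key_delete(slot);                   /* release the slot */
    return ok;
}
```

On Windows the same shape would use TlsAlloc(), TlsSetValue(), TlsGetValue() and TlsFree(), with a DWORD index in place of the pthread_key_t.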

The implementation of explicit TLS is actually very simple. As mentioned earlier, the TEB contains a TLS array, and explicit TLS uses this array to store its data. Because the array has a fixed number of elements, typically 64, the explicit TLS implementation requests an additional 4096 bytes as a second-level TLS array when it finds the first array full, so Windows XP supports at most 1088 (1024 + 64) explicit TLS variables (implicit TLS, of course, also consumes slots in the TLS array). Compared with implicit TLS, explicit TLS variables are cumbersome to use and subject to many limitations; these shortcomings have made explicit TLS increasingly unpopular, and we do not recommend it.

Q&A: What is the difference between CreateThread() and _beginthread()?

There are two ways to create a thread on Windows: call the Windows API CreateThread(), or call the MSVC CRT function _beginthread() or _beginthreadex(). Likewise, there are two functions for exiting a thread: the Windows API ExitThread() and the CRT's _endthread(). Both sets of functions create and exit threads, so what is the difference between them?

Many developers do not know how the two relate. They pick one at random, find that nothing seems wrong, move on to more urgent tasks, and never dig deeper. When one day a long-running program turns out to have a small memory leak, they would never suspect that mixing the two sets of functions is the cause.

Given the relationship between the Windows API and the MSVC CRT, _beginthread() is clearly a wrapper around CreateThread(): it ultimately calls CreateThread() to create the thread. So what does _beginthread() do before calling CreateThread()? Its source code is in thread.c in the CRT sources. Before calling CreateThread(), it allocates a structure called _tiddata, initializes it with the _initptd() function, and passes it to _beginthread()'s own thread entry function, _threadstart. _threadstart first saves the _tiddata pointer passed in by _beginthread() into the thread's explicit TLS array, and then calls the user's thread entry function to actually start the thread. After the user's thread function returns, _threadstart calls _endthread() to end the thread. _threadstart also wraps the user's thread entry function in __try/__except, which catches all unhandled signals and hands them to the CRT for processing.

So, apart from signals, the main purpose of the CRT's wrapper around the Windows threading API is clearly the _tiddata structure. What does this thread-private structure hold? Its definition can be found in mtdll.h: it stores the thread ID, the thread handle, errno, the position of the previous strtok() call, the seed of the rand() function, exception-handling state, and other CRT-related, thread-private information. To keep library functions working under multithreading, the MSVC CRT does not define these thread-private variables with the __declspec(thread) method described earlier; instead, it allocates a _tiddata structure on the heap, places the thread-private variables inside it, and stores the pointer to _tiddata in explicit TLS.

Knowing this, a question should come to mind: if we create a thread with CreateThread() and then call the CRT's strtok(), something should go wrong, because the _tiddata that strtok() needs does not exist. Yet we never seem to run into this problem. A look at strtok() shows why: it first calls _getptd() to get the thread's _tiddata, and if _getptd() finds that the thread has no _tiddata yet, it allocates and initializes one. So whichever function we use to create a thread, every function that needs _tiddata can be called safely, because the structure is created on demand.
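The create-on-first-use pattern described for _getptd() can be sketched in portable C with an explicit TLS key. This is an analogy, not the MSVC source: struct thread_data, its fields, get_ptd and ask_twice are hypothetical, and the free() destructor registered with the key is a POSIX facility rather than something the text attributes to the CRT.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-thread record, loosely modelled on what the text
   says _tiddata holds; the field names are illustrative only. */
struct thread_data {
    int saved_errno;
    char *strtok_pos;
    unsigned long rand_seed;
};

static pthread_key_t ptd_key;
static pthread_once_t ptd_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    /* free() runs for each exiting thread whose slot is non-NULL. */
    pthread_key_create(&ptd_key, free);
}

/* Returns the calling thread's record, creating it on first use. */
struct thread_data *get_ptd(void)
{
    struct thread_data *ptd;

    pthread_once(&ptd_once, make_key);
    ptd = pthread_getspecific(ptd_key);
    if (ptd == NULL) {
        ptd = calloc(1, sizeof *ptd);
        if (ptd != NULL) {
            ptd->rand_seed = 1;
            if (pthread_setspecific(ptd_key, ptd) != 0) {
                free(ptd);
                ptd = NULL;
            }
        }
    }
    return ptd;
}

/* Thread entry: returns its record if two lookups agree, else NULL. */
void *ask_twice(void *unused)
{
    struct thread_data *first = get_ptd();

    (void)unused;
    return (first != NULL && first == get_ptd()) ? first : NULL;
}
```

A thread that never calls get_ptd() pays nothing, and a thread created by any means gets its record the first time a function needs it, which is the property the paragraph above relies on.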

So when is _tiddata released? ExitThread() certainly does not release it, because it does not know that _tiddata exists; obviously _endthread() does, which is how the CRT intends it. However, we often find that even with CreateThread() and ExitThread() (or returning from the thread function without calling ExitThread()), no memory leak occurs. Why? On closer inspection, the secret lies in DllMain, the entry function of the CRT DLL. When a process or thread starts or exits, the DllMain of every DLL is called once, so the dynamically linked CRT gets a chance to release the thread's _tiddata in DllMain. But this works only when the CRT is dynamically linked; the statically linked CRT has no DllMain. That is the case in which CreateThread() causes a memory leak: _tiddata cannot be released when the thread ends. We can test this with the following small program:

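#include <Windows.h>
#include <process.h>
#include <string.h>

/* Thread entry: the strtok() call makes the CRT allocate _tiddata for this thread. */
DWORD WINAPI thread(LPVOID a)
{
    char text[] = "AAA";        /* strtok() needs a writable buffer */
    char* r = strtok(text, "B");
    ExitThread(0);              /* it makes no difference whether this is called */
    return 0;
}

int main(int argc, char* argv[])
{
    while (1) {
        HANDLE h = CreateThread(0, 0, thread, 0, 0, 0);
        if (h) CloseHandle(h);  /* avoid leaking thread handles as well */
        Sleep(5);
    }
    return 0;
}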

With the dynamically linked CRT (/MD, /MDd) there is no problem. With the statically linked CRT (/MT, /MTd), however, Task Manager shows the program's memory usage rising steadily while it runs. If we change ExitThread() in thread() to _endthread(), the problem disappears, because _endthread() releases _tiddata.

The lesson can be summed up as follows: when using the CRT (and basically every program does), prefer the _beginthread()/_beginthreadex()/_endthread()/_endthreadex() family of functions for creating and exiting threads. MFC has a similar pair, AfxBeginThread() and AfxEndThread(). By the same reasoning, they are MFC-level thread wrapper functions that maintain the MFC-related per-thread structures, so when using the MFC class library, prefer its thread wrappers to keep the program running correctly.
