Daniel Robbins
President/CEO, Gentoo Technologies, Inc.
August 2000
POSIX threads are a powerful way to improve the responsiveness and performance of your code. In this second installment of a three-part series, Daniel Robbins explains how to use handy little objects called mutexes to protect the integrity of shared data structures in threaded code.
Mutex me!
In the previous article, we looked at threaded code that behaved unexpectedly. Two threads each added one to the same global variable 20 times. The variable should have ended up at 40, but its final value was 21. What went wrong? The problem occurred because one thread repeatedly "cancelled" the increment performed by the other thread. Now let's look at the corrected code, which uses a mutex to solve the problem:
thread3.c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

int myglobal;
pthread_mutex_t mymutex=PTHREAD_MUTEX_INITIALIZER;

void *thread_function(void *arg) {
  int i,j;
  for ( i=0; i<20; i++) {
    /* hold the mutex for the whole read-modify-write sequence */
    pthread_mutex_lock(&mymutex);
    j=myglobal;
    j=j+1;
    printf(".");
    fflush(stdout);
    sleep(1);
    myglobal=j;
    pthread_mutex_unlock(&mymutex);
  }
  return NULL;
}

int main(void) {
  pthread_t mythread;
  int i;

  if ( pthread_create( &mythread, NULL, thread_function, NULL) ) {
    printf("error creating thread.");
    abort();
  }

  for ( i=0; i<20; i++) {
    /* the main thread takes the same mutex before touching myglobal */
    pthread_mutex_lock(&mymutex);
    myglobal=myglobal+1;
    pthread_mutex_unlock(&mymutex);
    printf("o");
    fflush(stdout);
    sleep(1);
  }

  if ( pthread_join ( mythread, NULL ) ) {
    printf("error joining thread.");
    abort();
  }

  printf("\nmyglobal equals %d\n",myglobal);
  exit(0);
}
Explanation
If you compare this code with the version in the previous article, you will notice the added calls to pthread_mutex_lock() and pthread_mutex_unlock(). These calls perform an indispensable function in threaded programs: they provide mutual exclusion (which is where the mutex gets its name). No two threads can hold the same mutex locked at the same time.
This is how mutexes work. If thread a tries to lock a mutex while thread b holds the same mutex locked, thread a goes to sleep. As soon as thread b releases the mutex (by calling pthread_mutex_unlock()), thread a can lock it (in other words, thread a returns from its pthread_mutex_lock() call with the mutex locked). Likewise, if thread c tries to lock the mutex while thread a holds it, thread c is put to sleep for a while. Every thread that calls pthread_mutex_lock() on an already-locked mutex goes to sleep, and these sleeping threads "queue up" for access to that mutex.
pthread_mutex_lock() and pthread_mutex_unlock() are normally used to protect data structures. That is, by locking and unlocking a mutex around each access, you ensure that only one thread can access a given data structure at a time. As you may have guessed, when a thread tries to lock a mutex that is currently unlocked, the POSIX threads library simply grants the lock without putting the thread to sleep.
Take a look at this simple cartoon, in which four elves re-enact a scene from a recent pthread_mutex_lock() call.
As the figure shows, the thread that holds the mutex locked can access a complex data structure without worrying that another thread will interfere at the same time. The data structure is effectively "frozen" until the mutex is unlocked. Calls to pthread_mutex_lock() and pthread_mutex_unlock() act like "under construction" signs placed around a piece of shared data that is being modified or read. They warn other threads to go to sleep and wait their turn to lock the mutex. Of course, this holds only if every read and write of a particular data structure is bracketed by pthread_mutex_lock() and pthread_mutex_unlock().
Why use mutexes?
That sounds interesting, but why would we want to put our threads to sleep? After all, isn't the main advantage of threads that they can work independently and, in many cases, simultaneously? Yes, that's true. However, every non-trivial threaded program needs at least some mutexes. Let's look at the sample program again to understand why.
Look at thread_function(). The mutex is locked at the top of the loop and unlocked at the bottom. In this example, mymutex protects the value of myglobal. If you look carefully at thread_function(), you will see that the increment code copies myglobal to a local variable, adds one to the local variable, sleeps for one second, and only then copies the local value back to myglobal. Without the mutex, thread_function() would overwrite the main thread's increment when it woke up, even if the main thread had incremented myglobal during that one-second sleep. The mutex ensures that this cannot happen. (In case you are wondering, I added the one-second delay to trigger the incorrect result. There is no real reason for thread_function() to sleep for one second before writing the local value back to myglobal.) The new program, which uses a mutex, produces the expected result:
$ ./thread3
o..o..o.o..o..o.o.o.o.o..o..o..o.ooooooo
myglobal equals 40
To explore this very important concept further, let's look at the increment code in the program:
thread_function() increment code:
  j=myglobal;
  j=j+1;
  printf(".");
  fflush(stdout);
  sleep(1);
  myglobal=j;

main thread increment code:
  myglobal=myglobal+1;
If this code were in a single-threaded program, we would expect the thread_function() code to execute in its entirety, followed by the main thread code (or the other way around). In a threaded program without mutexes, the code can (and, thanks to the sleep() call, almost certainly will) execute in the following order:
thread_function() thread        main thread
j=myglobal;
j=j+1;
printf(".");
fflush(stdout);
sleep(1);                       myglobal=myglobal+1;
myglobal=j;
When the code executes in this particular order, the main thread's modification to myglobal is overwritten, and the program ends with an incorrect value. If we had been manipulating pointers, we would probably have ended up with a segmentation fault. Notice that the thread_function() thread executes all of its instructions in order. It is not as if thread_function() did anything out of sequence. The problem is that another thread modified the same data structure at the same time.
Inside threads 1
Before explaining how to decide where to use mutexes, let's take a closer look at the inner workings of threads. Here is the first example:
Suppose the main thread creates three new threads: thread a, thread b, and thread c. Thread a is created first, thread b second, and thread c last.
pthread_create( &thread_a, NULL, thread_function, NULL);
pthread_create( &thread_b, NULL, thread_function, NULL);
pthread_create( &thread_c, NULL, thread_function, NULL);
After the first pthread_create() call completes, you can assume that thread a either exists or has already finished and terminated. After the second pthread_create() call, both the main thread and thread b can assume that thread a exists (or has terminated).
However, once the second pthread_create() call returns, the main thread cannot assume which thread (a or b) will actually start running first. Although both threads exist, it is up to the kernel and the threads library to give them CPU time slices, and there is no hard rule about who goes first. While thread a is more likely to start executing before thread b, this is not guaranteed, especially on a multiprocessor machine. If you write code that assumes thread a's code will actually begin executing before thread b starts, the program may work correctly 99% of the time. Or worse, it may work 100% of the time on your machine and 0% of the time on your customer's quad-processor server.
This example also shows that the threads library preserves the order of execution within each individual thread. In other words, the three pthread_create() calls really do execute in the order in which they appear. From the main thread's point of view, all of its code executes in sequence. Sometimes this can be used to optimize parts of a threaded program. In the example above, thread c can assume that threads a and b are either running or already terminated; it does not have to worry that they have not been created yet.
Inside threads 2
Now let's look at another hypothetical example. Suppose many threads are all executing the following code:
myglobal=myglobal+1;
So, do we need to lock and unlock a mutex before and after the increment? Some may say "no". The compiler will very likely compile the assignment above into a single machine instruction, and, as we all know, a single machine instruction cannot be interrupted "halfway through". Even hardware interrupts do not break the atomicity of a machine instruction. Given this, it is tempting to leave out the pthread_mutex_lock() and pthread_mutex_unlock() calls entirely. Don't do it.
Am I just being paranoid? Not really. First, you should not assume that the assignment above compiles to a single machine instruction unless you have verified the machine code yourself. And even if you insert inline assembly to guarantee that the increment executes as one instruction (or even if you wrote the compiler yourself!), problems may remain.
Here is why. Using a single inline assembly opcode on a uniprocessor system would probably work fine: each increment would complete in full and the results would most likely be as expected. A multiprocessor system is a different story. On a multi-CPU machine, two separate processors may execute the above assignment at nearly the same instant (or at exactly the same instant). Keep in mind that the memory modification must travel from the L1 cache to the L2 cache and only then to main memory. (SMP machines are not just extra processors; they also have special hardware that arbitrates access to RAM.) In the end, you cannot know which CPU will "win" the race to write to main memory. To produce predictable code, use mutexes. Locking and unlocking a mutex inserts a "memory barrier", which ensures that writes to main memory occur in the order in which the threads lock the mutex.
Consider an SMP architecture that updates main memory in 32-bit blocks. If you increment a 64-bit integer without a mutex, four bytes of the integer may come from one CPU and the other four bytes from another. Bummer! Worst of all, with sloppy technique your program may run for a long time without crashing, only to fail at three in the morning on an important customer's system. In his book Programming with POSIX Threads (see the references at the end of this article), David R. Butenhof discusses the problems that can arise when mutexes are not used.
Many mutex objects
If you place too many mutexes, your code will have no concurrency at all and will run slower than a single-threaded solution. If you place too few, your code may suffer from strange and embarrassing bugs. Fortunately, there is a middle ground. First, mutexes exist to serialize access to shared data. Do not use mutexes for data that is not shared. And if your program's logic guarantees that only one thread can access a particular data structure at any given time, do not use a mutex for it.
Second, if you are using shared data, you need mutexes for both reading and writing it. Bracket every read and write section with pthread_mutex_lock() and pthread_mutex_unlock(); otherwise, errors can surface at random, unpredictable points in the program. Learn to view your code from the perspective of a single thread, and make sure that every thread in the program has a consistent and appropriate view of memory. It may take a few hours of writing code to get comfortable with mutexes, but soon they will become second nature and you will be able to use them correctly without much thought.
Call: initialization
Now let's look at the various ways of working with mutexes, starting with initialization. In the thread3.c example, we used the static initialization method. This involves declaring a pthread_mutex_t variable and assigning it the constant PTHREAD_MUTEX_INITIALIZER:
pthread_mutex_t mymutex=PTHREAD_MUTEX_INITIALIZER;
That's pretty easy. However, you can also create a mutex dynamically. Use the dynamic method whenever your code allocates a new mutex with malloc(). In that case, static initialization will not work, and you should use the routine pthread_mutex_init() instead:
int pthread_mutex_init( pthread_mutex_t *mymutex, const pthread_mutexattr_t *attr);
As you can see, pthread_mutex_init() accepts a pointer to an already-allocated region of memory and initializes it as a mutex. The second argument is an optional pointer to a pthread_mutexattr_t. This structure can be used to set various mutex attributes. These attributes are usually not needed, however, so the normal practice is to pass NULL.
Once a mutex has been initialized with pthread_mutex_init(), it should be destroyed with pthread_mutex_destroy(). pthread_mutex_destroy() accepts a single pointer to a pthread_mutex_t and frees any resources that were allocated to the mutex when it was created. Note that pthread_mutex_destroy() does not free the memory used to store the pthread_mutex_t itself. Freeing that memory is up to you. Also remember that both pthread_mutex_init() and pthread_mutex_destroy() return zero on success.
Call: Lock
int pthread_mutex_lock(pthread_mutex_t *mutex);
pthread_mutex_lock() accepts a single pointer to a mutex to lock. If the mutex is already locked, the caller goes to sleep. When the function returns, the caller is awake (obviously) and holds the lock. The call returns zero on success or a non-zero error code on failure.
int pthread_mutex_unlock(pthread_mutex_t *mutex);
pthread_mutex_unlock() complements pthread_mutex_lock() and unlocks a mutex that the thread has already locked. Always unlock a mutex you have locked as soon as it is safe to do so (for best performance). And never unlock a mutex that you do not hold the lock for. With an error-checking mutex, such a pthread_mutex_unlock() call fails with a non-zero EPERM return value; with a default mutex, POSIX leaves the behavior undefined.
int pthread_mutex_trylock(pthread_mutex_t *mutex);
This call is handy when you want to lock a mutex but your thread has other things it could be doing if the mutex is currently locked. When you call pthread_mutex_trylock(), it attempts to lock the mutex. If the mutex is currently unlocked, you get the lock and the function returns zero. If the mutex is locked, however, the call does not block. Instead, it returns the non-zero error value EBUSY. You can then go about your other business and try to lock again later.
Waiting on conditions
Mutexes are necessary tools for threaded programs, but they cannot do everything. What happens, for example, if a thread is waiting for a certain condition to appear in shared data? The code could repeatedly lock and unlock the mutex, checking each time for a change in the value, while releasing the mutex quickly so that other threads can make the necessary changes. This is a horrible approach, because the thread has to busy-loop to detect a change within a reasonable time frame.
The calling thread could sleep briefly between checks, say for three seconds, but then the thread code would not respond as quickly as it could. What is really needed is a way to put a thread to sleep while it waits for some condition to be met, and a way to wake it up once the condition is met. With that, the thread code would be efficient and would not tie up valuable mutex locks. This is exactly what POSIX condition variables do!
POSIX condition variables are the topic of the next article, which shows how to use them correctly. By then, you will have everything you need to create sophisticated threaded programs that model workers, assembly lines, and more.