Introduction: Threads are a headache for many programmers. The UNIX process model is easy to understand but sometimes inefficient. Threading often improves performance substantially, at the price of somewhat messier code. This article unveils the mystery of the POSIX thread interfaces and provides practical examples of threaded code as a reference.
Apart from Anne McCaffrey's series of novels, Dragonriders of Pern, "thread" is a word that gets programmers talking. A thread is sometimes called a lightweight process and is associated with large, complex projects. When calling library functions, you will often run into dire warnings about functions that are "not thread-safe." But what are these threads? What can you do with them? What are the risks of using them?
This article introduces threads through a simple threaded application. The thread model used is the POSIX thread interface, usually called pthreads. The examples are based on SuSE Linux 8.2. All code was tested on SuSE Linux 8.2 and on a recent build of NetBSD-current.
What is a thread?
Threads and processes are very similar. The difference is that a thread is smaller than a process. First, threads are designed so that multiple threads share resources; for example, they generally run in the same address space. Second, switching from one thread to another costs less than switching between processes. Third, the bookkeeping for a process takes more memory than the bookkeeping for a thread, so threads use memory more efficiently.
Threads usually need to interact with each other, so they face problems like those of multi-process communication through IPC. This article does not dwell on those problems, because the POSIX thread API provides tools for handling issues such as deadlocks and race conditions. It mainly discusses the problems and solutions specific to multithreaded programming; general multiprogramming problems are left for later.
Threaded programs sometimes run into problems that rarely come up in multi-process and IPC programming. For example, if two threads call a function such as asctime() at the same time (it uses a static data area), the results can be baffling. This is the issue that "thread safety" is about.
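To make the hazard concrete, here is a minimal sketch. The functions label_static() and label_r() are hypothetical stand-ins, not part of any library: the first returns a pointer to a static buffer in the style of asctime(), and the second writes into a buffer supplied by the caller, which is the usual shape of a thread-safe replacement.

```c
#include <assert.h>
#include <stdio.h>

/* Not thread-safe: every caller shares the one static buffer, so a second
 * call (from any thread) overwrites the result of the first. */
const char *label_static(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* Reentrant: the caller owns the buffer, so calls cannot clobber each other. */
char *label_r(int id, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "item-%d", id);
    return buf;
}
```

Two calls to label_static() hand back the same address, so whichever call ran last wins; two calls to label_r() with separate buffers keep separate results.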
A simple program
The sample program used in this article is a dice roller. People spend a lot of time rolling dice in role-playing and war games, and a dice roller that can be reached over the network makes a good example program in several ways. The code is very simple, and most importantly, understanding the program logic does not get in the way of understanding the threads.
The most distracting part is the networking. To keep things as simple as possible, that code is encapsulated in a few subroutines. The first subroutine, socket_setup(), returns a socket that is ready to accept connections. The second subroutine, get_sockets(), takes this socket as its parameter, accepts a connection, and creates a struct sockets object. A struct sockets object is just an abstraction of the input and output ports: one FILE * is used for input, another for output, and a flag reminds us to close the socket correctly later.
You can download the different versions of the program as a tar archive from References, and view or run them in a separate shell or a separate browser window. The first version is dthread1.c.
Let's look carefully at this program, dthread1. It has several options. The first (currently unused and reserved for future expansion) is a debugging flag, -d. The second, -s, determines whether the program runs on the console. Without -s, the program listens for connections and holds its sessions over them. The third option, -t, determines whether the program runs multiple threads. Without -t, the program handles only one connection and then exits. The last is -S, which makes the program pause for one second between rolls. This option is just for fun, because it makes the interleaving between multiple connections easy to see.
Kernel threads and user threads
On some operating systems, dthread1 may behave unexpectedly, because a blocking read on a socket stops the entire program and not just a single thread. This is one difference between kernel threads and user threads.
User threads are a clever software mechanism that lets multithreaded programs run without special kernel support. This has a drawback, however: if a thread makes a system call that blocks, the whole process blocks. With kernel threads, one thread can block while the other threads keep running.
The POSIX API does not dictate how threads must be implemented, so implementations have a lot of latitude. Most recent systems have kernel threads. This description is an oversimplification; if you want to dig deeper, read the source code of your system's pthreads library.
Let's take a good look at this program. To connect to it once it is running, try telnet localhost 6173. If you are not familiar with these games, start with 2d6. The general syntax supported is "NdX", which means roll N dice, each with values from 1 to X (as the code shows, the program actually rolls one X-sided die N times).
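As an illustration of the "NdX" syntax, here is a minimal single-threaded sketch. The names parse_spec() and roll_total() are assumptions made for this example and are not taken from dthread1.c, whose parsing code may well differ.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a specification such as "2d6" into a count and a number of sides.
 * Trailing whitespace (for example a network line ending) is tolerated.
 * Returns 1 on success, 0 if the text is not of the form NdX. */
int parse_spec(const char *text, int *count, int *sides)
{
    char extra;  /* catches trailing garbage such as the "x" in "2d6x" */

    if (sscanf(text, "%dd%d %c", count, sides, &extra) != 2)
        return 0;
    return *count > 0 && *sides > 0;
}

/* Roll one X-sided die N times and return the total. This uses rand(),
 * so it is only safe while a single thread is rolling. */
int roll_total(int count, int sides)
{
    int total = 0;

    assert(count > 0 && sides > 0);
    for (int i = 0; i < count; i++)
        total += rand() % sides + 1;
    return total;
}
```

With this sketch, "2d6" parses to a count of 2 and 6 sides, and roll_total(2, 6) returns a value between 2 and 12.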
Here we see the simplest form of a threaded program. Note that each thread handles only its own private data. (This is not entirely true; if you spotted the error, you are very observant.) This keeps the threads from stepping on each other. To make this possible, pthread_create() accepts a void * argument, which is passed to the function the thread starts executing. This lets you build an arbitrarily complex data structure and hand it, as a pointer, to the thread that needs to work on it. When the function whose address was passed to pthread_create() returns, that thread ends, and the other threads keep running.
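The following sketch shows the void * argument at work. The struct job type and the functions run_job() and run_jobs() are hypothetical names chosen for this example; each thread receives a pointer to its own structure and touches nothing else.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread job: each thread gets its own struct. */
struct job {
    int sides;   /* input for the thread */
    int result;  /* output, filled in by the thread */
};

/* Thread start function: receives the void * given to pthread_create(). */
static void *run_job(void *arg)
{
    struct job *j = arg;

    j->result = j->sides * 2;   /* stand-in for real work */
    return NULL;                /* returning ends this thread only */
}

/* Start one thread per job, then wait for all of them.
 * Returns 0 on success, -1 if a thread could not be created. */
int run_jobs(struct job *jobs, size_t n)
{
    pthread_t tids[16];
    size_t started = 0;
    int rc = 0;

    assert(n <= 16);
    for (; started < n; started++) {
        if (pthread_create(&tids[started], NULL, run_job, &jobs[started]) != 0) {
            rc = -1;
            break;
        }
    }
    for (size_t i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return rc;
}
```

Because every thread works on a different element of the array, no locking is needed in this particular sketch.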
An obvious problem for multithreaded programs is how to shut down cleanly (we could, of course, simply use Ctrl-C). For a single-threaded program the answer is easy: when the user quits, the program exits. But when should a program with four connected users exit? The answer is obviously "after the last user quits." But how do we know that the last user has quit? One solution is to add a variable. Each time a new thread is created, the variable is incremented by 1. Each time a thread terminates, the variable is decremented by 1. When the variable reaches zero again, the whole process shuts down.
This sounds good, but it carries the risk of the program failing.
Race conditions and mutexes
Imagine what happens if one thread is being created just as another is exiting. If the thread scheduler happens to switch between them at the wrong moment, the program will shut down inexplicably. Suppose thread 1 is executing code such as i = i + 1; while thread 2 is executing code such as i = i - 1;. For the sake of discussion, assume the initial value of i is 2.
Thread 1: fetch the value of i (2).
Thread 1: add 1 (the result is 3).
Thread 2: fetch the value of i (2).
Thread 2: subtract 1 (the result is 1).
Thread 2: store the result in i (i = 1).
Thread 1: store the result in i (i = 3).
Oops!
This situation is called a race condition. Whether it goes wrong depends on timing, so the probability of an error is very low: a race may fail only once in millions of operations, which makes it very hard to debug.
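The interleaving can be replayed by hand in a single thread, which makes the lost update deterministic. In this sketch the function name replay_lost_update() and the variables reg1 and reg2 are illustrative; each stands in for the private register a real thread would use.

```c
#include <assert.h>

/* Replays the six steps of the race in one thread and returns the final
 * value of i. */
int replay_lost_update(int i)
{
    int start = i;
    int reg1, reg2;

    reg1 = i;         /* Thread 1: fetch i */
    reg1 = reg1 + 1;  /* Thread 1: add 1 */
    reg2 = i;         /* Thread 2: fetch i (still the old value) */
    reg2 = reg2 - 1;  /* Thread 2: subtract 1 */
    i = reg2;         /* Thread 2: store */
    i = reg1;         /* Thread 1: store, overwriting thread 2's update */

    /* One increment and one decrement should cancel out, but the
     * decrement has been lost. */
    assert(i == start + 1);
    return i;
}
```

Starting from 2, the function returns 3 rather than the 2 that one increment and one decrement should produce.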
We need some way to keep thread 1 and thread 2 out of this situation: a way to guarantee that while thread 1 is working on i, no other thread touches it. You could make many interesting attempts at inventing such a method, one that ensures the two threads never conflict. You may even enjoy trying to build one yourself out of existing mechanisms.
The next concept to understand is the mutex. A mutex is a way to keep threads from stepping on each other. Think of it as a unique object that anyone may pick up, but only when nobody else is holding it, and that nobody can take from you until you give it up voluntarily. Taking hold of this object is called locking, or acquiring, the mutex. Different people learned different names for this, so others may use different words when you discuss it with them. The POSIX thread interface uses the term "lock."
Creating and using a mutex is a little more involved than just starting a thread. The mutex object must first be declared and then initialized. After that, it can be locked and unlocked.
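The declare, initialize, lock, and unlock steps can be sketched as follows. The names counter_mutex, counter, bump(), and count_with_mutex() are assumptions made for this example and do not come from the article's source files.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t counter_mutex;   /* step 1: declare the mutex */
static long counter;

static void *bump(void *arg)
{
    long times = *(long *)arg;

    for (long k = 0; k < times; k++) {
        pthread_mutex_lock(&counter_mutex);
        counter = counter + 1;          /* only one thread at a time is here */
        pthread_mutex_unlock(&counter_mutex);
    }
    return NULL;
}

/* Runs nthreads threads that each add `times` to the counter.
 * Returns the final count, or -1 if the mutex could not be initialized. */
long count_with_mutex(int nthreads, long times)
{
    pthread_t tids[8];
    int started = 0;

    assert(nthreads > 0 && nthreads <= 8);
    counter = 0;
    if (pthread_mutex_init(&counter_mutex, NULL) != 0)   /* step 2: initialize */
        return -1;
    while (started < nthreads &&
           pthread_create(&tids[started], NULL, bump, &times) == 0)
        started++;
    for (int t = 0; t < started; t++)
        pthread_join(tids[t], NULL);
    pthread_mutex_destroy(&counter_mutex);
    return counter;
}
```

With the lock in place, four threads adding 10000 each always end at 40000; without it, the total could come up short because of lost updates.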
Mutex code: first example
View dthread2.c in the browser (or in the unpacked pth directory).
For convenience, the call to pthread_create() has been moved into a new routine called spawn(). That way, the mutex code only needs to change in one subroutine. This routine does some new work: it locks count_mutex, then creates a new thread and increments threadcount, and when that is done it unlocks the mutex. When a thread is about to terminate, it locks the mutex again, decrements threadcount, and then unlocks the mutex. If threadcount has dropped to zero, we know that no threads are running and the program can exit. That statement is not completely accurate: one thread is still running, namely the one created when the process first started.
You may have noticed the call to pthread_mutex_init(). This function sets up the runtime state of the mutex, which is required for the mutex to work properly. You can call pthread_mutex_destroy() to release any resources allocated during initialization. Some implementations neither allocate nor release anything; use the API anyway. If you skip these calls, then at the worst possible moment a production system will switch to a new implementation that depends on them, and you will find out about it at three o'clock in the morning.
In spawn(), it might seem reasonable to lock the mutex only around the code that changes threadcount. It sounds good. Unfortunately, it introduces a frustrating bug: every so often, such a program prints a prompt and then just waits. Does that sound confusing? Remember that as soon as pthread_create() is called, the new thread starts executing. The sequence of events can therefore look like this:
Main thread: calls pthread_create().
New child thread: prints the prompt.
Old child thread: exits and reduces threadcount to 0.
Main thread: locks the mutex and increments threadcount.
Of course, the last step in this sequence never gets executed.
If the code that changes threadcount were moved ahead of the call to pthread_create(), the only mutex code left would be around the statement that decrements the count. The sample program is not written that way, simply to make the example more interesting.
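One way to avoid the ordering problem is to count the new thread before it exists. The sketch below is not the code from dthread2.c; the worker() body and the reached_zero variable are assumptions for illustration, and this version conservatively keeps both the increment and the decrement under the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
int threadcount;    /* number of worker threads currently alive */
int reached_zero;   /* how many times a worker saw the count hit zero */

static void *worker(void *arg)
{
    (void)arg;      /* a real worker would serve one connection here */
    pthread_mutex_lock(&count_mutex);
    if (--threadcount == 0)
        reached_zero++;   /* the real program would shut down at this point */
    pthread_mutex_unlock(&count_mutex);
    return NULL;
}

/* Count the new thread BEFORE creating it, so that an exiting thread can
 * never see the count drop to zero while this one is starting up.
 * Returns 0 on success or the error number from pthread_create(). */
int spawn(pthread_t *tid)
{
    int rc;

    assert(tid != NULL);
    pthread_mutex_lock(&count_mutex);
    threadcount++;
    pthread_mutex_unlock(&count_mutex);

    rc = pthread_create(tid, NULL, worker, NULL);
    if (rc != 0) {                      /* creation failed: undo the count */
        pthread_mutex_lock(&count_mutex);
        threadcount--;
        pthread_mutex_unlock(&count_mutex);
    }
    return rc;
}
```

Note that the count can still reach zero between connections if a worker finishes before the next one is spawned; deciding whether that should end the program is a design choice this sketch leaves open.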
Unlock deadlocks
This article mentioned earlier a subtle potential race condition in the original program. Now the answer can be revealed: the hard-to-find race condition is that rand() keeps internal state. If rand() is called from two threads at overlapping times, it may return an incorrect random number. For this program that is probably not a big problem, but for a serious simulation that depends on reproducible random numbers, it is.
So let's go a step further and add a mutex around the place where random numbers are generated. That neatly takes care of this race condition.
Browse dthread3.c, or open the file in the pth directory.
Unfortunately, there is still another potential problem, called deadlock. A deadlock occurs when two or more threads are each waiting for the other. Imagine two mutexes, which we will call count_mutex and rand_mutex. Now two threads both need both mutexes. Thread 1 does the following:
mutex_lock(&count_mutex);
mutex_lock(&rand_mutex);
mutex_unlock(&rand_mutex);
mutex_unlock(&count_mutex);
Thread 2 executes these statements in another order:
mutex_lock(&rand_mutex);
mutex_lock(&count_mutex);
mutex_unlock(&count_mutex);
mutex_unlock(&rand_mutex);
Now a deadlock can occur. If the two threads start at the same time, each takes its first lock at the same time:
Thread 1: mutex_lock(&count_mutex);
Thread 2: mutex_lock(&rand_mutex);
What happens next? If thread 1 runs, it tries to lock rand_mutex, but that mutex is already held by thread 2. If thread 2 runs, it tries to lock count_mutex, but that mutex is already held by thread 1. The situation recalls an apocryphal Texas statute: "When two trains meet at a crossing, both shall stop, and neither shall proceed until the other has gone."
A simple solution to this kind of problem is to make sure that mutexes are always acquired in the same order. In the same spirit, a simple way to keep every train safe is to keep control over the trains. In real programs, the calls that lead to a deadlock are rarely lined up so neatly. Even in our simple example program (where arranging the order of the calls properly would avoid the deadlock), the calls to the mutexes are not directly adjacent.
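The same-order rule is easiest to enforce when a single helper is the only code that takes both locks. In the sketch below, lock_both(), unlock_both(), use_both(), run_ordered(), and shared_total are hypothetical names for illustration; only count_mutex and rand_mutex come from the discussion above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rand_mutex = PTHREAD_MUTEX_INITIALIZER;
int shared_total;

/* The single place that takes both locks: always count_mutex first. */
static void lock_both(void)
{
    pthread_mutex_lock(&count_mutex);
    pthread_mutex_lock(&rand_mutex);
}

/* Release in the reverse order of acquisition. */
static void unlock_both(void)
{
    pthread_mutex_unlock(&rand_mutex);
    pthread_mutex_unlock(&count_mutex);
}

static void *use_both(void *arg)
{
    int rounds = *(int *)arg;

    for (int k = 0; k < rounds; k++) {
        lock_both();
        shared_total++;
        unlock_both();
    }
    return NULL;
}

/* Two threads contend for both mutexes. Returns the final total,
 * or -1 if a thread could not be created. */
int run_ordered(int rounds)
{
    pthread_t a, b;

    assert(rounds >= 0);
    shared_total = 0;
    if (pthread_create(&a, NULL, use_both, &rounds) != 0)
        return -1;
    if (pthread_create(&b, NULL, use_both, &rounds) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_total;
}
```

Because neither thread can hold rand_mutex while waiting for count_mutex, the circular wait described above cannot form.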
Now turn to the next example, dthread4.
This version of the program demonstrates a common source of deadlocks: programmer error.
This program allows several dice specifications on one line, and it restricts the dice to the common polyhedral dice used in role-playing games. The developer of this flawed program started with a good idea, locking each mutex only when it was really needed, but then did not unlock any of them until the whole operation was over. So if you enter "2d6 2d8", the program locks the six-sided die, rolls it twice, then locks the eight-sided die and rolls it twice. The dice are unlocked only when all the rolling is finished.
Unlike the earlier versions, this version can deadlock easily. Imagine two users requesting dice at the same time, one asking for "2d6 2d8" and the other for "2d8 2d6". What happens?
Thread 1: locks the six-sided die and rolls.
Thread 2: locks the eight-sided die and rolls.
Thread 1: tries to lock the eight-sided die.
Thread 2: tries to lock the six-sided die.
The "clever" solution turns out to be no solution at all. If each roller released a die as soon as the roll was finished, this problem would not arise.
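Releasing each die immediately after the throw can be sketched like this. The struct die type and its functions are hypothetical and are not taken from dthread4.c or dthread5.c; the small linear congruential generator is an illustrative stand-in for a real per-die random source.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical die object: one mutex and one generator state per die. */
struct die {
    pthread_mutex_t lock;
    unsigned int state;   /* private generator state, guarded by lock */
    int sides;
};

/* Returns 0 on success or the error number from pthread_mutex_init(). */
int die_init(struct die *d, int sides, unsigned int seed)
{
    assert(d != NULL && sides > 0);
    d->sides = sides;
    d->state = seed;
    return pthread_mutex_init(&d->lock, NULL);
}

/* Hold the die only for the duration of one throw, so a thread never
 * owns one die while waiting for another. */
int die_roll(struct die *d)
{
    int value;

    pthread_mutex_lock(&d->lock);
    d->state = d->state * 1103515245u + 12345u;   /* illustrative LCG step */
    value = (int)((d->state >> 16) % (unsigned int)d->sides) + 1;
    pthread_mutex_unlock(&d->lock);   /* released before any other die is touched */
    return value;
}
```

Since no thread ever holds two dice at once, the order in which users ask for them no longer matters.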
The first lesson here is that it is possible to take a simulation too far. Frankly, locking individual dice is silly. The second lesson, however, is that you cannot always see a deadlock by reading the code. Which mutexes get locked depends on data that is available only at run time, and that is the crux of the problem.
If you want to see this bug in action, try the -S option. It gives you enough time to switch between terminals and watch the program run. Now, how would you fix it? For the sake of discussion, assume that locking individual dice really is necessary. How would you do it? dthread5.c is one naive solution. At this point you could go back and give each die its own random number generator. Serious players know that a die has only so many good rolls in it, and you would not want to waste them, would you?
Deadlocks can also happen within a single thread. The default mutex type is "fast." This kind of mutex has one notable property: if you try to lock a mutex that you have already locked, you deadlock. Where possible, design your program so that it never locks a mutex it already holds. Alternatively, you can use a recursive mutex, which lets the same thread lock the same mutex several times. Another type of mutex detects common errors, such as unlocking a mutex that is not locked. Note that recursive mutexes will not fix real locking bugs in your program. The following code snippet is taken from an earlier version of dthread3.c:
Listing 1. Code snippet from dthread3.c that contains a bug
int roll_die(int n) {
    pthread_mutex_lock(&rand_mutex);
    return rand() % n + 1;
    pthread_mutex_unlock(&rand_mutex);
}
See if you can spot the bug faster than I did (it took me about five minutes). To be fair, try starting your hunt in the small hours of the morning. You can find the answer in the "Exercise answer" sidebar.
A full discussion of deadlocks is beyond the scope of this article, but now you know what to look for, and you can find more information about mutexes and deadlocks in the References section.
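The mutex types mentioned in this section are selected through a mutex attribute object. The sketch below uses the standard pthread_mutexattr_settype() call with PTHREAD_MUTEX_ERRORCHECK and PTHREAD_MUTEX_RECURSIVE; the helper names mutex_init_typed() and lock_twice() are assumptions for illustration, and the feature-test macro at the top is there only to make sure the type constants are declared.

```c
#define _XOPEN_SOURCE 700   /* expose the mutex type constants */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initialize m with the given type, for example PTHREAD_MUTEX_ERRORCHECK
 * or PTHREAD_MUTEX_RECURSIVE. Returns 0 on success or an error number. */
int mutex_init_typed(pthread_mutex_t *m, int type)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, type);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock the same mutex twice from one thread and return the result of the
 * second attempt: 0 if it succeeded, or an error number such as EDEADLK.
 * Do not call this on a default ("fast") mutex: it would hang. */
int lock_twice(pthread_mutex_t *m)
{
    int first, second;

    first = pthread_mutex_lock(m);
    assert(first == 0);
    (void)first;
    second = pthread_mutex_lock(m);
    if (second == 0)
        pthread_mutex_unlock(m);   /* undo the second lock */
    pthread_mutex_unlock(m);       /* undo the first lock */
    return second;
}
```

With an error-checking mutex the second lock attempt fails with EDEADLK instead of hanging, and with a recursive mutex it succeeds, provided every lock is matched by an unlock.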
Condition variables
A condition variable is another interesting kind of variable. It lets a thread block until some condition is met; when the condition becomes true, the thread is awakened. The function pthread_cond_wait() is what blocks a thread. It takes two arguments: the first is a pointer to a condition variable, and the second is a pointer to a locked mutex. Like a mutex, a condition variable must be initialized with an API call, in this case pthread_cond_init(). When a condition variable is no longer needed, call pthread_cond_destroy() to release the resources allocated during initialization. As with mutexes, these calls may do nothing in some implementations, but you should make them anyway.
When pthread_cond_wait() is called, it unlocks the mutex and suspends the thread. The thread stays suspended until some other thread wakes it. These two operations are atomic: they always happen together, with no other thread running in between. Once they are done, other threads can run. If another thread calls pthread_cond_signal() on the condition variable, a thread that is blocked waiting on that condition is awakened. If another thread calls pthread_cond_broadcast() on the condition variable, all threads that are blocked waiting on that condition are awakened.
Finally, when a thread that called pthread_cond_wait() is awakened, the first thing it does is re-lock the mutex that it unlocked during the initial call. This takes some time. In fact, it can take long enough that the condition the thread was waiting for changes in the meantime. For example, a thread waiting for items to be added to a linked list may find the list empty when it wakes up. On some implementations, threads may occasionally wake up even though no signal was sent to the condition variable. Thread programming is not always exact, so keep this in mind whenever you write threaded code.
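Because the condition may have changed by the time the waiter re-locks the mutex, and because wakeups may occur without a signal, the usual idiom is to call pthread_cond_wait() inside a loop that re-checks the condition. The one-slot mailbox below is a hypothetical example built for this article, not code from the dice program.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox shared by a producer and a consumer. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    int has_item;
    int item;
};

int mailbox_init(struct mailbox *mb)
{
    mb->has_item = 0;
    mb->item = 0;
    if (pthread_mutex_init(&mb->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&mb->nonempty, NULL) != 0) {
        pthread_mutex_destroy(&mb->lock);
        return -1;
    }
    return 0;
}

void mailbox_destroy(struct mailbox *mb)
{
    pthread_cond_destroy(&mb->nonempty);
    pthread_mutex_destroy(&mb->lock);
}

/* Store a value (overwriting any unread one) and wake one waiter. */
void mailbox_put(struct mailbox *mb, int value)
{
    assert(mb != NULL);
    pthread_mutex_lock(&mb->lock);
    mb->item = value;
    mb->has_item = 1;
    pthread_cond_signal(&mb->nonempty);
    pthread_mutex_unlock(&mb->lock);
}

/* Block until a value is available, then remove and return it. */
int mailbox_take(struct mailbox *mb)
{
    int value;

    assert(mb != NULL);
    pthread_mutex_lock(&mb->lock);
    /* Re-check after every wakeup: the item may already be gone, or the
     * wakeup may not correspond to any signal at all. */
    while (!mb->has_item)
        pthread_cond_wait(&mb->nonempty, &mb->lock);
    value = mb->item;
    mb->has_item = 0;
    pthread_mutex_unlock(&mb->lock);
    return value;
}

struct put_args {
    struct mailbox *mb;
    int value;
};

static void *producer(void *arg)
{
    struct put_args *a = arg;

    mailbox_put(a->mb, a->value);
    return NULL;
}

/* Start a producer thread, wait for its item, and return it (-1 on error). */
int mailbox_demo(int value)
{
    struct mailbox mb;
    struct put_args args = { &mb, value };
    pthread_t tid;
    int got;

    if (mailbox_init(&mb) != 0)
        return -1;
    if (pthread_create(&tid, NULL, producer, &args) != 0) {
        mailbox_destroy(&mb);
        return -1;
    }
    got = mailbox_take(&mb);
    pthread_join(tid, NULL);
    mailbox_destroy(&mb);
    return got;
}
```

The loop works whether the producer runs before or after the consumer starts waiting, because the has_item flag, not the signal itself, records that something is available.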
More
Of course, the above only scratches the surface of the POSIX thread API. Some readers may have noticed the pthread_join() call, which makes one thread wait until another thread has finished. There are attributes that can be set to control scheduling. There are many references on POSIX threads on the web. Read the manual pages, too: the command apropos pthread or man -k pthread lists the manual pages related to pthreads.
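As a brief illustration of pthread_join(), the sketch below waits for a thread and collects the pointer it returned. The functions square() and square_in_thread() are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Thread function: returns its answer through a heap-allocated int. */
static void *square(void *arg)
{
    int n = *(int *)arg;
    int *out = malloc(sizeof *out);

    if (out != NULL)
        *out = n * n;
    return out;   /* handed to whichever thread joins this one */
}

/* Compute n * n in another thread and wait for it with pthread_join().
 * Returns 0 on success, -1 on failure. */
int square_in_thread(int n, int *result)
{
    pthread_t tid;
    void *ret = NULL;

    assert(result != NULL);
    if (pthread_create(&tid, NULL, square, &n) != 0)
        return -1;
    if (pthread_join(tid, &ret) != 0 || ret == NULL)
        return -1;
    *result = *(int *)ret;
    free(ret);
    return 0;
}
```

Passing the address of n is safe here only because square_in_thread() does not return until the thread has been joined.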
Exercise answer
The code in Listing 1 has an obvious but easily overlooked bug. Did you find it?
The function returns before the mutex is unlocked, so the call to pthread_mutex_unlock() is never reached.
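One possible correction to Listing 1 is to store the roll in a local variable, unlock, and only then return. This is a sketch under the assumption that rand_mutex is an ordinary mutex; the static initializer is used here for brevity and may differ from how dthread3.c sets it up.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

pthread_mutex_t rand_mutex = PTHREAD_MUTEX_INITIALIZER;

int roll_die(int n)
{
    int value;

    assert(n > 0);
    pthread_mutex_lock(&rand_mutex);
    value = rand() % n + 1;              /* computed while holding the lock */
    pthread_mutex_unlock(&rand_mutex);   /* this line is now reached */
    return value;
}
```

After each call the mutex is free again, so a second caller no longer waits forever.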
References
- For more information, see the original article on the developerWorks global site.
- All files used in this article can be downloaded together as a single tar file or separately from the following links:
- Makefile
- dthread1.c
- dthread2.c
- dthread3.c
- dthread4.c
- dthread5.c
- In an earlier three-part developerWorks series on POSIX threads, Daniel Robbins introduced the POSIX thread API.
- The Open Directory has a POSIX threads category listing many libraries, tutorials, and FAQs.
- The IBM AIX resources on the POSIX thread APIs, from IBM eServer solutions, are useful material.
- The two chapters "Understanding threads and processes" and "Multi-threaded programming" of the AIX book General Programming Concepts: Writing and Debugging Programs also apply to thread programming on other platforms.
- GNU Portable Threads, or GNU Pth, provides a free software implementation of POSIX-compatible threads, and GNU software manuals have always been very good.
- "Going to Linux 2.6" discusses kernel preemption, futexes, the new scheduler, and more of the latest changes (developerWorks, November 2003).
- In "Manage processes and threads" (developerWorks, February 2002), Ed Bradford looks at threads and processes in Linux and Windows environments.
- For more information about POSIX threads, including joining and scheduling threads, read the online tutorials "POSIX Threads Programming" at Lawrence Livermore National Laboratory and "Multi-Threaded Programming With POSIX Threads" from the Little UNIX Programmers Group (LUPG).
- Here is a guide to understanding mutexes and locks. Although it is not based on POSIX threads, the issues it covers are universal.
- The manual pages for pthread_create, pthread_join, pthread_mutex_init, and pthread_cond_init are very useful. You can view them in a terminal window or find them in the online Linux man pages.
- Peter tested his dice program on SuSE Linux and NetBSD.
- Not everyone is keen on threads. In a 1996 USENIX talk, John Ousterhout, the inventor of Tcl (then at Sun, now at Electric Cloud, Inc.), argued that events are often a better choice than threads. You can learn about his views from his personal page at Bell Labs or from the slides of his 1996 talk (in PDF format).
- Anne McCaffrey is a writer known for her stories about the dragons of Pern. The terrible silver Thread falls on the planet Pern, to the dismay of everyone concerned. One question for Anne: how is it that people who came from Earth, flying a ship over a vast distance to colonize Pern, have still not invented the telescope by the second book?
- Networked dice programs are very useful for role-playing games (RPGs). Have you never played such games? See "What is role-playing?" by Paul Elliot, or Dave's introduction to role-playing. If you are sure you understand, go and get the Amulet of Yendor in NetHack.