Transferred from: https://www.ibm.com/developerworks/cn/linux/l-cn-deadlock/
Brief introduction
Deadlock: a situation in which two or more processes (threads) wait on one another during execution because they are competing for resources, so that none of them can proceed without outside intervention. The system is then said to be in a deadlock state, and the processes (threads) that wait on one another indefinitely are called deadlocked processes (threads). Because the resources involved are held exclusively, a process (thread) that requests a resource which another one holds and never releases will never be granted it and cannot continue to run. This is what produces a deadlock.
A cross-lock deadlock is a scenario in which two or more threads of a program are blocked (waiting) permanently, each waiting for a resource that is held by another blocked thread. For example, if thread 1 locks record A and waits for record B while thread 2 locks record B and waits for record A, the two threads deadlock. In computer systems, an inappropriate resource allocation policy can cause deadlock, but the more common cause is a programming error that makes processes compete for resources improperly.
Four necessary conditions for generating a deadlock
(1) Mutual exclusion condition: a resource can be used by only one process (thread) at a time.
(2) Hold and wait condition: a process (thread) that is blocked while requesting a resource keeps holding the resources it has already acquired.
(3) No preemption condition: resources that a process (thread) has acquired cannot be taken away from it by force before it has finished using them.
(4) Circular wait condition: several processes (threads) form a cycle in which each one waits for a resource held by the next.
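Condition (4) is often the easiest one to break in application code. The following is a minimal sketch, not taken from the original article, in which every thread acquires a pair of mutexes in one global order (lowest address first), so a cycle of waiting threads cannot form. The names lock_pair_in_order, unlock_pair, worker_ab, worker_ba and run_ordered_lock_demo are hypothetical and chosen only for illustration.

```cpp
#include <pthread.h>
#include <functional>

// Hypothetical helper: always locks the mutex with the lower address first,
// so all threads acquire any pair of mutexes in the same global order.
// Assumes a and b point to two different mutexes.
static void lock_pair_in_order(pthread_mutex_t* a, pthread_mutex_t* b) {
    if (std::less<pthread_mutex_t*>()(b, a)) {
        pthread_mutex_t* tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_pair(pthread_mutex_t* a, pthread_mutex_t* b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter = 0;

// This worker names the locks as (A, B).
static void* worker_ab(void*) {
    for (int i = 0; i < 10000; ++i) {
        lock_pair_in_order(&lock_a, &lock_b);
        ++shared_counter;
        unlock_pair(&lock_a, &lock_b);
    }
    return NULL;
}

// This worker names the locks as (B, A); the helper still locks them
// in the same order as worker_ab, so no circular wait is possible.
static void* worker_ba(void*) {
    for (int i = 0; i < 10000; ++i) {
        lock_pair_in_order(&lock_b, &lock_a);
        ++shared_counter;
        unlock_pair(&lock_b, &lock_a);
    }
    return NULL;
}

// Runs both workers to completion and returns the final counter value,
// or -1 if a thread could not be created.
int run_ordered_lock_demo() {
    pthread_t t1, t2;
    shared_counter = 0;
    if (pthread_create(&t1, NULL, worker_ab, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&t2, NULL, worker_ba, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

Ordering by address is only one possible convention. Any fixed order that all threads agree on, such as a numeric rank assigned to each lock, has the same effect.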
Figure 1. Deadlock with cross-lock:
Note: After func2 and func4 have executed, child thread 1 holds lock A and is trying to acquire lock B, while child thread 2 holds lock B and is trying to acquire lock A. Neither thread can obtain its second lock, because each lock is held by the other thread and is never released, so a deadlock occurs.
Analyzing deadlocked programs with the pstack and GDB tools
A brief introduction to pstack on the Linux platform
pstack is a useful tool on Linux (for example Red Hat Linux and Ubuntu Linux) that prints the stack information of a given process, including the call stack of every thread in it.
A brief introduction to GDB on the Linux platform
GDB is a powerful program debugging tool for UNIX released by the GNU open source organization. Linux systems include GDB, the GNU debugger for C and C++ programs, which lets developers observe a program's internal structure and memory usage while it runs.
Some of the main features that GDB provides are as follows:
1. Run the program, setting the arguments and environment that can affect how it runs;
2. Stop the program under specified conditions;
3. Examine the state of the program once it has stopped;
4. Examine the core file when the program crashes;
5. Fix an error in the program and run it again;
6. Watch the values of variables in the program dynamically;
7. Step through the code to observe the running state of the program.
GDB debugs executable files or processes, not source code files. Not every executable can be debugged with GDB, however. For an executable to be debuggable, add the -g option when compiling with g++ (or gcc), so that debugging information is included at compile time. The debugging information contains the type of every variable in the program, its address mapping in the executable, and the source line numbers. GDB uses this information to relate the source code to the machine code. GDB has many basic commands that are not covered in detail here; refer to the GDB manual for more information.
Listing 1. Test program
#include <unistd.h>
#include <pthread.h>
#include <string.h>

pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t mutex2 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t mutex3 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t mutex4 = PTHREAD_MUTEX_INITIALIZER;

static int sequence1 = 0;
static int sequence2 = 0;

int func1()
{
    pthread_mutex_lock(&mutex1);   // takes mutex1 first ...
    ++sequence1;
    sleep(1);                      // gives thread2 time to take mutex2
    pthread_mutex_lock(&mutex2);   // ... then waits for mutex2
    ++sequence2;
    pthread_mutex_unlock(&mutex2);
    pthread_mutex_unlock(&mutex1);

    return sequence1;
}

int func2()
{
    pthread_mutex_lock(&mutex2);   // takes mutex2 first ...
    ++sequence2;
    sleep(1);                      // gives thread1 time to take mutex1
    pthread_mutex_lock(&mutex1);   // ... then waits for mutex1: opposite order to func1
    ++sequence1;
    pthread_mutex_unlock(&mutex1);
    pthread_mutex_unlock(&mutex2);

    return sequence2;
}

void* thread1(void* arg)
{
    while (1)
    {
        int iRetValue = func1();

        if (iRetValue == 100000)
        {
            pthread_exit(NULL);
        }
    }
}

void* thread2(void* arg)
{
    while (1)
    {
        int iRetValue = func2();

        if (iRetValue == 100000)
        {
            pthread_exit(NULL);
        }
    }
}

// thread3 and thread4 take no locks; they only sleep and fill a buffer.
void* thread3(void* arg)
{
    while (1)
    {
        sleep(1);
        char szBuf[128];
        memset(szBuf, 0, sizeof(szBuf));
        strcpy(szBuf, "thread3");
    }
}

void* thread4(void* arg)
{
    while (1)
    {
        sleep(1);
        char szBuf[128];
        memset(szBuf, 0, sizeof(szBuf));
        strcpy(szBuf, "thread3");
    }
}

int main()
{
    pthread_t tid[4];
    if (pthread_create(&tid[0], NULL, &thread1, NULL) != 0)
    {
        _exit(1);
    }
    if (pthread_create(&tid[1], NULL, &thread2, NULL) != 0)
    {
        _exit(1);
    }
    if (pthread_create(&tid[2], NULL, &thread3, NULL) != 0)
    {
        _exit(1);
    }
    if (pthread_create(&tid[3], NULL, &thread4, NULL) != 0)
    {
        _exit(1);
    }

    sleep(5);
    pthread_cancel(tid[0]);

    pthread_join(tid[0], NULL);    // main blocks here once thread1 is deadlocked
    pthread_join(tid[1], NULL);
    pthread_join(tid[2], NULL);
    pthread_join(tid[3], NULL);

    pthread_mutex_destroy(&mutex1);
    pthread_mutex_destroy(&mutex2);
    pthread_mutex_destroy(&mutex3);
    pthread_mutex_destroy(&mutex4);

    return 0;
}
Listing 2. Compiling the test program
$ g++ -g lock.cpp -o lock -lpthread
Listing 3. Find the process number of the test program
$ ps -ef | grep lock
dyu       6721  5751  0 15:21 pts/3    00:00:00 ./lock
Listing 4. Output of the first pstack run (pstack <process number>) against the deadlocked process
$ pstack 6721
Thread 5 (Thread 0x41e37940 (LWP 6722)):
#0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003d1a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
#2  0x0000003d1a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000400a9b in func1() ()
#4  0x0000000000400ad7 in thread1(void*) ()
#5  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
#6  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x42838940 (LWP 6723)):
#0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003d1a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
#2  0x0000003d1a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000400a17 in func2() ()
#4  0x0000000000400a53 in thread2(void*) ()
#5  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
#6  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x43239940 (LWP 6724)):
#0  0x0000003d19c9a541 in nanosleep () from /lib64/libc.so.6
#1  0x0000003d19c9a364 in sleep () from /lib64/libc.so.6
#2  0x00000000004009bc in thread3(void*) ()
#3  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
#4  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x43c3a940 (LWP 6725)):
#0  0x0000003d19c9a541 in nanosleep () from /lib64/libc.so.6
#1  0x0000003d19c9a364 in sleep () from /lib64/libc.so.6
#2  0x0000000000400976 in thread4(void*) ()
#3  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
#4  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x2b984ecabd90 (LWP 6721)):
#0  0x0000003d1a807b35 in pthread_join () from /lib64/libpthread.so.0
#1  0x0000000000400900 in main ()
Listing 5. Output of the second pstack run (pstack <process number>) against the deadlocked process
$ pstack 6721
Thread 5 (Thread 0x40bd6940 (LWP 6722)):
#0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003d1a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
#2  0x0000003d1a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000400a87 in func1() ()
#4  0x0000000000400ac3 in thread1(void*) ()
#5  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
#6  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x415d7940 (LWP 6723)):
#0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003d1a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
#2  0x0000003d1a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000400a03 in func2() ()
#4  0x0000000000400a3f in thread2(void*) ()
#5  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
#6  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x41fd8940 (LWP 6724)):
#0  0x0000003d19c7aec2 in memset () from /lib64/libc.so.6
#1  0x00000000004009be in thread3(void*) ()
#2  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
#3  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x429d9940 (LWP 6725)):
#0  0x0000003d19c7ae0d in memset () from /lib64/libc.so.6
#1  0x0000000000400982 in thread4(void*) ()
#2  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
#3  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x2af906fd9d90 (LWP 6721)):
#0  0x0000003d1a807b35 in pthread_join () from /lib64/libpthread.so.0
#1  0x0000000000400900 in main ()
Check the call stacks of the process several times: when the process hangs, run pstack against it repeatedly and compare the outputs. A deadlocked thread stays in the lock-waiting state the whole time, so look for the threads whose stacks never change and that remain waiting for a lock. Those threads are the suspects (there may be two of them, or more).
Output Analysis:
Comparing the two outputs shows that thread 2 and thread 3 moved from the sleep function in the first pstack output to the memset function in the second, so they are still running. Thread 1 is the main thread, which sits in pthread_join waiting for the child threads. Thread 4 and thread 5, however, stay in the same lock-waiting state (pthread_mutex_lock) in both outputs, with no change between two consecutive pstack runs, so we can suspect that thread 4 and thread 5 are deadlocked.
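The comparison can also be done mechanically. The sketch below is an illustration only, not part of the original article: it assumes pstack output in the layout of Listings 4 and 5, compares function names rather than addresses, and reports the LWP of every thread whose stack is identical in both snapshots and contains pthread_mutex_lock. The names StackMap, parse_pstack and find_lock_suspects are hypothetical.

```cpp
#include <algorithm>
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Maps an LWP id to the function names on that thread's stack, innermost first.
typedef std::map<int, std::vector<std::string> > StackMap;

// Parses pstack text: "Thread N (Thread 0x... (LWP id)):" starts a thread,
// and "#k  0xADDR in name ..." adds one frame to the current thread.
StackMap parse_pstack(const std::string& text) {
    StackMap stacks;
    std::istringstream in(text);
    std::string line;
    int current_lwp = -1;
    while (std::getline(in, line)) {
        std::string::size_type lwp_pos = line.find("(LWP ");
        if (line.compare(0, 6, "Thread") == 0 && lwp_pos != std::string::npos) {
            current_lwp = std::atoi(line.c_str() + lwp_pos + 5);
            stacks[current_lwp];  // create an empty stack for this thread
        } else if (!line.empty() && line[0] == '#' && current_lwp != -1) {
            std::string::size_type in_pos = line.find(" in ");
            if (in_pos == std::string::npos) {
                continue;
            }
            std::string::size_type start = in_pos + 4;
            std::string::size_type end = line.find_first_of(" (", start);
            std::string::size_type count =
                (end == std::string::npos) ? std::string::npos : end - start;
            stacks[current_lwp].push_back(line.substr(start, count));
        }
    }
    return stacks;
}

// Returns the LWPs (in ascending order) whose stacks did not change between
// the two snapshots and that are inside pthread_mutex_lock.
std::vector<int> find_lock_suspects(const std::string& first,
                                    const std::string& second) {
    StackMap a = parse_pstack(first);
    StackMap b = parse_pstack(second);
    std::vector<int> suspects;
    for (StackMap::const_iterator it = a.begin(); it != a.end(); ++it) {
        StackMap::const_iterator other = b.find(it->first);
        if (other == b.end() || other->second != it->second) {
            continue;  // thread vanished or its stack changed: still making progress
        }
        if (std::find(it->second.begin(), it->second.end(),
                      "pthread_mutex_lock") != it->second.end()) {
            suspects.push_back(it->first);
        }
    }
    return suspects;
}
```

An unchanged stack is only a hint: a thread can also wait legitimately on a lock that is held for a long time, so the suspects still need to be confirmed, for example with GDB.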
Inspecting the threads with GDB
Output:
Listing 6. Attaching GDB to the deadlocked process
(gdb) info thread
  5 Thread 0x41e37940 (LWP 6722)  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
  4 Thread 0x42838940 (LWP 6723)  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
  3 Thread 0x43239940 (LWP 6724)  0x0000003d19c9a541 in nanosleep () from /lib64/libc.so.6
  2 Thread 0x43c3a940 (LWP 6725)  0x0000003d19c9a541 in nanosleep () from /lib64/libc.so.6
* 1 Thread 0x2b984ecabd90 (LWP 6721)  0x0000003d1a807b35 in pthread_join () from /lib64/libpthread.so.0
Listing 7. Output after switching to thread 5
(gdb) thread 5
[Switching to thread 5 (Thread 0x41e37940 (LWP 6722))]
#0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) where
#0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003d1a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
#2  0x0000003d1a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000400a9b in func1 () at lock.cpp:18
#4  0x0000000000400ad7 in thread1 (arg=0x0) at lock.cpp:43
#5  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
#6  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
Listing 8. Output for thread 4 and thread 5
(gdb) f 3
#3  0x0000000000400a9b in func1 () at lock.cpp:18
18          pthread_mutex_lock(&mutex2);
(gdb) thread 4
[Switching to thread 4 (Thread 0x42838940 (LWP 6723))]
#0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) f 3
#3  0x0000000000400a17 in func2 () at lock.cpp:31
31          pthread_mutex_lock(&mutex1);
(gdb) p mutex1
$ = {__data = {__lock = 2, __count = 0, __owner = 6722, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000B\032\000\000\001", '\000' <repeats times>, __align = 2}
(gdb) p mutex3
$ = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats times>, __align = 0}
(gdb) p mutex2
$ = {__data = {__lock = 2, __count = 0, __owner = 6723, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000C\032\000\000\001", '\000' <repeats times>, __align = 2}
(gdb)
As can be seen from the above, thread 4 is trying to acquire mutex1, but mutex1 is already held by the thread with LWP 6722 (__owner = 6722); thread 5 is trying to acquire mutex2, but mutex2 is already held by LWP 6723 (__owner = 6723). The pstack output shows that LWP 6722 corresponds to thread 5 and LWP 6723 to thread 4. We can therefore conclude that thread 4 and thread 5 are in a cross-lock deadlock. The source code confirms it: both threads use mutex1 and mutex2, but they acquire them in opposite orders (func1 locks mutex1 and then mutex2, func2 locks mutex2 and then mutex1).
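The reasoning above amounts to finding a cycle in a "waits for" relation between threads. As a rough sketch (not from the original article), the function below takes a map from each blocked thread's LWP to the LWP in the __owner field of the mutex it is waiting for, and returns the threads that form a cycle. The name find_wait_cycle is hypothetical, and the sketch assumes each blocked thread waits for exactly one mutex that has a single owner.

```cpp
#include <map>
#include <set>
#include <vector>

// waits_for[x] == y means: thread x is blocked on a mutex owned by thread y.
// Returns the threads of the first wait cycle found, or an empty vector.
std::vector<int> find_wait_cycle(const std::map<int, int>& waits_for) {
    for (std::map<int, int>::const_iterator start = waits_for.begin();
         start != waits_for.end(); ++start) {
        std::vector<int> path;
        std::set<int> seen;
        int current = start->first;
        // Follow the chain of owners until it ends or revisits a thread.
        while (waits_for.count(current) != 0 && seen.insert(current).second) {
            path.push_back(current);
            current = waits_for.find(current)->second;
        }
        // The chain closed on its starting thread: every thread on the
        // path is part of the cycle.
        if (!path.empty() && current == start->first) {
            return path;
        }
    }
    return std::vector<int>();
}
```

With the values read from GDB in Listing 8 (LWP 6723 waits for the owner of mutex1, LWP 6722, and LWP 6722 waits for the owner of mutex2, LWP 6723), the function reports both threads as one cycle.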
Summary
This article briefly introduced a method for analyzing deadlock problems on the Linux platform, which should help with some kinds of deadlock. Understanding the causes of deadlock, especially the four necessary conditions, makes it possible to avoid, prevent, and break deadlocks as far as possible. In system design and process scheduling, the questions are therefore how to keep the four necessary conditions from all holding at once and how to choose a sound resource allocation algorithm, so that no process holds system resources permanently. In addition, to keep processes from holding resources while they wait, the system can dynamically check at run time every resource request that it could grant, and decide from the result whether to allocate: if granting the request could lead to deadlock, the resource is not allocated; otherwise it is. Planning resource allocation carefully, using ordered resource allocation and the Banker's algorithm, is thus an effective way to avoid deadlock.
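To make the dynamic check concrete, here is a small sketch of the safety test at the core of the Banker's algorithm. It is an illustration under stated assumptions rather than a complete allocator: the caller supplies, per process, the resources currently allocated and the resources still needed, and the name is_safe_state and the type Row are hypothetical.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<int> Row;  // one count per resource type

// Returns true if the processes can all finish in some order, given the
// free resources (available), what each process holds (allocation) and
// what each process may still request (need). Assumes all rows have the
// same length as available.
bool is_safe_state(Row available,
                   const std::vector<Row>& allocation,
                   const std::vector<Row>& need) {
    const std::size_t process_count = allocation.size();
    std::vector<bool> finished(process_count, false);
    std::size_t finished_count = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (std::size_t p = 0; p < process_count; ++p) {
            if (finished[p]) {
                continue;
            }
            bool can_finish = true;
            for (std::size_t r = 0; r < available.size(); ++r) {
                if (need[p][r] > available[r]) {
                    can_finish = false;
                    break;
                }
            }
            if (!can_finish) {
                continue;
            }
            // Process p can run to completion and hand back what it holds.
            for (std::size_t r = 0; r < available.size(); ++r) {
                available[r] += allocation[p][r];
            }
            finished[p] = true;
            ++finished_count;
            progress = true;
        }
    }
    return finished_count == process_count;
}
```

A request would be granted only if the state that results from granting it still passes this test. Two processes that each hold one unit of a different resource and each need the unit held by the other fail the test, which matches the cross-lock case analyzed in this article.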
A simple way to analyze deadlocks on Linux