A simple method to analyze the deadlock problem in Linux


Deadlock: a situation in which two or more processes (threads) wait for each other because they compete for resources during execution. Without outside intervention, none of them can proceed. The system is then said to be in a deadlock state, and the processes (threads) that wait for each other indefinitely are called deadlocked processes (threads). Because resources are used in a mutually exclusive way, once one process holds a resource, another process (thread) that requests it cannot obtain it without outside intervention and cannot continue to run; this is how a deadlock arises.

In other words, two or more threads of a running program are permanently blocked (waiting), each waiting for a resource that is held by another blocked thread. For example, if thread 1 locks record A and waits for record B while thread 2 locks record B and waits for record A, the two threads are deadlocked. A deadlock can result from a flawed resource allocation policy in the system, but more commonly it results from a programming error that causes improper competition for resources.


Four necessary conditions for deadlock

(1) Mutual exclusion: a resource can be used by only one process (thread) at a time.
(2) Hold and wait: a process (thread) that is blocked while requesting a resource keeps holding the resources it has already obtained.
(3) No preemption: a resource that a process (thread) has obtained cannot be forcibly taken away before the process (thread) has finished using it.
(4) Circular wait: several processes (threads) form a cycle in which each one waits for a resource held by the next.
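
A deadlock needs all four conditions at once, so breaking any one of them is enough to prevent it. The following is a minimal sketch, not part of the original article's test program, that breaks the hold-and-wait condition: the second lock is requested with pthread_mutex_trylock, and if it is busy the first lock is released before retrying. The names lock_a, lock_b, lock_both and run_backoff_demo are hypothetical and chosen only for illustration.

```cpp
#include <pthread.h>
#include <sched.h>

// Hypothetical locks and counter, used only in this sketch.
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;   // modified only while both locks are held

// Acquire `first`, then try `second` without blocking. If `second` is busy,
// release `first` and retry, so the caller never waits while holding a lock.
static void lock_both(pthread_mutex_t* first, pthread_mutex_t* second)
{
    for (;;)
    {
        pthread_mutex_lock(first);
        if (pthread_mutex_trylock(second) == 0)
        {
            return;                    // both locks are now held
        }
        pthread_mutex_unlock(first);   // back off instead of holding and waiting
        sched_yield();                 // let the other thread make progress
    }
}

// Takes the locks in the order A, B.
static void* worker_ab(void* arg)
{
    const long rounds = *static_cast<long*>(arg);
    for (long i = 0; i < rounds; ++i)
    {
        lock_both(&lock_a, &lock_b);
        ++shared_counter;
        pthread_mutex_unlock(&lock_b);
        pthread_mutex_unlock(&lock_a);
    }
    return nullptr;
}

// Takes the locks in the opposite order B, A (the cross-lock pattern).
static void* worker_ba(void* arg)
{
    const long rounds = *static_cast<long*>(arg);
    for (long i = 0; i < rounds; ++i)
    {
        lock_both(&lock_b, &lock_a);
        ++shared_counter;
        pthread_mutex_unlock(&lock_a);
        pthread_mutex_unlock(&lock_b);
    }
    return nullptr;
}

// Runs both workers for `rounds` iterations each and returns the counter.
// Error checking of pthread_create is omitted to keep the sketch short.
long run_backoff_demo(long rounds)
{
    shared_counter = 0;
    pthread_t t1, t2;
    pthread_create(&t1, nullptr, worker_ab, &rounds);
    pthread_create(&t2, nullptr, worker_ba, &rounds);
    pthread_join(t1, nullptr);
    pthread_join(t2, nullptr);
    return shared_counter;
}
```

Even though the two workers request the locks in opposite orders, neither can block while holding a lock, so run_backoff_demo(10000) finishes and returns 20000. The price is busy retrying: in principle the two threads can keep backing off at the same moment (a livelock), which is why a fixed lock order is usually the simpler remedy.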

Figure 1. Cross-lock deadlock

Note: After func2 and func4 have executed, sub-thread 1 holds lock A and is trying to obtain lock B, while sub-thread 2 holds lock B and is trying to obtain lock A. Neither sub-thread can obtain the lock it is waiting for, because each lock is held by the other sub-thread and will never be released, so a deadlock occurs.

Using pstack and gdb to analyze the deadlocked program

Introduction to pstack on Linux

pstack is a very useful tool on Linux (for example, Red Hat Linux and Ubuntu Linux). It prints the stack information of a given process, including the call stack of every thread in that process.

A Brief Introduction to gdb on Linux

GDB is a powerful program debugger for UNIX-like systems, released by the GNU project. Linux includes the GNU debugger gdb, which can be used to debug C and C++ programs. It lets the developer observe a program's internal structure and memory usage while the program is running.

gdb provides the following functions:

1. Run the program and set the parameters and environment that may affect how it runs;

2. Stop the program under specified conditions;

3. Examine the program's state when it has stopped;

4. Examine the core file after the program has crashed;

5. Change things in the program to try out a fix for an error, and then continue running;

6. Dynamically monitor the values of variables in the program;

7. Step through the code one line at a time to observe how the program runs.


gdb debugs executable files or running processes, not the source files of a program. However, not every executable can be debugged at the source level. To make the generated executable suitable for debugging, add the -g option when compiling with g++ (gcc), so that debugging information is included in the program. The debugging information includes the type of each variable in the program, the address mapping in the executable file, and the source code line numbers; gdb uses this information to relate the source code to the machine code. gdb has many basic commands that are not described in detail here. For more information, see the gdb manual.

Listing 1. Test program

 #include <unistd.h>
 #include <pthread.h>
 #include <string.h>

 pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
 pthread_mutex_t mutex2 = PTHREAD_MUTEX_INITIALIZER;
 pthread_mutex_t mutex3 = PTHREAD_MUTEX_INITIALIZER;
 pthread_mutex_t mutex4 = PTHREAD_MUTEX_INITIALIZER;

 static int sequence1 = 0;
 static int sequence2 = 0;

 int func1()
 {
     pthread_mutex_lock(&mutex1);
     ++sequence1;
     sleep(1);
     pthread_mutex_lock(&mutex2);    // mutex1 is held while waiting for mutex2
     ++sequence2;
     pthread_mutex_unlock(&mutex2);
     pthread_mutex_unlock(&mutex1);

     return sequence1;
 }

 int func2()
 {
     pthread_mutex_lock(&mutex2);
     ++sequence2;
     sleep(1);
     pthread_mutex_lock(&mutex1);    // mutex2 is held while waiting for mutex1
     ++sequence1;
     pthread_mutex_unlock(&mutex1);
     pthread_mutex_unlock(&mutex2);

     return sequence2;
 }

 void* thread1(void* arg)
 {
     while (1)
     {
         int iRetValue = func1();

         if (iRetValue == 100000)
         {
             pthread_exit(NULL);
         }
     }
 }

 void* thread2(void* arg)
 {
     while (1)
     {
         int iRetValue = func2();

         if (iRetValue == 100000)
         {
             pthread_exit(NULL);
         }
     }
 }

 void* thread3(void* arg)
 {
     while (1)
     {
         sleep(1);
         char szBuf[128];
         memset(szBuf, 0, sizeof(szBuf));
         strcpy(szBuf, "thread3");
     }
 }

 void* thread4(void* arg)
 {
     while (1)
     {
         sleep(1);
         char szBuf[128];
         memset(szBuf, 0, sizeof(szBuf));
         strcpy(szBuf, "thread3");
     }
 }

 int main()
 {
     pthread_t tid[4];
     if (pthread_create(&tid[0], NULL, &thread1, NULL) != 0)
     {
         _exit(1);
     }
     if (pthread_create(&tid[1], NULL, &thread2, NULL) != 0)
     {
         _exit(1);
     }
     if (pthread_create(&tid[2], NULL, &thread3, NULL) != 0)
     {
         _exit(1);
     }
     if (pthread_create(&tid[3], NULL, &thread4, NULL) != 0)
     {
         _exit(1);
     }

     sleep(5);
     //pthread_cancel(tid[0]);

     pthread_join(tid[0], NULL);
     pthread_join(tid[1], NULL);
     pthread_join(tid[2], NULL);
     pthread_join(tid[3], NULL);

     pthread_mutex_destroy(&mutex1);
     pthread_mutex_destroy(&mutex2);
     pthread_mutex_destroy(&mutex3);
     pthread_mutex_destroy(&mutex4);

     return 0;
 }

Listing 2. Compiling the test program

 [dyu@xilinuxbldsrv purify]$ g++ -g lock.cpp -o lock -lpthread 

Listing 3. Finding the process number of the test program

 [dyu@xilinuxbldsrv purify]$ ps -ef|grep lock
 dyu       6721  5751  0 15:21 pts/3    00:00:00 ./lock

Listing 4. Output of the first pstack run (pstack followed by the process number) on the deadlocked process

 [dyu@xilinuxbldsrv purify]$ pstack 6721
 Thread 5 (Thread 0x41e37940 (LWP 6722)):
 #0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x0000003d1a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
 #2  0x0000003d1a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x0000000000400a9b in func1() ()
 #4  0x0000000000400ad7 in thread1(void*) ()
 #5  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
 #6  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
 Thread 4 (Thread 0x42838940 (LWP 6723)):
 #0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x0000003d1a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
 #2  0x0000003d1a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x0000000000400a17 in func2() ()
 #4  0x0000000000400a53 in thread2(void*) ()
 #5  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
 #6  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
 Thread 3 (Thread 0x43239940 (LWP 6724)):
 #0  0x0000003d19c9a541 in nanosleep () from /lib64/libc.so.6
 #1  0x0000003d19c9a364 in sleep () from /lib64/libc.so.6
 #2  0x00000000004009bc in thread3(void*) ()
 #3  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
 #4  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
 Thread 2 (Thread 0x43c3a940 (LWP 6725)):
 #0  0x0000003d19c9a541 in nanosleep () from /lib64/libc.so.6
 #1  0x0000003d19c9a364 in sleep () from /lib64/libc.so.6
 #2  0x0000000000400976 in thread4(void*) ()
 #3  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
 #4  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
 Thread 1 (Thread 0x2b984ecabd90 (LWP 6721)):
 #0  0x0000003d1a807b35 in pthread_join () from /lib64/libpthread.so.0
 #1  0x0000000000400900 in main ()

Listing 5. Output of the second pstack run (pstack followed by the process number) on the deadlocked process

 [dyu@xilinuxbldsrv purify]$ pstack 6721
 Thread 5 (Thread 0x40bd6940 (LWP 6722)):
 #0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x0000003d1a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
 #2  0x0000003d1a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x0000000000400a87 in func1() ()
 #4  0x0000000000400ac3 in thread1(void*) ()
 #5  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
 #6  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
 Thread 4 (Thread 0x415d7940 (LWP 6723)):
 #0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x0000003d1a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
 #2  0x0000003d1a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x0000000000400a03 in func2() ()
 #4  0x0000000000400a3f in thread2(void*) ()
 #5  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
 #6  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
 Thread 3 (Thread 0x41fd8940 (LWP 6724)):
 #0  0x0000003d19c7aec2 in memset () from /lib64/libc.so.6
 #1  0x00000000004009be in thread3(void*) ()
 #2  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
 #3  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
 Thread 2 (Thread 0x429d9940 (LWP 6725)):
 #0  0x0000003d19c7ae0d in memset () from /lib64/libc.so.6
 #1  0x0000000000400982 in thread4(void*) ()
 #2  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
 #3  0x0000003d19cd40cd in clone () from /lib64/libc.so.6
 Thread 1 (Thread 0x2af906fd9d90 (LWP 6721)):
 #0  0x0000003d1a807b35 in pthread_join () from /lib64/libpthread.so.0
 #1  0x0000000000400900 in main ()

Analyze by viewing the call stacks of the process several times: while the process is hung, run pstack on it several times. Deadlocked threads stay in the same lock-waiting state in every run, so comparing the call stack outputs shows which two (or more) threads never change and remain waiting for a lock.

Output Analysis:

Comparing the two outputs, we can see that thread 2 and thread 3 move from the sleep function in the first pstack output to the memset function in the second, and that thread 1 (the main thread) stays in pthread_join, waiting for the other threads to finish. Thread 4 and thread 5, however, are waiting in pthread_mutex_lock in both outputs and do not change between the two consecutive pstack runs. We can therefore suspect that thread 4 and thread 5 are deadlocked.

Inspecting the threads with gdb:

Listing 6. Attaching to the deadlocked process with gdb

 (gdb) info thread
   5 Thread 0x41e37940 (LWP 6722)  0x0000003d1a80d4c4 in __lll_lock_wait ()
   from /lib64/libpthread.so.0
   4 Thread 0x42838940 (LWP 6723)  0x0000003d1a80d4c4 in __lll_lock_wait ()
   from /lib64/libpthread.so.0
   3 Thread 0x43239940 (LWP 6724)  0x0000003d19c9a541 in nanosleep ()
  from /lib64/libc.so.6
   2 Thread 0x43c3a940 (LWP 6725)  0x0000003d19c9a541 in nanosleep ()
  from /lib64/libc.so.6
 * 1 Thread 0x2b984ecabd90 (LWP 6721)  0x0000003d1a807b35 in pthread_join ()
  from /lib64/libpthread.so.0



Listing 7. Switching to thread 5 and printing its call stack

 (gdb) thread 5
 [Switching to thread 5 (Thread 0x41e37940 (LWP 6722))]#0  0x0000003d1a80d4c4 in
 __lll_lock_wait () from /lib64/libpthread.so.0
 (gdb) where
 #0  0x0000003d1a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x0000003d1a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
 #2  0x0000003d1a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x0000000000400a9b in func1 () at lock.cpp:18
 #4  0x0000000000400ad7 in thread1 (arg=0x0) at lock.cpp:43
 #5  0x0000003d1a80673d in start_thread () from /lib64/libpthread.so.0
 #6  0x0000003d19cd40cd in clone () from /lib64/libc.so.6

Listing 8. Output of thread 4 and thread 5

 (gdb) f 3
 #3  0x0000000000400a9b in func1 () at lock.cpp:18
 18          pthread_mutex_lock(&mutex2);
 (gdb) thread 4
 [Switching to thread 4 (Thread 0x42838940 (LWP 6723))]#0  0x0000003d1a80d4c4 in
 __lll_lock_wait () from /lib64/libpthread.so.0
 (gdb) f 3
 #3  0x0000000000400a17 in func2 () at lock.cpp:31
 31          pthread_mutex_lock(&mutex1);
 (gdb) p mutex1
 $1 = {__data = {__lock = 2, __count = 0, __owner = 6722, __nusers = 1, __kind = 0,
 __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
   __size = "\002\000\000\000\000\000\000\000B\032\000\000\001", '\000' <repeats 26 times>, __align = 2}
 (gdb) p mutex3
 $2 = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
 __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
 __size = '\000' <repeats 39 times>, __align = 0}
 (gdb) p mutex2
 $3 = {__data = {__lock = 2, __count = 0, __owner = 6723, __nusers = 1,
 __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
   __size = "\002\000\000\000\000\000\000\000C\032\000\000\001", '\000' <repeats 26 times>, __align = 2}
 (gdb)

From the above we can see that thread 4 is trying to acquire mutex1, but mutex1 is already held by the thread with LWP 6722 (__owner = 6722), and that thread 5 is trying to acquire mutex2, but mutex2 is already held by the thread with LWP 6723 (__owner = 6723). The pstack output shows that LWP 6722 corresponds to thread 5 and LWP 6723 corresponds to thread 4. We can therefore conclude that thread 4 and thread 5 are deadlocked with each other. The source code confirms this: both threads use mutex1 and mutex2, but func1 acquires mutex1 and then mutex2, while func2 acquires mutex2 and then mutex1, so the locks are taken in opposite orders.
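
One common remedy for this kind of cross-lock deadlock is to make every thread acquire the locks in the same order. The following is a minimal sketch under that assumption, not the original author's fix: func1_ordered and func2_ordered are hypothetical rewrites of func1 and func2 from Listing 1, the sleep(1) calls are left out so the example finishes quickly, and run_ordered_demo is an illustrative driver.

```cpp
#include <pthread.h>

static pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mutex2 = PTHREAD_MUTEX_INITIALIZER;
static int sequence1 = 0;   // protected by mutex1
static int sequence2 = 0;   // protected by mutex2

// Same work as func1 in Listing 1: mutex1 first, then mutex2.
int func1_ordered()
{
    pthread_mutex_lock(&mutex1);
    ++sequence1;
    pthread_mutex_lock(&mutex2);
    ++sequence2;
    const int result = sequence1;    // read while mutex1 is still held
    pthread_mutex_unlock(&mutex2);
    pthread_mutex_unlock(&mutex1);
    return result;
}

// Same work as func2, but it now also takes mutex1 before mutex2,
// so the two threads can no longer wait for each other in a cycle.
int func2_ordered()
{
    pthread_mutex_lock(&mutex1);
    pthread_mutex_lock(&mutex2);
    ++sequence2;
    ++sequence1;
    const int result = sequence2;    // read while mutex2 is still held
    pthread_mutex_unlock(&mutex2);
    pthread_mutex_unlock(&mutex1);
    return result;
}

static void* run_func1(void* arg)
{
    const int rounds = *static_cast<int*>(arg);
    for (int i = 0; i < rounds; ++i)
    {
        func1_ordered();
    }
    return nullptr;
}

static void* run_func2(void* arg)
{
    const int rounds = *static_cast<int*>(arg);
    for (int i = 0; i < rounds; ++i)
    {
        func2_ordered();
    }
    return nullptr;
}

// Hypothetical driver: calls each function `rounds` times on its own thread
// and returns sequence1 + sequence2 once both threads have finished.
// Error checking of pthread_create is omitted to keep the sketch short.
int run_ordered_demo(int rounds)
{
    sequence1 = 0;
    sequence2 = 0;
    pthread_t t1, t2;
    pthread_create(&t1, nullptr, run_func1, &rounds);
    pthread_create(&t2, nullptr, run_func2, &rounds);
    pthread_join(t1, nullptr);
    pthread_join(t2, nullptr);
    return sequence1 + sequence2;
}
```

Each call increments both counters, so run_ordered_demo(1000) returns 4000. Because neither thread can hold mutex2 while waiting for mutex1, the circular wait seen in the gdb session cannot form.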

Summary

This article has briefly introduced a method for analyzing deadlocks on the Linux platform; it can be used to locate some deadlock problems, and I hope it helps you. Understanding the causes of deadlock, especially the four necessary conditions, makes it possible to avoid, prevent, and resolve deadlocks as far as possible. In system design and process scheduling, therefore, pay attention to keeping these four conditions from holding at the same time, and to choosing a resource allocation algorithm that keeps processes from occupying system resources permanently. It is also necessary to prevent a process from holding resources while it waits. While the system is running, it can dynamically check every resource request that it is able to satisfy and decide from the result whether to grant it: if granting the request could lead to a deadlock, the resource is not allocated; otherwise it is. Resource allocation should therefore be planned carefully. Ordered resource allocation and the banker's algorithm are effective ways to avoid deadlocks.
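
The run-time check mentioned in the summary is what the banker's algorithm formalizes. The following is a minimal sketch of its safety test, assuming the usual textbook formulation with an available vector and allocation and need matrices; the function name is_safe_state and the data layout are illustrative choices and are not taken from the original article.

```cpp
#include <cstddef>
#include <vector>

// Returns true if the state is safe, that is, if there is some order in which
// every process can obtain its remaining needs and run to completion.
//   available[j]      free units of resource j
//   allocation[i][j]  units of resource j currently held by process i
//   need[i][j]        units of resource j that process i may still request
bool is_safe_state(std::vector<int> available,
                   const std::vector<std::vector<int>>& allocation,
                   const std::vector<std::vector<int>>& need)
{
    const std::size_t process_count = allocation.size();
    const std::size_t resource_count = available.size();
    std::vector<bool> finished(process_count, false);
    std::size_t finished_count = 0;

    bool progressed = true;
    while (progressed)
    {
        progressed = false;
        for (std::size_t i = 0; i < process_count; ++i)
        {
            if (finished[i])
            {
                continue;
            }
            // Process i can finish if its remaining need fits in what is free.
            bool can_finish = true;
            for (std::size_t j = 0; j < resource_count; ++j)
            {
                if (need[i][j] > available[j])
                {
                    can_finish = false;
                    break;
                }
            }
            if (can_finish)
            {
                // Pretend it runs to completion and returns what it holds.
                for (std::size_t j = 0; j < resource_count; ++j)
                {
                    available[j] += allocation[i][j];
                }
                finished[i] = true;
                ++finished_count;
                progressed = true;
            }
        }
    }
    return finished_count == process_count;
}
```

As a usage example, the deadlock analyzed in this article can be modeled with two processes and two single-unit resources (mutex1 and mutex2). With nothing available, allocation {{1, 0}, {0, 1}} and need {{0, 1}, {1, 0}}, the function returns false, because each side holds one lock and needs the other. One step earlier, with available {0, 1}, allocation {{1, 0}, {0, 0}} and need {{0, 1}, {1, 1}}, it returns true, so a system using this check would refuse the second thread's request for mutex2 at that point.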

