Linux multi-process debugging (GDB)

Source: Internet
Author: User

GDB is a widely used C/C++ debugging tool on Linux, and it is very powerful. But how do we use GDB to debug complex systems, such as multi-process systems? Consider the following three-process system:

proc2 is a child process of proc1, and proc3 is a child process of proc2. How can we use GDB to debug proc2 or proc3?

In fact, GDB does not directly support debugging multi-process programs. For example, when GDB debugs a process that forks a child, GDB keeps debugging the parent, and the child runs without interference. If you set a breakpoint in the child's code beforehand, the child receives a SIGTRAP signal and terminates. So how do we debug the child process? We can use GDB's own features or some auxiliary techniques to achieve this. In addition, newer GDB versions running on newer kernels add some support for multi-process debugging.

Next we introduce several methods in detail: the follow-fork-mode method, the attach-to-child method, and the GDB wrapper method.

follow-fork-mode

On Linux kernels 2.5.60 and later, GDB provides the follow-fork-mode setting to support multi-process debugging of programs that create child processes with fork/vfork.

The usage of follow-fork-mode is as follows:

set follow-fork-mode [parent|child]

  • parent: continue debugging the parent process after fork. The child process runs unaffected. This is the default.
  • child: debug the child process after fork. The parent process runs unaffected.

Therefore, to debug the child process, after starting GDB, run:

(gdb) set follow-fork-mode child

and then set breakpoints in the child process code.
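To try this out, a small test program helps. The following is only a minimal sketch: the names child_work and run_child_demo are hypothetical and chosen purely for illustration. The child branch calls child_work, so after "set follow-fork-mode child" a breakpoint set with "break child_work" should be hit in the child process.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: work done only in the child process.
 * With "set follow-fork-mode child", a breakpoint placed here
 * ("break child_work") is hit inside the child. */
int child_work(int input)
{
    return input * 2;
}

/* Fork once, run child_work() in the child, and return the child's
 * exit status to the caller (or -1 on error). */
int run_child_demo(int input)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: GDB follows this branch when follow-fork-mode is child. */
        _exit(child_work(input) & 0xff);
    }

    /* Parent: not debugged in that mode; it only waits for the child. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child_demo from a main function and building with gcc -g gives a program to experiment with. The result travels back through the exit status, so only values from 0 to 255 survive.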

In addition, the detach-on-fork setting controls whether GDB detaches from one of the processes after fork or keeps both under its control:

set detach-on-fork [on|off]

  • on: GDB detaches from the process that follow-fork-mode did not select, and that process runs independently. This is the default.
  • off: GDB keeps control of both the parent and the child. The process selected by follow-fork-mode is debugged, and the other process is held suspended.

Note that it is best to use GDB 6.6 or later. With GDB 6.4, only follow-fork-mode is available.
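For a chain of processes like proc1, proc2 and proc3, a test program along the following lines can be used to experiment with "set detach-on-fork off". This is a sketch under assumptions: at_level and fork_chain are hypothetical names, and the chain depth is simply a parameter. With detach-on-fork off, each forked process stays under GDB's control. In recent GDB releases the processes can be listed with "info inferiors" and selected with "inferior N".

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical marker function: a convenient breakpoint location
 * ("break at_level") that every process in the chain passes through. */
int at_level(int level)
{
    return level;
}

/* Build a chain of processes: the caller is level 1, its child is
 * level 2, and so on up to `depth`. Each process waits for its own
 * child and reports the deepest level reached. Returns -1 on error. */
int fork_chain(int level, int depth)
{
    at_level(level);
    if (level >= depth)
        return level;                 /* last process in the chain */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: continue the chain one level deeper. */
        _exit(fork_chain(level + 1, depth) & 0xff);
    }

    /* Parent: blocks here until its child exits. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_chain(1, 3) from main reproduces a three-process system. Because a process that GDB holds suspended never exits, its parent stays blocked in waitpid until that process is selected and continued.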

follow-fork-mode/detach-on-fork is relatively simple to use, but because of the kernel and GDB version requirements it only works on systems that meet them. In addition, debugging with follow-fork-mode must start from the parent process. For systems that fork several times and produce grandchild or great-grandchild processes, such as the three-process system above, this is inconvenient.

Attach to the child process

As we all know, GDB can attach to a running process with the attach <pid> command. We can therefore use this command to attach to the child process and then debug it.

For example, to debug the process rim_oracle_agent.9i, first obtain its PID:

[root@tivf09 tianq]# ps -ef | grep -i rim_oracle_agent.9i
nobody 6722 6721 0 ? 00:00:00 rim_oracle_agent.9i
root 7541 27816 0 00:00:00 pts/3 grep -i rim_oracle_agent.9i

pstree shows that this is a three-process system: oserv is the parent process of rim_oracle_prog, and rim_oracle_prog is the parent process of rim_oracle_agent.9i.

[root@tivf09 root]# pstree -H 6722

Figure: viewing the process tree with pstree

Start GDB and attach to the process.

Figure: attaching to the process with GDB

Now you can debug it. A new problem is that the child process has been running all along, so at the time of the attach we do not know how far it has already executed. Is there a solution?

One way is to add a special piece of code to the start of the child process to be debugged, for example at the beginning of main, so that the child loops, sleeping and waiting, while a certain condition holds. After attaching to the process, set a breakpoint after that piece of code and then remove the condition, so that execution continues.

The condition used in this code is up to you. For example, we can check the value of a specified environment variable, or check whether a specific file exists. Taking the file as an example, the code can look like this:

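#include <unistd.h>

/* Block while tag_file exists; delete the file to let the process go on. */
void debug_wait(const char *tag_file)
{
    while (1)
    {
        if (access(tag_file, F_OK) == 0)
            sleep(1);    /* file still exists: sleep for a while */
        else
            break;       /* file is gone: stop waiting */
    }
}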

After attaching to the process, set a breakpoint after this piece of code and then delete the file. Of course, you can also use other conditions or forms, as long as the condition can be set and detected.

The attach method is very convenient. It can cope with all kinds of complex process systems, such as grandchild/great-grandchild processes and daemon processes. The only requirement is adding a small piece of code.
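As one of those other forms, the environment-variable idea can be combined with a flag that is cleared from inside GDB. The sketch below is an assumption on our part, not the original author's code: the names debug_wait_flag and debug_wait_env are hypothetical. The process waits only when the named environment variable is set, and after attaching, the loop is released with "set var debug_wait_flag = 0".

```c
#include <stdlib.h>
#include <unistd.h>

/* Cleared from GDB after attaching:
 *   (gdb) set var debug_wait_flag = 0
 * volatile keeps the compiler from caching the value in the loop. */
volatile int debug_wait_flag = 1;

/* Wait only when the environment variable `name` is set.
 * Returns 1 if the process went through the wait loop,
 * 0 if it ran straight through (normal, non-debug run). */
int debug_wait_env(const char *name)
{
    if (getenv(name) == NULL)
        return 0;              /* variable not set: no waiting */

    while (debug_wait_flag)
        sleep(1);              /* attach with GDB during this loop */

    return 1;
}
```

A call such as debug_wait_env("DEBUG_WAIT") at the start of main leaves normal runs untouched, since the variable is set only when a debugging session is wanted.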

GDB wrapper

In many cases, the parent process forks a child process, and the child then calls one of the exec family of functions to execute new code. In this case, we can also use the GDB wrapper method, whose advantage is that no additional code needs to be added.

The basic principle is to have exec run GDB together with the code to be executed as a single new unit, so that the code to be executed is under GDB's control from the very beginning. That way we can debug the child process code naturally.

In the example above, after rim_oracle_prog forks the child process, the child goes on to execute the binary file rim_oracle_agent.9i. Rename that file to rim_oracle_agent.binary and create a new shell script named rim_oracle_agent.9i with the following content:

[root@tivf09 bin]# mv rim_oracle_agent.9i rim_oracle_agent.binary
[root@tivf09 bin]# cat rim_oracle_agent.9i
#!/bin/sh
gdb rim_oracle_agent.binary

When the forked child process executes the file named rim_oracle_agent.9i, GDB starts first, so the code to be debugged is under GDB's control.

A new problem arises. The child process is now under GDB's control, but we still cannot debug it: how do we interact with GDB? GDB must be started in some window or terminal that we can interact with. Specifically, we can use xterm to create this window.
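The wrapper works because a parent in the fork + exec pattern only names a file to execute and does not care whether that file is a binary or a script. The following is a rough sketch of what such a parent might look like. The function name spawn_and_wait is hypothetical, and the real rim_oracle_prog code is not shown in this article.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that replaces itself with the program named by
 * argv[0] (looked up via PATH), then wait for it to finish.
 * Returns the child's exit status, 127 if exec failed, or -1 on
 * fork/wait errors. If argv[0] names a wrapper script that starts
 * GDB, the child ends up running under GDB without any change here. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);   /* only returns on failure */
        _exit(127);              /* conventional "command not found" */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Since the parent passes only a path and an argument vector, swapping the file at that path for a script is enough to put the child under the debugger.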

xterm is a terminal emulator for the X Window System. For example, if we type the xterm command in GNOME on Linux:

xterm

a terminal window pops up:

Figure: terminal window

If you are debugging on a remote Linux server, you can use a VNC (Virtual Network Computing) viewer to connect to the server from your local machine and use xterm there. Before that, install VNC Viewer on the local machine, and install and start a VNC server on the server. Most Linux distributions ship with the vnc-server package preinstalled, so we can run the vncserver command directly. Note: the first time you run vncserver, it prompts for a password, which VNC Viewer then uses when connecting from the client. You can change it later with the vncpasswd command on the VNC server.

[root@tivf09 root]# vncserver

New 'tivf09:1 (root)' desktop is tivf09:1

Starting applications specified in /root/.vnc/xstartup
Log file is /root/.vnc/tivf09:1.log

[root@tivf09 root]#
[root@tivf09 root]# ps -ef | grep -i vnc
root 19609 1 0 Jun05 ? 00:08:46 Xvnc :1 -desktop tivf09:1 (root)
-httpd /usr/share/vnc/classes -auth /root/.Xauthority -geometry 1024x768
-depth 16 -rfbwait 30000 -rfbauth /root/.vnc/passwd -rfbport 5901 -pn
root 19627 1 0 Jun05 ? 00:00:00 vncconfig -iconic
root 12714 10599 0 000:00:00 pts/0 grep -i vnc
[root@tivf09 root]#

vncserver is a Perl script that starts Xvnc (the X VNC server). X client applications such as xterm, and VNC Viewer, both communicate with it. As shown above, the display value we can use is tivf09:1. Now you can connect to it from the local machine with VNC Viewer:

Figure: VNC Viewer

Next, modify the rim_oracle_agent.9i script so that it looks like this:

#!/bin/sh
export DISPLAY=tivf09:1.0; xterm -e gdb rim_oracle_agent.binary

If your program is also passed arguments at exec time, change it to:

#!/bin/sh
export DISPLAY=tivf09:1.0; xterm -e gdb --args rim_oracle_agent.binary "$@"

Finally, add execute permission:

[root@tivf09 bin]# chmod 755 rim_oracle_agent.9i

Now you can debug. Run the program that starts the child process:

[root@tivf09 root]# wrimtest -l 9i_linux

Resource Type: Rim

Resource label: 9i_linux

Host Name: tivf09

User name: mdstatus

Vendor: Oracle

Database: Rim

Database Home: /data/Oracle9i/920

Server ID: Rim

Instance home:

Instance name:

Opening regular session...

 

The program pauses here. In VNC Viewer, we can see that a new GDB xterm window has opened on the server.

Figure: GDB xterm window

[root@tivf09 root]# ps -ef | grep gdb
nobody 24312 24311 0 ? 00:00:00 xterm -e gdb rim_oracle_agent.binary
nobody 24314 24312 0 00:00:00 pts/2 gdb rim_oracle_agent.binary
root 24326 10599 0 00:00:00 pts/0 grep gdb

What runs inside that window is the program to be debugged. Set breakpoints and start debugging!

Note: an error like the following is usually a permission problem. Use the xhost command to change the permissions:

Figure: xterm error

[root@tivf09 bin]# export DISPLAY=tivf09:1.0
[root@tivf09 bin]# xhost +
access control disabled, clients can connect from any host

xhost + disables access control, so clients can connect from any machine. For better security, use xhost +<your machine name> instead.

Summary

The three methods have their own characteristics and advantages, so they are suitable for different occasions and environments:

  • follow-fork-mode method: easy to use, but restricted by kernel and GDB version. Suitable for simple multi-process systems.
  • Attach-to-child method: flexible and powerful, but requires adding extra code. Suitable for all kinds of complex situations, especially daemon processes.
  • GDB wrapper method: intended for the fork + exec pattern. No extra code is required, but it needs X support (xterm/VNC).
