Application debugging summary
This post summarizes the basics and the debugging methods to use when an application hits a segmentation fault. The material comes from the book Debug Hacks.
Environment: x86 32-bit Linux
I. Basics
1. Understand how arguments are passed.
Around a function call the program pushes onto the stack, in order: the arguments, the return address, the frame pointer (fp), the local variables of the called function, and so on.
Source code:
#include <stdio.h>
int fun(int a, char c)
{
    printf("%d\n%c\n", a, c);
    return 0;
}
int main()
{
    fun(1, 'A');
    return 0;
}
Use gdb to debug the program:
Put a * before the function name when setting the breakpoint. The program then stops at the very first assembly instruction of the function; without the *, it stops at the first statement of the function body. Before control transfers to the function, the caller pushes the arguments and the return address onto the stack; everything else is pushed by the called function. So at this point sp points to the return address. Since the stack grows downward, sp + 4 holds the second argument pushed (a) and sp + 8 holds the first argument pushed (c).
2. Generate core files.
On Linux, core files are usually not generated by default. Check the current limit with ulimit -c. If it prints 0, run ulimit -c unlimited to remove the limit. You can also set a specific size, given in blocks.
Note: make sure you have permission to create core files in the current directory. We often mount local files onto a Linux server or virtual machine, and if the user lacks the necessary permissions there, the core file is either not generated in that directory or is generated with size 0.
3. Common gdb commands are covered in my previous summary.
II. Debugging practices
1. Stack overflow
Source code:
#include <stdio.h>
int fun()
{
    int a = 10;
    fun();              /* unbounded recursion exhausts the stack */
    printf("%d\n", a);
    return 1;
}
int main(int argc, char **argv)
{
    fun();
    return 0;
}
When the segmentation fault occurs, open the generated core file and check the value of the sp register; here sp = 0xbf45a000. Then check the address range of each segment with the info files (i files) command, although it does not say which segment is the stack. I could not upload images, so the output is typed out below:
Local core dump file:
`/root/core', file type elf32-i386.
0x0084e000-0x0084e000 is load1
0x009a1000-0x009a1000 is load2
0x009a2000-0x009a4000 is load3
0x009a4000-0x009a5000 is load4
0x009a5000-0x009a8000 is load5
0x00d68000-0x00d69000 is load6
0x00d87000-0x00d87000 is load7
0x00da2000-0x00da3000 is load8
0x00da3000-0x00da4000 is load9
0x08048000-0x08048000 is load10
0x08049000-0x0804a000 is load11
0x0804a000-0x0804b000 is load12
0xb775e000-0xb775f000 is load13
0xb776d000-0xb776f000 is load14
0xbf45a000-0xbfe5a000 is load15
We can see that 0xbf45a000 belongs to load15 and sits right at the boundary of that segment, at its lowest address. Decrementing sp involves no range check; whether the address is valid is only discovered when it is accessed. Since the stack grows downward and sp has reached the edge of the segment, we can conclude that the stack overflowed.
Many large programs install a handler for the segmentation fault signal. When the stack has overflowed, however, there is no stack space left, so the handler cannot run to completion. You therefore need to reserve stack space for the handler in advance so that the state at the time of the crash can be saved. Use the sigaltstack function to register an alternate signal stack. For details, see the man page.
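A minimal sketch of the alternate-stack setup described above, assuming a glibc-based Linux system. The names install_segv_handler and segv_handler are hypothetical, and a real handler would save whatever crash state the program needs instead of printing a fixed message.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Runs on the alternate stack, so it still works when the normal stack
 * is exhausted. Only async-signal-safe calls are used here. */
static void segv_handler(int sig)
{
    static const char msg[] = "SIGSEGV caught on alternate stack\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);

    (void)sig;
    (void)n;
    _exit(1);
}

/* Registers an alternate signal stack and a SIGSEGV handler that uses it.
 * Returns 0 on success, -1 on failure. */
int install_segv_handler(void)
{
    stack_t ss;
    struct sigaction sa;

    ss.ss_sp = malloc(SIGSTKSZ);      /* memory for the alternate stack */
    if (ss.ss_sp == NULL)
        return -1;
    ss.ss_size = SIGSTKSZ;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) == -1) {
        free(ss.ss_sp);
        return -1;
    }

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = segv_handler;
    sa.sa_flags = SA_ONSTACK;         /* deliver the signal on the alternate stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_handler() once at the start of main would be enough for this sketch. Without SA_ONSTACK the handler would be started on the already exhausted stack and fault again.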
2. The return address is overwritten.
A return address can be overwritten in many ways. Given the stack layout described earlier, if a local array in the called function is written out of bounds, the return address can be overwritten, which leads to a segmentation fault. That is one such case. The key point is how to tell that the return address has been overwritten; the local variables may also be wrong at that point, which makes debugging difficult. In general, when the return address has been overwritten, the bt output shows question marks (??) where function names should be.
Under normal circumstances a function name is displayed rather than question marks. (If the overwritten address still happens to point into a function, the only option is to check step by step whether such a call sequence really exists.) The question marks confirm that the return address has been overwritten.
A concrete example in which an array overrun overwrites the return address:
Source code:
#include <stdio.h>
#include <string.h>
char names[] = "book cat dog building vagetable curry";
void fun()
{
    char buf[5];
    strcpy(buf, names);   /* writes far past the end of buf */
}
int main(int argc, char **argv)
{
    fun();
    return 0;
}
Debugging process: first, check which instruction is currently executing.
The program is at the ret instruction, that is, it is about to return. Next, look at the value that sp points to.
This step is redundant: the value is the same as the next frame address shown in the stack information.
Since the stack information suggests that the return address was overwritten, display the memory at esp as a string.
The stack clearly contains book cat dog building vagetable curry, which is a string. Search the source for where this string is used: it is passed to strcpy in fun, and copying it overruns the array.
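Once the overrun is located, one possible fix is to bound the copy by the size of the destination. The sketch below uses snprintf for this; the helper name copy_bounded is hypothetical and is not taken from the book.

```c
#include <stdio.h>
#include <string.h>

/* Copies src into dst without writing more than dst_size bytes,
 * truncating if necessary. dst is always NUL-terminated when dst_size > 0.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed = snprintf(dst, dst_size, "%s", src);

    return needed >= 0 && (size_t)needed < dst_size;
}
```

In fun, copy_bounded(buf, sizeof(buf), names) would keep the write inside buf, and the return value tells the caller that the string did not fit.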
3. Use watchpoints to detect illegal memory access
I could not reproduce this case on Linux: the address reached by the out-of-bounds write is invalid, so the situation is hard to simulate. It is therefore only described in words here.
Source program:
#include <stdio.h>
int data[2] = {1, 2};
int calc(void)
{
    return -7;
}
int main()
{
    int index = calc();
    data[index] = 0x0a;       /* out-of-bounds write */
    data[index + 1] = 0x08;   /* out-of-bounds write */
    printf("ssssss\n");
    return 0;
}
The crash occurs inside printf. By examining the stack, find the return address in main; the instruction just before that return address is what led to the segmentation fault. That instruction is a call, and tracing into it shows that it jumps to an address stored in a pointer. The address stored there is 0x08, a value written by a statement in the program. The key question is how to determine which statement wrote it.
Now that we know the address where this pointer is stored, we can set a watchpoint on it. gdb stops whenever the value at that address is modified. During the run it stops at the statement just before printf, which is the cause.
4. Bugs caused by freeing a pointer twice
For this case I think watchpoints or breakpoints both work. Use a gdb script to print the stack information at every call to free, then check which address was freed more than once.
Another method is to run the program with the MALLOC_CHECK_ environment variable set, for example env MALLOC_CHECK_=1 followed by the program path. In my tests the stack information was sometimes printed at the second free even without the variable, and was not printed once the variable was added. Personally I think that, since the problem is only a double free, it is better to stick with the first method: find the two places where the pointer is freed and keep only one of them.
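As a complement to finding the two free calls, a common defensive habit is to clear a pointer right after freeing it. This is a mitigation, not a debugging method from the book, and it only helps when both frees go through the same pointer variable. The macro name FREE_AND_NULL is hypothetical.

```c
#include <stdlib.h>

/* Frees the memory and clears the pointer variable. A second use on the
 * same variable then calls free(NULL), which the C standard defines as a
 * no-op, instead of freeing the same block twice. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

If a copy of the pointer is kept elsewhere and freed through that copy, the double free still happens, so locating both free sites remains necessary.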
5. Deadlock
When you suspect a deadlock, run the ps command to check the thread status. If the status is S (sleeping), it may indicate a deadlock.
In that case, use gdb attach to inspect the stack of each thread and see where each one is stuck.
Then use gdb breakpoints and scripts to print every operation on the lock in question. The following is an example.
Source code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int cnt = 0;
void cnt_reset(void)
{
    pthread_mutex_lock(&mutex);
    cnt = 0;
    pthread_mutex_unlock(&mutex);
}
void *th(void *p)
{
    while (1) {
        pthread_mutex_lock(&mutex);
        if (cnt > 2)
            cnt_reset();   /* locks mutex again while it is already held */
        else
            cnt++;
        pthread_mutex_unlock(&mutex);
        printf("%d\n", cnt);
        sleep(1);
    }
}
int main()
{
    pthread_t id;
    pthread_create(&id, 0, th, 0);
    pthread_join(id, 0);
    return 0;
}
Running result:
[root@ubuntu:deadlock] ./a.out
1
2
3
At this point the program stops making progress; according to the code, the next value printed should have been 0.
[root@ubuntu:deadlock] ps -x | grep a.out
Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
26418 pts/9 Sl+ ./a.out
The program is currently in the sleep state, so use gdb attach to check which thread is sleeping or causing the deadlock.
The main thread is sleeping while it waits for the child thread to finish, and the child thread is sleeping while it waits for the lock to be released. The question is why the thread cannot get the lock: at which step, or by which thread, the lock was taken first.
Debug the program again with gdb, set breakpoints where the lock is taken and released, and print the stack at each one. Every earlier pair of operations is a lock followed by an unlock, but the last two operations are both locks.
From the stack information we can see that th takes the lock first and then calls cnt_reset, which takes the same lock again, causing the deadlock.
So the cause is found.
This is a simple example. At work I ran into a more troublesome problem. Several threads accessed one data structure, and each had to obtain a lock protecting that structure first. One thread was killed after it had obtained the lock but before it released it, so no other thread could ever obtain the lock, and all threads ended up blocked. The cause can be found with the same method.
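For the cnt_reset example, one way to remove the double lock is to split the reset into a variant that assumes the lock is already held and a public variant that takes it. This is only a sketch of one possible fix; the names cnt_reset_locked and cnt_step are hypothetical, and cnt_step stands for one iteration of the loop body in th.

```c
#include <pthread.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int cnt = 0;

/* Caller must already hold mutex. */
static void cnt_reset_locked(void)
{
    cnt = 0;
}

/* For callers that do not hold the lock. */
void cnt_reset(void)
{
    pthread_mutex_lock(&mutex);
    cnt_reset_locked();
    pthread_mutex_unlock(&mutex);
}

/* One iteration of the loop body in th(); returns the new counter value. */
int cnt_step(void)
{
    int value;

    pthread_mutex_lock(&mutex);
    if (cnt > 2)
        cnt_reset_locked();   /* lock already held: do not lock again */
    else
        cnt++;
    value = cnt;              /* read while still holding the lock */
    pthread_mutex_unlock(&mutex);
    return value;
}
```

A recursive mutex (PTHREAD_MUTEX_RECURSIVE) would also stop this particular hang, but the split keeps it visible which functions expect the lock to be held.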
6. Infinite loop
For this case I wrote an example modeled on the one in the book.
Source code:
#include <stdio.h>
#include <string.h>
int fun(char *p, int len)
{
    while (len > 0) {
        int version = *(int *)p;
        int msgtype = *(int *)(p + sizeof(int));
        int length = *(int *)(p + 2 * sizeof(int));   /* third header field */
        /* Do something */
        len = len - length;   /* never shrinks when length is 0 */
        p = p + length;
    }
    return 0;
}
int main()
{
    char p[100];
    int len = 0;
    int version = 1;
    int type = 10;
    int length = 0;
    memset(p, 0, 100);
    memcpy(p, &version, 4);
    memcpy(&p[4], &type, 4);
    memcpy(&p[8], &length, 4);   /* first message carries length 0 */
    length = 10;
    memcpy(&p[12], &version, 4);
    memcpy(&p[16], &type, 4);
    memcpy(&p[20], &length, 4);
    fun(p, 30);
    return 0;
}
The fun function parses messages. Some data packets are parsed as a stream, in a way similar to TCP.
However, an infinite loop occurs at run time: the program never exits once started.
After attaching to the process with gdb attach, we find that it is executing in fun, and the source code shows that fun contains only one loop. Now use the debug build of the executable to single-step through the program.
We find that the length of the message body is always 0, so the function keeps parsing the same message. That locates the problem: the length in the message that was sent is wrong. When parsing the length field, the function should therefore at least validate it against a minimum value.
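A sketch of such a check follows. It assumes that the length field counts the whole message including the three-int header, which the original example does not state, so the bounds would need adjusting for a real protocol. The name parse_messages is hypothetical.

```c
#include <string.h>

#define HEADER_SIZE (3 * sizeof(int))   /* version, msgtype, length */

/* Walks the messages in buf. Returns the number of well-formed messages,
 * or -1 as soon as a header is truncated or a length field is implausible. */
int parse_messages(const char *buf, int len)
{
    int count = 0;

    while (len > 0) {
        int length;

        if ((size_t)len < HEADER_SIZE)
            return -1;                       /* truncated header */
        memcpy(&length, buf + 2 * sizeof(int), sizeof(length));
        if (length < (int)HEADER_SIZE || length > len)
            return -1;                       /* zero, too short, or past the input */
        /* Do something with the message here. */
        len -= length;
        buf += length;
        count++;
    }
    return count;
}
```

Because every accepted length is at least HEADER_SIZE, len strictly decreases on each iteration, so the loop always terminates.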
III. Summary
First, you must be familiar with the tools gdb provides, including inspecting registers and the stack, breakpoints, watchpoints, and scripts.
In general, the debugging process is: collect information, including the symptoms and the dump; analyze the dump; reproduce the bug; fix the bug.
Stack overflow: compare sp with the program's memory map information.
Overwritten return address: a corrupted stack is almost always caused by an overwritten return address. Print the contents at sp in several formats, such as hexadecimal or string; one of them may show something recognizable. If it turns out to be a string, for example, the error is easy to locate.
Illegal memory access: a jump address is stored in a pointer, and the value of this pointer is modified, which makes the later jump invalid. In this case, set a watchpoint on the pointer and print the stack whenever the watchpoint is hit.
Double free: find the two free calls by using watchpoints or breakpoints.
Deadlock: same as above. Determine which two lock operations conflict.
Infinite loop: determine where the program is looping. It is best to single-step with the debug build.