Program Debugging Techniques: Solving a Stack Overflow

Source: Internet
Author: User

Preface

The most painful thing for a programmer is getting stuck in a quagmire of bugs, and I have been wrestling with them for as long as I can remember. Here I will sum up some of the lessons I have learned, covering endless loops, deadlocks, memory leaks, and memory access errors. If they help my friends, so much the better. I do not plan to write these articles in a fixed order; instead, I hope they will add up to a complete series in the end.

This article uses a real example to show how to debug a "stack overflow" error in the VC6 environment.

The problem emerges

I am responsible for maintaining a DLL developed by a former colleague. It is a middleware component for network communication. Today, when the application disconnected from the server, it suddenly reported an error, and the error reproduced every time. Debugging showed that the error always occurred in the destructor of one class. The code was similar to the following:

class Bar
{
public:
    ~Bar()
    {
        stringstream ss;
        ss << "~Bar " << 123;
        cout << ss.str(); // the error occurs here
    }
};

When an error occurs, the function call stack is:

memcpy(unsigned char * 0x1da822a9, unsigned char * 0x00000000, unsigned long 1) line 331
std::char_traits<char>::copy(char * 0x1da822a9, const char * 0x00000000, unsigned int 1) line 194 + 20 bytes
std::basic_string<char,std::char_traits<char>,std::allocator<char> >::assign(const char * 0x00000000, unsigned int 1) line 134 + 20 bytes
std::basic_string<char,std::char_traits<char>,std::allocator<char> >::basic_string<char,std::char_traits<char>,std::allocator<char> >(const char * 0x00000000, unsigned int 1, const std::allocator<char> & {...}) line 48 + 43 bytes
std::basic_stringbuf<char,std::char_traits<char>,std::allocator<char> >::str() line 36 + 73 bytes
std::basic_stringstream<char,std::char_traits<char>,std::allocator<char> >::str() line 262 + 31 bytes

The stack shows that when basic_stringbuf executes str(), it passes a null pointer to the basic_string constructor, which eventually makes the later memcpy fail. In theory, this pointer should never be null. The STL shipped with VC6 has many problems, but code this simple should not go wrong. Had the memory been overwritten? Judging from the values of the other member variables, that seemed very unlikely.

I had many doubts. To get to the bottom of it, I used the code above to step through and analyze the STL. The STL written by P. J. Plauger is notoriously hard to read: I do not know how he keeps track of the variable names, and the code layout is a mess. It took me half a morning to work through it. My conclusion was that the STL itself was not at fault, and I also searched Google to check whether anyone had reported the same problem.

This made me even more depressed. If memory was being overwritten, the problem would be very troublesome, because past experience told me how hard it is to find where memory gets overwritten in such a large multi-threaded program. (As it turned out, I had made a small mistake here: I had ignored the depth of the function call stack.)

A turning point

Later, a turning point came. When the same program failed on another colleague's machine, it gave an important message: "Unhandled exception in xxx.exe: 0xC00000FD: Stack Overflow."

This pointed straight at the problem, but I still had a small question: why was there no such message on my machine? Perhaps it was a difference in the system environment. In any case, I had a way to make the exception show up: press F5 to run the program, select "Debug/Exceptions" from the menu, find "Stack Overflow" in the list box, and change its Action to "Stop always".

After I repeated the same operation, the "Stack Overflow" exception appeared.

Next, I focused on analyzing the function call stack. Press Alt + 7 to open the Call Stack window (or choose View/Debug Windows/Call Stack from the menu). The call stack turned out to be very deep. Because the window displays only a limited number of entries, the first function in the chain could not be found.

Problems Found

At first, I thought there was an endless loop, but careful checking found nothing wrong, and the program logic ran normally. After further analysis of the call stack and the related code, I finally found a clue. The original program uses the Boost smart pointer to build a message queue. When the message queue grows too long, destroying it at the end produces very deeply nested calls. To confirm this, I wrote the following test code:

#include <iostream>
using namespace std;

#include <boost/shared_ptr.hpp> // use the smart pointer from the Boost library
using namespace boost;

struct Message
{
    Message(int index = 0)
    {
        m_index = index;
    }
    ~Message() // destructor
    {
        cout << "~Message: " << m_index << endl;
    }

    shared_ptr<Message> m_pnext; // points to the next message in the message queue
    int m_index;
};

int main(int argc, char* argv[])
{
    shared_ptr<Message> phead = shared_ptr<Message>(new Message(0));
    shared_ptr<Message> pcur = phead;
    for (int i = 1; i < 2000; ++i) // build a message queue of 2000 messages
    {
        shared_ptr<Message> pnext = shared_ptr<Message>(new Message(i));
        pcur->m_pnext = pnext;
        pcur = pnext;
    }
    return 0;
}

After compiling it and pressing F5 to run, the "stack overflow" appeared, which put my mind at ease.

Unless you try it yourself, the problem in the code above is hard to notice. It is a perfectly ordinary C++ program, and its logic contains no error at all. So how does the failure happen?

Let us start with C++ destructors and review the following rules:

1. An object's destructor is called automatically at the end of the object's lifetime.
2. After the body of a class's own destructor finishes, the destructors of its member variables are called in reverse order of declaration.
3. After that, the destructors of its base classes are called in reverse order of the base class declarations.

Now analyze the test code above. When main reaches return 0, the destructor of the smart pointer pcur is called first, which causes no problem. Then the destructor of phead is called. Because the reference count of the message that phead points to (message 0) drops to 0, that message is released and its destructor runs. By rule 2, its member m_pnext is then destroyed, which releases the next message in the same way, and so the whole message queue is traversed through nested destructor calls. Below is one cycle of the repeating function call stack:

Message::~Message() line 53 + 8 bytes
Message::`scalar deleting destructor'(unsigned int 0x00000001) + 37 bytes
boost::checked_delete(Message * 0x0044e920) line 34 + 28 bytes
boost::checked_deleter<Message>::operator()(Message * 0x0044e920) line 52 + 9 bytes
boost::detail::sp_counted_base_impl<Message *,boost::checked_deleter<Message> >::dispose() line 265
boost::detail::sp_counted_base::release() line 147 + 13 bytes
boost::detail::shared_count::~shared_count() line 382
boost::shared_ptr<Message>::~shared_ptr<Message>() + 40 bytes

All of this is perfectly reasonable, but we cannot ignore the size of the function stack. Anyone who has studied compiler principles knows that the stack is the memory area used to store local variables, function return addresses, function parameters, and similar data, and that its size is limited (1 MB by default in VC6). When local variables occupy too much space or function calls nest too deeply, a "stack overflow" may occur. The most common causes are:

1. A local array variable is too large, as shown below:

int main(int argc, char* argv[])
{
    char stack_overflow[1024*1024*5];
    stack_overflow[0] = 1;
    return 0;
}

There are two solutions to this problem. One is to increase the stack space (described in detail later); the other is to allocate the memory dynamically, on the heap, instead.

2. A function calls itself recursively without end, as shown below:

void infinite_loop()
{
    infinite_loop();
}

int main(int argc, char* argv[])
{
    infinite_loop();
    return 0;
}

In practice, nobody writes such obviously wrong code on purpose, but the same result is often caused by carelessness or by functions that call each other. The solution is to find and fix the bug.

Solution

Now let us return to our own problem. The main cause is that we overlooked how Boost smart pointers behave during destruction (which is indeed easy to overlook). Once the problem is understood, there are several possible solutions:

1. Increase stack space

Open the "Link" tab under "Project/Settings" and select the "Output" category. The "Reserve" value under "Stack allocations" is the size of the stack space. The default in VC6 is 1 MB; you can increase it as needed and rebuild the program. For details, see the /STACK option in MSDN. This solution is simple and works for some problems, but it cannot meet my requirements here.

One more remark: there is an even simpler way to increase the stack space, which is to use the editbin tool shipped with VC. It changes the stack size of an executable directly, without rebuilding the program. The usage is:

editbin /STACK:reserve[,commit] [files]

2. Limit the queue length

This method is not feasible because it cannot meet my application requirements.

3. Use another method to implement the message queue

That is, instead of using boost::shared_ptr, build the message queue with raw pointers or std::list. However, the model of my program is much more complex than the test code above and involves other factors, so this method is not feasible for me.

4. Disconnect the Message Queue

This is the solution I finally chose. The nested calls during destruction are caused by the reference counting of boost::shared_ptr. As long as the message chain is disconnected link by link before it is destroyed, the problem does not occur. For the test example above, adding the following code before return 0 is enough to avoid the "stack overflow":

pcur = phead;
// VC6 keeps i from the earlier loop in scope; a standard-conforming compiler needs "int i" here
for (i = 1; i < 2000; ++i) // disconnect the message chain link by link
{
    phead = pcur->m_pnext;
    pcur->m_pnext.reset();
    pcur = phead;
}

Summary

Although this article is based on one practical example, the techniques in it are general, such as increasing the stack space through the linker option or increasing the stack space of an executable with editbin.

In addition, pay special attention to how queues built from smart pointers are destroyed.

(Freefalcon on)
