How to print a stack and prevent data loss after a process crashes

Source: Internet
Author: User


When a process hits a logic error at run time, such as a division by zero or a null-pointer dereference, the CPU raises an exception. The kernel reports the exception to the process as a signal, and the default action for these signals is to terminate the process. When that happens, we say the process has crashed.
After a crash, we want to know why the process crashed: which function and which line of code caused the error. In addition, before the process exits, we would like to do some cleanup, such as saving data to the database.
Next, I'll introduce some techniques to achieve these two goals.
1. Viewing stack information in a core file
If we can see the stack as it was when the process crashed, we can quickly locate the faulty code. Compiling with GCC's -g option includes debug information in the executable. When the process crashes, a core file is generated (provided core dumps are enabled), and we can open it in GDB to inspect the state of the process at the moment of the crash.
During debugging, the core file is very convenient. In a production environment, however, it has serious limitations: 1. An executable that contains debug information can be large, and debug builds are usually compiled without optimization, so they run much more slowly. 2. A core file is often large, and if the process crashes frequently, disk space runs short.
Therefore, a program running in production usually contains no debug information, and we set the core file size limit to 0 so that no core file is written. Under these conditions, how do we obtain the stack of the process?
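Setting the core file size to 0 is normally done in the shell before the program starts, but a process can also lower the limit for itself. The following is a minimal sketch of that idea using getrlimit and setrlimit; the function names disable_core_dumps and core_dumps_disabled are hypothetical and not part of the original article.

```cpp
#include <sys/resource.h>

// Hypothetical helper: lowers the soft core-file size limit of the calling
// process to 0, so that no core file is written when it crashes.
// The hard limit is left untouched. Returns true on success.
bool disable_core_dumps() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    lim.rlim_cur = 0;
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}

// Hypothetical helper: reports whether the soft core-file limit is 0.
bool core_dumps_disabled() {
    struct rlimit lim;
    return getrlimit(RLIMIT_CORE, &lim) == 0 && lim.rlim_cur == 0;
}
```

Calling disable_core_dumps() early in main() has the same effect for that process as running ulimit -c 0 in the shell that launches it.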
2. Obtaining a thread's stack at run time
glibc provides the backtrace function, which returns the call stack of the current thread at run time. Using backtrace has two requirements: 1. The program uses the ELF binary format. 2. The program is linked with the -rdynamic option. -rdynamic tells the linker to add all symbols to the dynamic symbol table, which is far less information than the -g option produces.
The functions we will use are described below. All of them are declared in <execinfo.h>:
#include <execinfo.h>

int backtrace(void **buffer, int size);
Gets the call stack of the current thread and stores it in buffer, which is an array of pointers. The parameter size specifies how many void* elements buffer can hold. The return value is the number of pointers actually stored, which never exceeds size.
Note: Some compiler optimizations interfere with obtaining a correct call stack. Inlined functions have no stack frame, and omitting the frame pointer can also cause the stack to be parsed incorrectly.

char **backtrace_symbols(void *const *buffer, int size);
Converts the information obtained from backtrace into an array of strings. The parameter buffer should be the pointer array filled in by backtrace, and size is the number of elements in it (the return value of backtrace). The return value is a pointer to an array of strings with the same number of elements as buffer. Each string is a printable description of the corresponding element in buffer: the function name, the offset into the function, and the actual return address. The returned array is allocated with malloc, so the caller must release it with free.
Note: If memory for the strings cannot be allocated, the function returns NULL.

void backtrace_symbols_fd(void *const *buffer, int size, int fd);
Does the same job as backtrace_symbols, except that it does not return a string array to the caller. Instead, it writes the result to the file whose descriptor is fd, one line per function.
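To show how these calls fit together, here is a small sketch that collects the current thread's stack as a list of strings. The helper name capture_stack and the default of 64 frames are assumptions made for illustration only.

```cpp
#include <execinfo.h>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: returns one printable line per stack frame of the
// calling thread, or an empty list if nothing could be obtained.
std::vector<std::string> capture_stack(int max_frames = 64) {
    std::vector<std::string> lines;
    if (max_frames <= 0) {
        return lines;
    }
    std::vector<void*> frames(max_frames);
    int n = backtrace(frames.data(), max_frames);   // n never exceeds max_frames
    char** symbols = backtrace_symbols(frames.data(), n);
    if (symbols == NULL) {
        return lines;                               // allocation failed
    }
    for (int i = 0; i < n; ++i) {
        lines.push_back(symbols[i]);
    }
    free(symbols);   // a single free releases the whole array
    return lines;
}
```

Function names only appear in the output if the program was linked with -rdynamic; without it, most lines show just the module name and an address.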
3. Catching signals
We want to print the stack when the process crashes, so we need to catch the corresponding signals. The method is simple.
#include <signal.h>
void (*signal(int signum, void (*handler)(int)))(int);
Or, equivalently:
typedef void (*sig_t)(int);
sig_t signal(int signum, sig_t handler);
Parameters: The first parameter, signum, is the signal to handle. It can be any signal except SIGKILL and SIGSTOP. The second parameter, handler, describes the action associated with the signal and can take one of three values:
1. The address of a function with no return value, which is our signal handler. It should be declared as: void func(int sig); where sig is the only argument passed to it. After the signal() call, as soon as the process receives a signal of type signum, it executes func() immediately, regardless of which part of the program is running. When func() returns, control goes back to the point where the process was interrupted.
2. SIG_IGN: ignore the signal.
3. SIG_DFL: restore the default handling of the signal.
Return value: the previous signal handler, or SIG_ERR (-1) on error.
Note: While the handler for a signal is running, if the process receives the same signal again, that signal is held pending and does not interrupt the handler. It is delivered, and the handler is called again, once the handler has finished. If the process receives a different signal while the handler is running, the handler is interrupted. On systems with System V semantics, when a signal is delivered to a custom handler, the system automatically resets the disposition to the default (glibc's signal() keeps the handler installed by default). To control this behavior explicitly, use sigaction() instead.
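Because the behavior of signal() differs between systems, sigaction() is the more predictable choice. The sketch below installs a handler that stays installed after each delivery; the names g_last_signal, on_signal, and install_handler are hypothetical.

```cpp
#include <signal.h>
#include <string.h>

// Last signal number seen by the handler (0 means none yet).
volatile sig_atomic_t g_last_signal = 0;

// Handler: only records the signal number, which is async-signal-safe.
void on_signal(int sig) {
    g_last_signal = sig;
}

// Hypothetical helper: installs on_signal for signum with sigaction().
// SA_RESETHAND is not set, so the handler remains installed after it runs.
bool install_handler(int signum) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);   // block no additional signals in the handler
    sa.sa_flags = 0;
    return sigaction(signum, &sa, NULL) == 0;
}
```

After install_handler(SIGUSR1), every raise(SIGUSR1) runs on_signal again. Adding SA_RESETHAND to sa_flags would give the one-shot behavior described in the note above.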
4. Examples
Let's write code that prints the process stack after catching a signal and then terminates the process.
#include <iostream>
#include <map>
#include <string>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <signal.h>
#include <execinfo.h>
#include <fcntl.h>
#include <unistd.h>
#include <pthread.h>

using namespace std;

map<int, string> sig_list;

#define SET_SIG(sig) sig_list[sig] = #sig;

void Setsiglist() {
    sig_list.clear();
    SET_SIG(SIGILL)     // illegal instruction
    SET_SIG(SIGBUS)     // bus error
    SET_SIG(SIGFPE)     // floating-point exception
    SET_SIG(SIGABRT)    // termination signal from the abort function
    SET_SIG(SIGSEGV)    // invalid memory reference (segmentation fault)
    SET_SIG(SIGPIPE)    // write to a pipe with no reader
    SET_SIG(SIGTERM)    // software termination signal
    SET_SIG(SIGSTKFLT)  // coprocessor stack fault
    SET_SIG(SIGXFSZ)    // file size limit exceeded
    SET_SIG(SIGTRAP)    // trace trap
}

string& getsigname(int sig) {
    return sig_list[sig];
}

// Note: fopen, sprintf, and backtrace_symbols are not async-signal-safe.
// This handler runs only once, on a fatal signal, and then ends the process.
void Savebacktrace(int sig) {
    // Open the file
    time_t tsettime;
    time(&tsettime);
    tm* ptm = localtime(&tsettime);
    char fname[256] = {0};
    sprintf(fname, "core.%d-%d-%d_%d_%d_%d",
            ptm->tm_year + 1900, ptm->tm_mon + 1, ptm->tm_mday,
            ptm->tm_hour, ptm->tm_min, ptm->tm_sec);
    FILE* f = fopen(fname, "a");
    if (f == NULL) {
        exit(1);
    }
    int fd = fileno(f);

    // Lock the file
    struct flock fl;
    fl.l_type = F_WRLCK;
    fl.l_start = 0;
    fl.l_whence = SEEK_SET;
    fl.l_len = 0;
    fl.l_pid = getpid();
    fcntl(fd, F_SETLKW, &fl);

    // Write the absolute path of the program
    char buffer[4096];
    memset(buffer, 0, sizeof(buffer));
    // Leave room for the '\n' and '\0' appended below
    ssize_t count = readlink("/proc/self/exe", buffer, sizeof(buffer) - 2);
    if (count > 0) {
        buffer[count] = '\n';
        buffer[count + 1] = 0;
        fwrite(buffer, 1, count + 1, f);
    }

    // Write the time of the dump
    memset(buffer, 0, sizeof(buffer));
    sprintf(buffer, "Dump time: %d-%d-%d %d:%d:%d\n",
            ptm->tm_year + 1900, ptm->tm_mon + 1, ptm->tm_mday,
            ptm->tm_hour, ptm->tm_min, ptm->tm_sec);
    fwrite(buffer, 1, strlen(buffer), f);

    // Write the thread and the signal
    sprintf(buffer, "Curr thread: %lu, Catch signal: %s\n",
            (unsigned long)pthread_self(), getsigname(sig).c_str());
    fwrite(buffer, 1, strlen(buffer), f);

    // Write the stack
    void* dumparray[256];
    int nsize = backtrace(dumparray, 256);
    sprintf(buffer, "backtrace rank = %d\n", nsize);
    fwrite(buffer, 1, strlen(buffer), f);
    if (nsize > 0) {
        char** symbols = backtrace_symbols(dumparray, nsize);
        if (symbols != NULL) {
            for (int i = 0; i < nsize; i++) {
                fwrite(symbols[i], 1, strlen(symbols[i]), f);
                fwrite("\n", 1, 1, f);
            }
            free(symbols);
        }
    }

    // Unlock and close the file, then terminate the process
    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);
    fclose(f);
    exit(1);
}

void Setsigcatchfun() {
    map<int, string>::iterator it;
    for (it = sig_list.begin(); it != sig_list.end(); it++) {
        signal(it->first, Savebacktrace);
    }
}

void Fun() {
    int a = 0;
    int b = 1 / a;  // integer division by zero raises SIGFPE
}

static void* Threadfun(void* arg) {
    Fun();
    return NULL;
}

int main() {
    Setsiglist();
    Setsigcatchfun();

    printf("Main thread id = %lu\n", (unsigned long)pthread_self());
    pthread_t pid;
    if (pthread_create(&pid, NULL, Threadfun, NULL)) {
        exit(1);
    }
    printf("Fun thread id = %lu\n", (unsigned long)pid);

    for (;;) {
        sleep(1);
    }
    return 0;
}
Save the file as bt.cpp and compile it with:
g++ bt.cpp -rdynamic -I/usr/local/include -L/usr/local/lib -pthread -o bt
The main thread creates the Fun thread, and the Fun thread performs a division by zero, so the system raises the SIGFPE signal. The signal interrupts the Fun thread; our registered Savebacktrace function catches it, writes the relevant information, and then terminates the process. In the output file (named core. followed by a timestamp), we can see the basic stack information.
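One caveat about the example: functions such as fopen, sprintf, and backtrace_symbols are not async-signal-safe, so calling them from a handler can misbehave if the crash happened inside malloc or stdio. A more conservative variant, sketched below, uses only open, write, and backtrace_symbols_fd. The names dump_stack_to_fd and crash_handler and the file name crash.log are assumptions for illustration.

```cpp
#include <execinfo.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper: writes the calling thread's stack to fd, one frame
// per line, without building a string array. Returns the frame count.
int dump_stack_to_fd(int fd) {
    void* frames[64];
    int n = backtrace(frames, 64);
    backtrace_symbols_fd(frames, n, fd);
    return n;
}

// Hypothetical handler: appends the stack to crash.log and ends the process.
void crash_handler(int sig) {
    static const char msg[] = "caught fatal signal, stack follows:\n";
    int fd = open("crash.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd >= 0) {
        ssize_t ignored = write(fd, msg, sizeof(msg) - 1);
        (void)ignored;          // nothing useful can be done on failure
        dump_stack_to_fd(fd);
        close(fd);
    }
    (void)sig;
    _exit(1);                   // _exit skips atexit handlers and stdio flushing
}
```

The handler would be registered with signal(SIGSEGV, crash_handler) or sigaction. According to the glibc documentation, the first call to backtrace may load libgcc and allocate memory, so it is worth calling backtrace once at startup before any handler can run.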
5. Cleanup after a crash
In the example above, the Fun thread is interrupted by SIGFPE and runs the Savebacktrace function instead. At this point, the main thread is still running normally. If we replace the final exit(1); in Savebacktrace with for (;;) sleep(1); the main thread can keep running. With this property, we can do many other things.
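The idea above can be sketched as follows: the handler parks the faulting thread and sets a flag, and the main thread watches the flag so that it can save data before exiting. The names g_crashed, park_handler, install_park_handler, and wait_for_crash are hypothetical, and what to save is left to the application.

```cpp
#include <signal.h>
#include <string.h>
#include <unistd.h>
#include <atomic>

// Set by the handler when any thread has hit a fatal signal.
std::atomic<bool> g_crashed(false);

// Handler: records the crash, then parks the faulting thread forever so
// that it never returns to the instruction that caused the fault.
void park_handler(int) {
    g_crashed.store(true);
    for (;;) {
        sleep(1);
    }
}

// Hypothetical helper: installs park_handler for signum.
bool install_park_handler(int signum) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = park_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, NULL) == 0;
}

// Hypothetical helper for the main thread: polls once per second, for at
// most max_seconds, and reports whether a crash has been recorded.
bool wait_for_crash(int max_seconds) {
    for (int i = 0; i < max_seconds && !g_crashed.load(); ++i) {
        sleep(1);
    }
    return g_crashed.load();
}
```

In a real server, the main thread would call wait_for_crash in its loop and, once it returns true, flush pending data to the database and then call exit. The state left behind by the crashed thread may be inconsistent, so this cleanup should touch as little shared data as possible.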
A game server process often has these threads: network threads, database threads, and business-logic threads. Code that triggers a logic error is usually located in a business-logic thread.
