1. Null pointer
2. Wild pointer
3. Array out of bounds
4. Integer divided by 0
5. Formatted output parameter error
6. Buffer Overflow
7. Deliberately raised exception
Crashes on Android fall into two types:
1. Java Crash
Java code causes the JVM to exit and the "program has crashed" dialog box pops up; the process exits once the user clicks Close.
Logcat prints the Java call stack under the "AndroidRuntime" tag.
2. Native Crash
C/C++ code developed with the NDK causes the process to receive an error signal and crash. Before Android 5.0 the process simply exits (the app vanishes); from Android 5.0 on, the "program has crashed" dialog box is shown.
Logcat prints the dump under the "DEBUG" tag:
Error signal: 11 is the signal number (signum), SIGSEGV is the name of the signal, and SEGV_MAPERR is a code under SIGSEGV.
Register snapshot: the register values saved when the process received the error signal. The PC register holds the address of the next instruction to run (the place where the error occurred).
Call stack: #00 is the top of the stack and #02 is the bottom; #02 calls #01, which calls #00. Frame #00 is the Testcrash method of the Spirit class in libspirit.so, and the error is at offset 17 within the machine code of Testcrash (an assembly offset, not a source line number!).
Two, what is an error signal
Android is essentially Linux, and its signals are Linux signals. A signal is an inter-process communication mechanism and does not in itself mean an error, but some signals have been given a specific meaning and a specific default action.
We usually speak of 5 error signals (Bugly can report all of them). The system's default handling is to dump the stack and exit the process.
Signals usually come from three sources:
Hardware exceptions: the hardware (usually the CPU) detects an error condition and notifies the Linux kernel, which handles the exception and sends a signal to the corresponding process. Examples include executing a faulting machine instruction, such as a division by 0, or referencing an inaccessible region of memory. Most processes do not handle these signals, so the default action, killing the process, applies. In this article, SIGSEGV (segmentation fault), SIGBUS (memory access error), and SIGFPE (arithmetic exception) belong to this category.
A library called by the process detects an error and sends the process an abort signal, which terminates the process by default. In this article, SIGABRT (process abort) belongs to this category.
A user (out of idle curiosity) or a third-party app (maliciously) sends a signal to the process with kill -signal PID. In that case the si_code of the signal is less than or equal to 0.
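The third source can be observed from inside a process. The following is a minimal sketch, assuming a POSIX system; the helper name si_code_of_user_sent_signal is made up for this illustration, and SIGUSR1 is used only because it is harmless. It installs a handler that receives siginfo_t, sends the signal with kill(), and reports the si_code that arrived.

```c
#define _POSIX_C_SOURCE 200809L /* expose sigaction and siginfo_t under strict -std modes */
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Last si_code seen by the handler; starts at 1 so an unset value is visible. */
static volatile sig_atomic_t last_si_code = 1;

/* Handler installed with SA_SIGINFO so that it receives the siginfo_t. */
static void record_si_code(int signum, siginfo_t *info, void *context) {
    (void)signum;
    (void)context;
    last_si_code = info->si_code;
}

/* Hypothetical helper: sends SIGUSR1 to this process with kill() and
 * returns the si_code delivered to the handler (1 if setup failed). */
int si_code_of_user_sent_signal(void) {
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_sigaction = record_si_code;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    if (sigaction(SIGUSR1, &action, NULL) != 0) {
        return 1; /* could not install the handler */
    }
    /* Same delivery path as running `kill -USR1 <pid>` from a shell. */
    kill(getpid(), SIGUSR1);
    return last_si_code;
}
```

A crash handler could use the same check to tell a signal sent by another process from one raised by a hardware exception, whose si_code is positive (SEGV_MAPERR, for example).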
Three, a look at several common errors
1. Null pointer
Code example
int* p = 0; // null pointer
*p = 1; // writes to the memory the null pointer points to; raises SIGSEGV and crashes
Cause analysis
In a process's address space, the first page starting at address 0 is marked neither readable nor writable. When an instruction tries to access an address in that page (for example, reading the memory a null pointer points to), the processor raises an exception and the Linux kernel sends the process a segmentation fault signal (SIGSEGV). The default action is to kill the process and produce a core file.
Workaround
Check the pointer before using it, and do not dereference it if it is null.
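As a minimal sketch of that check (store_checked is a hypothetical helper name, not something from the original code), a write through a pointer can be guarded like this:

```c
#include <stddef.h>

/* Writes value through target only when target is non-null.
 * Returns 1 on success, 0 when the pointer was null and nothing was written. */
int store_checked(int *target, int value) {
    if (target == NULL) {
        return 0; /* dereferencing here would raise SIGSEGV */
    }
    *target = value;
    return 1;
}
```

The caller then decides what a failed write means, instead of letting the process die on the spot.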
Bug comments
Null pointer bugs creep in easily when a large codebase is written under schedule pressure, but they are also easy to find and fix.
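2. Wild pointer
Code example
int* p; // wild pointer: uninitialized, so the address it points to is usually random
*p = 1; // writes to the memory the wild pointer points to; may not crash right away but corrupt memory elsewhere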
Cause analysis
A wild pointer points to an invalid address. If that address is not accessible, the process crashes at once (the kernel sends it the segmentation fault signal SIGSEGV) and the bug is found quickly.
If the address is writable and the wild pointer modifies the memory there, the crash is likely to happen some time later, when other code uses that memory. The call stack shown for the crash then probably has nothing to do with the code where the wild pointer lives.
Workaround
Always initialize a pointer variable when defining it, especially pointer members of a struct or class.
After freeing the memory a pointer points to, set the pointer to NULL (this does not help, however, if another pointer elsewhere still points to that memory).
Memory corruption caused by wild pointers is sometimes hard to find by reading the code, and code analysis tools struggle with it too; often only dedicated memory-checking tools can find such bugs.
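The first two points can be sketched as follows. The struct Player and its helper functions are hypothetical names invented for this illustration; the idea is only that a pointer member always holds either NULL or a valid address.

```c
#include <stdlib.h>

/* Hypothetical struct with a pointer member, for illustration only. */
struct Player {
    int *scores;
};

/* Gives the pointer member a known value so it never holds garbage. */
void player_init(struct Player *player) {
    player->scores = NULL;
}

/* Allocates zero-filled room for count scores.
 * Returns 1 on success, 0 when the allocation failed. */
int player_alloc_scores(struct Player *player, size_t count) {
    player->scores = calloc(count, sizeof(int));
    return player->scores != NULL;
}

/* Frees the scores and resets the pointer, so later checks see NULL
 * instead of a dangling address. free(NULL) is a no-op, so calling
 * this function twice is harmless. */
void player_release(struct Player *player) {
    free(player->scores);
    player->scores = NULL;
}
```

As noted above, resetting the pointer only protects this one copy; any other pointer to the same memory is still dangling after the free.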
Bug Comments
Wild pointer bugs, especially memory corruption, sometimes leave no clue at all and make developers feel lost and helpless (the stack reported to Bugly shows nothing wrong). Memory corruption bugs can fairly be called the biggest killer of server stability, and they are one of the biggest disadvantages of C/C++ compared with other languages (such as Java and C#).
3. Array out of bounds
Code example
int arr[10];
arr[10] = 1; // out of bounds; may not crash right away but corrupt memory elsewhere
Cause analysis
An out-of-bounds array access, like a wild pointer, touches an invalid address. If the address is not accessible, the process crashes at once (the kernel sends it the segmentation fault signal SIGSEGV). If the memory is modified instead, the resulting corruption is likely to cause a crash somewhere else some time later.
Workaround
Add a bounds check to every loop that traverses an array.
When accessing an array by index, check that the index is within bounds.
Code analysis tools can find the vast majority of out-of-bounds accesses.
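A bounds-checked write might look like the sketch below. ARR_LEN and set_element are names assumed for this illustration; the array length matches the int arr[10] used in the code example of this section.

```c
#include <stddef.h>

#define ARR_LEN 10

/* Stores value at arr[index] only when index is inside the array.
 * Returns 1 on success, 0 when the index is out of bounds. */
int set_element(int arr[ARR_LEN], size_t index, int value) {
    if (index >= ARR_LEN) {
        return 0; /* index 10 is one past the last valid element */
    }
    arr[index] = value;
    return 1;
}
```

Using an unsigned index type means a single comparison also rejects negative values that were converted to size_t.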
Bug Comments
An out-of-bounds array access is also a memory corruption bug, and it can be as hard to track down as a wild pointer.
4. Integer divided by 0
Code example
int a = 1;
int b = a / 0; // integer division by 0; raises SIGFPE and crashes
Cause analysis
Dividing an integer by 0 raises SIGFPE. The name means floating-point exception, but the signal does not necessarily involve floating-point arithmetic: integer arithmetic exceptions use the same signal for backward compatibility. The default action is to terminate the process and generate a core file.
Workaround
Before an integer division, check whether the divisor is 0.
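A minimal sketch of such a check follows; divide_checked is a hypothetical helper name. The extra INT_MIN / -1 case is not mentioned in the text above and is added here as an assumption worth stating: its quotient does not fit in an int, so the division is undefined behaviour in C.

```c
#include <limits.h>

/* Divides dividend by divisor and stores the quotient in *result.
 * Returns 1 on success, 0 when the division would be invalid;
 * *result is left untouched in that case. */
int divide_checked(int dividend, int divisor, int *result) {
    if (divisor == 0) {
        return 0; /* the case that raises SIGFPE */
    }
    if (dividend == INT_MIN && divisor == -1) {
        return 0; /* quotient overflows int (added assumption, see lead-in) */
    }
    *result = dividend / divisor;
    return 1;
}
```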
Bug Comments
Integer division by 0 is easily overlooked by developers, because a divisor of 0 rarely shows up in the development environment. In production, with a large user base and complex user input, a divisor of 0 turns up easily.
5. Formatted output parameter error
Code example
// Wrong format arguments can cause an illegal memory access and a crash
char text[200];
snprintf(text, 200, "Valid %u, Invalid %u %s", 1); // arguments do not match the format string
Cause analysis
A format argument error resembles a wild pointer, but it only reads from an invalid address and does not corrupt memory. The result is therefore either garbled output or, if the memory accessed is not readable, the segmentation fault signal SIGSEGV and an immediate crash.
Workaround
When writing a format string and its arguments, make sure the number and types of the arguments match the format.
Add -Wformat to GCC's compilation options so that GCC detects such errors at compile time.
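For comparison, here is a sketch of the same format string with one argument per conversion specifier. format_status and its parameter names are hypothetical, chosen only for this illustration.

```c
#include <stdio.h>

/* Formats the message with exactly one argument per specifier:
 * two unsigned values for %u and one string for %s.
 * Returns what snprintf returns (the length the full message needs). */
int format_status(char *text, size_t size,
                  unsigned valid, unsigned invalid, const char *note) {
    return snprintf(text, size, "Valid %u, Invalid %u %s", valid, invalid, note);
}
```

Wrapping the call in a function with typed parameters also gives the compiler a fixed argument list to check, rather than relying on every call site to get the variadic arguments right.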
6. Buffer Overflow
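Code example
char szBuffer[10];
// The function stack grows from high addresses toward low addresses, while sprintf writes characters from low addresses toward high addresses.
// If the output exceeds the buffer size, the function's stack frame is corrupted and the function jumps to an unknown address when it returns.
// This almost always causes an access violation, raising SIGABRT or SIGSEGV and crashing.
sprintf(szBuffer, "Stack Buffer Overrun!111111111111111" "111111111111111111111");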
Cause analysis
Writing more data into a buffer than it can hold overflows the buffer, corrupts the function call stack, and overwrites the return address of the function. If this is not a deliberate attack, the function will most likely return to a region of memory that cannot be accessed, raising SIGSEGV or SIGABRT, crashing the program, and generating a core file.
Workaround
Check every call to a vulnerable library function, such as sprintf or strcpy, that does not check the length of its input.
Use library calls that take a length, such as snprintf instead of sprintf, or wrap sprintf in your own function that checks the length.
When compiling with GCC at optimization level -O1 or higher, add -D_FORTIFY_SOURCE=level (where level is 1 or 2; the larger the value, the stricter the checking). GCC will then report the buffer overflows it can detect at compile time.
Add -fstack-protector or -fstack-protector-all when compiling with GCC to enable the stack-smashing protector (SSP). This feature inserts stack-checking code into the generated assembly, detects stack corruption at run time, and prints a report.
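The "wrap it with a length check" advice could look like the sketch below. copy_message is a hypothetical wrapper name; it relies on the fact that snprintf never writes more than size bytes and returns the length the full output would have needed.

```c
#include <stdio.h>

/* Copies message into buffer without writing past size bytes.
 * Returns 1 when the whole message fit, 0 when it was truncated
 * or snprintf reported an error. */
int copy_message(char *buffer, size_t size, const char *message) {
    int needed = snprintf(buffer, size, "%s", message);
    return needed >= 0 && (size_t)needed < size;
}
```

With a 10-byte buffer, the long string from the code example in this section would be cut to 9 characters plus the terminator, and the return value would tell the caller that truncation happened.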
Bug Comments
Buffer overflow is a very common and dangerous vulnerability that exists widely in operating systems and application software. In a real attack, the input string usually does not crash the program. Instead it overwrites the function's return address so that the program jumps elsewhere and executes instructions arranged by the attacker.
When you debug the core file generated after a buffer overflow, the call stack looks scrambled, because the function's return address has been overwritten with an arbitrary address.
After a server crash, if the core file and the executable match but the call stack makes no sense, a buffer overflow is very likely.
7. Deliberately raised exception
Code example
if ((*env)->ExceptionOccurred(env) != 0) {
    // When a dynamic library hits an internal error, it usually aborts deliberately and stops running
    abort(); // sends SIGABRT to the current process
}
Workaround
Examine the stack to find out why abort was called.
Bug Comments
If the program aborts deliberately, the stack plus the source code usually makes the cause easy to locate. Often, however, the abort happens inside a system library, which is harder to pin down. In that case, review how the system API is being used and check for misuse.
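That abort() really ends the process with SIGABRT can be checked with a small experiment. This is only a sketch, assuming a POSIX system where fork() is available; the function name signal_that_ends_aborting_child is made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L /* expose fork and waitpid under strict -std modes */
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that calls abort() and returns the number of the signal
 * that terminated it, or -1 if the child could not be created or did
 * not die from a signal. */
int signal_that_ends_aborting_child(void) {
    pid_t child = fork();
    if (child < 0) {
        return -1; /* fork failed */
    }
    if (child == 0) {
        abort(); /* child: raises SIGABRT against itself and never returns */
    }
    int status = 0;
    if (waitpid(child, &status, 0) != child) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The parent sees the same signal number that shows up in a crash dump when a library gives up in this way.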
Four, a word from the editor
Java exceptions are already a headache, and native crashes are scarier still: there are far more kinds of them than Java exceptions, and the stack alone is often not enough to locate the problem (draws a little circle to curse the wild pointer). Many thanks to colleague Wang Yan for summing up everyday development experience with crashes in this valuable article!