Windows Client Crash Analysis and Debugging

Source: Internet
Author: User

This article describes some techniques for crash analysis on Windows, and also touches on multi-process debugging, deadlocks, and so on.

1. Crash Analysis Process
1.1 Confirm error code
Whether you use WinDbg or VS, the first thing to look at is the error code. More than 90% of crashes are access violations.
For an access violation, look at the address being accessed. If the address is 0 or close to 0 (for example 0x00000008 or 0xFFFFFFFC), the crash is usually related to a null pointer. If it looks like a normal address, the cause is usually access to an object that has already been destructed, or heap corruption.
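
The rule of thumb above can be written down as a small triage helper. This is only a sketch for 32-bit addresses: the function name and the 64 KB window are assumptions made for illustration, not values taken from any tool.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: rough triage of the faulting address of a 32-bit
// access violation. The 64 KB window is an assumed threshold.
std::string ClassifyFaultAddress(std::uint32_t address) {
    const std::uint32_t kWindow = 0x10000u;  // 64 KB on either side of zero
    if (address < kWindow) {
        // e.g. 0x00000008: a member read through a null object pointer
        return "null pointer plus small positive offset";
    }
    if (address > 0xFFFFFFFFu - kWindow) {
        // e.g. 0xFFFFFFFC: a null pointer adjusted by a small negative offset
        return "null pointer plus small negative offset";
    }
    // Looks like a normal address: suspect a destructed object or heap corruption.
    return "destructed object or heap corruption";
}
```

A call such as ClassifyFaultAddress(0x00000008) lands in the first bucket, while an ordinary-looking heap address falls through to the last one.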

1.2 Confirm the C++ operation behind the crash
Determine which C++ operation caused the crash.
An access violation, for example, usually needs a mov instruction to trigger the memory access that crashes. Which step of the C++ code does that mov correspond to?
Take a->b->c->foo();
The source view will point at this line, but it does not tell you which step of the access failed. At this point, look at the corresponding assembly code.
There will probably be several mov instructions, and a simple analysis shows at which step the access failed.

Impact on coding:
This means you should not write anything too complex in a single statement, such as
x ? b[i] : y > 0 ? c->member[8] : *ptr;
When such code crashes, it is hard to work out which access went wrong.
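
As a sketch of the alternative, the same selection can be written with one memory access per statement, so that a crash address maps to a single dereference. The Holder type and the Select function are hypothetical names introduced for this example, and an optimizing compiler may still merge or reorder the statements.

```cpp
#include <cassert>

// Hypothetical type standing in for whatever "c" points to.
struct Holder {
    int member[16];
};

// Same logic as the one-line conditional, split so that each statement
// dereferences exactly one pointer.
int Select(bool x, const int* b, int i, int y, const Holder* c, const int* ptr) {
    if (x) {
        const int from_b = b[i];          // only b is dereferenced here
        return from_b;
    }
    if (y > 0) {
        const int from_c = c->member[8];  // only c is dereferenced here
        return from_c;
    }
    const int from_ptr = *ptr;            // only ptr is dereferenced here
    return from_ptr;
}
```

With this shape, the crashing source line already says which pointer was bad, before any assembly is read.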

Virtual function call:
Usually
mov edx, dword ptr [ecx]
mov edx, dword ptr [edx+0x??]
call edx
means a virtual function call, and each of these lines is a likely crash position (in its message, VS marks the location of the statement after the one that crashed).
A crash on the first line means the pointer itself is invalid: either null or pointing to an illegal address.
A crash on the second line means the object has been destructed: the memory that ecx points to is accessible, but its contents are wrong, so the virtual function table pointer is wrong.
A crash on the last line usually leaves a call stack whose top frame cannot be resolved: in VS the frame shows only an address and nothing else.
This also generally means the object has been destroyed.
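
To make the three steps concrete, here is a hand-written model of virtual dispatch using an explicit table of function pointers. It is an illustration only: the Object and VTable types are hypothetical, and the layout a real compiler generates is an implementation detail.

```cpp
#include <cassert>
#include <cstddef>

struct Object;
using Method = int (*)(Object*);

// Hand-written stand-in for a compiler-generated virtual function table.
struct VTable {
    Method slots[2];
};

struct Object {
    const VTable* vptr;  // first field, where the hidden vptr usually sits
    int value;
};

int GetValue(Object* self) { return self->value; }
int GetDouble(Object* self) { return self->value * 2; }

const VTable kObjectVTable = {{&GetValue, &GetDouble}};

int CallSlot(Object* obj, std::size_t slot) {
    assert(slot < 2);
    const VTable* table = obj->vptr;     // mov edx, dword ptr [ecx]    : fails if obj is invalid
    Method target = table->slots[slot];  // mov edx, dword ptr [edx+0x??]: fails if vptr is garbage
    return target(obj);                  // call edx                     : fails if the slot is garbage
}
```

Each line of CallSlot corresponds to one of the three crash positions discussed above.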

Objects and destructors:
Destructors are a frequent crash site. If there is no user-provided destructor, the crash is located in just a few lines of compiler-generated assembly. So it does no harm to write a destructor anyway, so that the crash can at least be attributed to a destructor.
Outside the destructor body there is also a lot of assembly code that destroys the object's members. When the crash is in there, it is hard to tell which member is being destructed.
If members are held by pointer and you call release or delete on them in the destructor, the calls are explicit and there is no need to guess which member is being destructed. Of course, whether to use pointers or value objects also has logical trade-offs, which are not discussed here.
If the crash location is at a call or jmp to A::~A(), you can infer that the object being destructed is of type A.
Members are destructed from last to first, and classes from derived class to base class. Combined with the crash position, this lets you guess which object is being destructed.
The object layout, such as the offset of a member within the object, may also help you guess which member's destructor has the problem.
You can artificially add some padding bytes to the object layout to make the this object recognizable. (Production crash data contains no heap, because capturing and uploading a full dump is impractical. A this pointer that points to the heap therefore may not be viewable, while one on the stack may be, which helps the analysis.)
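
The following sketch shows the destruction order in a form that can be checked: a user-provided destructor that deletes a pointer member explicitly, value members destroyed in reverse declaration order, and the base class last. All type names, the log, and the marker value are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records the order in which things are destroyed.
std::vector<std::string>& DestructionLog() {
    static std::vector<std::string> log;
    return log;
}

struct Tracked {
    explicit Tracked(std::string n) : name(std::move(n)) {}
    ~Tracked() { DestructionLog().push_back(name); }
    std::string name;
};

struct Base {
    virtual ~Base() { DestructionLog().push_back("Base"); }
};

class Widget : public Base {
public:
    Widget() : first_("first"), second_("second"), owned_(new Tracked("owned")) {}
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;

    // User-provided destructor: the crash gets a named frame, and the
    // explicit delete shows exactly which member is being destroyed.
    ~Widget() override {
        delete owned_;
        owned_ = nullptr;
        DestructionLog().push_back("Widget body");
    }

    unsigned marker() const { return marker_; }

private:
    unsigned marker_ = 0x57494447u;  // recognizable bytes to spot the object in raw memory
    Tracked first_;
    Tracked second_;
    Tracked* owned_;
};
```

Destroying a Widget logs "owned", "Widget body", "second", "first", "Base" in that order, which is the sequence to keep in mind when mapping a crash offset back to a member.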

Restoring the context:
Dumps collected from production contain little information and cannot be debugged live. Instead, take the crashing module and the crash offset within that module, debug the corresponding binary locally, find the same module and offset, and set a breakpoint there. This recreates a local execution environment for the crash. Of course, local execution will not necessarily crash at that location, but with more context information it is easier to determine the corresponding C++ operation.

Using IDA:
IDA makes the assembly code easier to read and the analysis easier to carry out.

Disabling ASLR and specifying a preferred module load address:
This may make analysis easier. However, it reduces security, so it is best limited to small-traffic versions.

The ln command:
The ln command in WinDbg resolves an address to symbol information, for example showing that the address lies inside a method of a class. Sometimes it returns several candidates, such as A::foo()+0x?? and B::foo1()+0x???, and you have to judge by context.

Code optimization:
Code optimization makes analysis more difficult. You can try changing some compiler options, such as lowering the optimization level, preserving stack frames, and turning off whole program optimization, to make analysis of release builds easier.

What you see may not be true:
The stack frames shown by WinDbg and VS may be wrong: some may be missing in the middle, and some may be scrambled. Frame pointer omission, for instance, may make the VS analysis incorrect.
(Usually it is VS that gets it wrong, but WinDbg fails at times too.)
For a stack that is broken in the middle, you can reconstruct it from return addresses, stack parameters, frame pointer omission data, and so on. But in 90% of cases, even after the stack is reconstructed, it is still unclear what to do next.

Restoring objects:
Production crash dumps contain no heap. You can copy the objects of interest onto the stack (you have to handle deep copying yourself) and then inspect the state of the object in the crash report. (Note that code optimization may eliminate the copy.)
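
A minimal sketch of such a stack copy follows. Session, SessionSnapshot and the helper functions are hypothetical names, and publishing the address through a volatile pointer is only a best-effort way to discourage the optimizer from dropping the copy; it is not a guarantee.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical heap-allocated object whose state we want in the crash report.
struct Session {
    int id;
    int state;
    char user[16];
};

// Flat copy with no pointers, so everything of interest lives in the stack frame.
struct SessionSnapshot {
    unsigned marker;  // recognizable pattern for finding the copy in raw stack memory
    int id;
    int state;
    char user[16];
};

SessionSnapshot MakeSnapshot(const Session& session) {
    SessionSnapshot snapshot;
    snapshot.marker = 0x534E4150u;
    snapshot.id = session.id;
    snapshot.state = session.state;
    std::memcpy(snapshot.user, session.user, sizeof(snapshot.user));
    return snapshot;
}

// Lets the address escape through a volatile pointer so the copy is less
// likely to be optimized away as unused.
void KeepOnStack(const void* address) {
    static const void* volatile sink;
    sink = address;
}

int HandleSession(const Session* heap_session) {
    assert(heap_session != nullptr);
    const SessionSnapshot snapshot = MakeSnapshot(*heap_session);
    KeepOnStack(&snapshot);
    // ... code that might crash while using heap_session goes here ...
    return snapshot.state;
}
```

If the process crashes inside HandleSession, the snapshot is part of that stack frame and can be located in a stack-only dump by its marker value.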

1.3 The C++ logic
Once the relationship between the crash and the C++ operation is established, what remains is a logic problem. Almost always the problem is improper object lifetime management, which in turn leads to an access violation.
Checking a pointer for null can avoid an access violation, but it may let the error spread further. Use null checks sparingly.

When designing or coding, consider how the code will be debugged. For example, when a task is added to the thread pool in Chromium, information about the current call site is generated and bound to the task, so that it can be used to locate errors.
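
The idea of binding call-site information to a task can be sketched as follows. This is not Chromium's actual API: PostedFrom, POSTED_FROM_HERE and TaskQueue are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Where a task was posted from.
struct PostedFrom {
    const char* file;
    int line;
    const char* function;
};

// Captures the current call site at the point of use.
#define POSTED_FROM_HERE PostedFrom{__FILE__, __LINE__, __func__}

struct Task {
    PostedFrom origin;
    std::function<void()> run;
};

class TaskQueue {
public:
    void Post(const PostedFrom& origin, std::function<void()> fn) {
        tasks_.push(Task{origin, std::move(fn)});
    }

    // Runs one task. The task, including its origin, is a local variable,
    // so the origin sits in this stack frame while the task executes.
    bool RunOne(PostedFrom* last_origin) {
        assert(last_origin != nullptr);
        if (tasks_.empty()) return false;
        Task task = std::move(tasks_.front());
        tasks_.pop();
        *last_origin = task.origin;
        task.run();
        return true;
    }

private:
    std::queue<Task> tasks_;
};
```

A caller writes queue.Post(POSTED_FROM_HERE, [] { /* work */ }); and, if the task later crashes on a worker thread, the frame of RunOne still says where the task came from.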

1.4 Heap corruption
There is basically no solution: the crash site is too far from the point where the error was introduced. You can only do your best and leave the rest to fate.
For example, enable page heap. There is a certain probability that the crash then shows up, but it is a matter of luck.
For example, replace the CRT heap, or write one yourself, to strengthen error detection.
For example, the CRT itself, especially the debug heap, fills heap memory with known patterns, which gives some hints when you see them. But recognizing these fill patterns is about all you get; more information is hard to come by.
For example, you can write a debugger yourself, insert your own page heap or use the system page heap to automate detection, and then run large amounts of data through it to reproduce the problem.
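
As one possible form of "write one yourself", here is a minimal sketch of an allocation wrapper that surrounds each block with guard bytes and checks them later. The namespace, the function names, the guard size and the fill value are all arbitrary choices made for this example, and the sketch only catches writes that land in the guard regions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

namespace guarded {

const std::size_t kHeaderSize = 16;     // stores the user size; keeps user data aligned
const std::size_t kGuardSize = 16;      // guard bytes on each side of the user data
const unsigned char kGuardByte = 0xFD;  // arbitrary fill pattern

static_assert(sizeof(std::size_t) <= kHeaderSize, "header must hold the size");

// Layout: [header][front guard][user bytes][back guard]
void* Allocate(std::size_t size) {
    unsigned char* raw = static_cast<unsigned char*>(
        std::malloc(kHeaderSize + 2 * kGuardSize + size));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &size, sizeof(size));
    std::memset(raw + kHeaderSize, kGuardByte, kGuardSize);
    std::memset(raw + kHeaderSize + kGuardSize + size, kGuardByte, kGuardSize);
    return raw + kHeaderSize + kGuardSize;
}

bool AllGuard(const unsigned char* p) {
    for (std::size_t i = 0; i < kGuardSize; ++i) {
        if (p[i] != kGuardByte) return false;
    }
    return true;
}

// True if neither guard region has been overwritten.
bool GuardsIntact(const void* user) {
    const unsigned char* bytes = static_cast<const unsigned char*>(user);
    std::size_t size = 0;
    std::memcpy(&size, bytes - kGuardSize - kHeaderSize, sizeof(size));
    return AllGuard(bytes - kGuardSize) && AllGuard(bytes + size);
}

// Frees the block and reports whether its guards were still intact.
bool Free(void* user) {
    const bool intact = GuardsIntact(user);
    std::free(static_cast<unsigned char*>(user) - kGuardSize - kHeaderSize);
    return intact;
}

}  // namespace guarded
```

Checking the guards at free time (or periodically) moves the point of detection closer to the point of corruption, although a corrupted header or a write that skips past the guard would still go unnoticed.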

2. Other
Multi-process debugging:
Under
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options
create a key named after the process of interest and add a Debugger value whose data is the debugger path, so that the debugger is attached when the process is created. (GFlags also changes this location.) The problem is that some modules are loaded on demand, so at that point a breakpoint cannot yet be set in the corresponding module.
Alternatively, add code such as MessageBox or ATLASSERT at the place of interest, and attach to the process when the dialog box pops up.
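
A sketch of the dialog-box approach is shown below. PauseForDebugger is a hypothetical helper; on Windows it uses the Win32 calls MessageBoxA, GetCurrentProcessId and IsDebuggerPresent, and on other platforms it compiles to a no-op so the snippet stays self-contained. Showing the process ID is an addition for convenience when several processes share the same name.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Blocks on a dialog so that a debugger can be attached at this exact point.
// Returns true if a debugger is attached when the dialog is dismissed.
bool PauseForDebugger(bool enabled) {
    if (!enabled) return false;
#ifdef _WIN32
    char text[96];
    std::snprintf(text, sizeof(text),
                  "Attach a debugger to PID %lu, then click OK.",
                  static_cast<unsigned long>(GetCurrentProcessId()));
    MessageBoxA(nullptr, text, "Waiting for debugger", MB_OK);
    return IsDebuggerPresent() != FALSE;
#else
    return false;  // nothing to do outside Windows in this sketch
#endif
}
```

Because the pause happens inside the module of interest, that module is already loaded when the debugger attaches, which sidesteps the on-demand loading problem mentioned above.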

ActiveX:
The attach method is the same. However, IE's multi-process model makes attaching inconvenient.
In IE9 and above, the process model is one main process controlling multiple tab processes; it creates tab processes according to certain rules and assigns work to them. The same page opened twice may be assigned to different processes or to the same process. Within one process there may be multiple instances of the same ActiveX control, and the main thread of each instance is not necessarily the same.
Generally, working with only one tab makes debugging easier. You can also open several tabs, close them, and open them again to test the case where one process hosts multiple instances of the ActiveX control. Further, you can call IWebBrowser2 yourself to simulate more situations.

NP plugin:
This is simple in Chrome: one process per plugin, multiple instances, and a shared main thread.
There are also open source tools that adapt ActiveX controls to NP plugins, making it possible to invoke and debug an ActiveX control in Chrome.

Multi-machine debugging:
As mentioned before, both WinDbg and VS support it.

Deadlock:
A deadlock is generally not something to analyze from production reports (a production deadlock can be detected and reported by some means, but the report is basically useless, because the relevant analysis techniques do not work on production data). For offline problems the live scene is usually available, or you can get a full dump. Generally you use WinDbg, with ~*kb or similar commands, to see what each thread is doing. Focus on calls such as WaitForSingleObjectEx. By parsing the parameters of the call you can go further and work out which thread acquired the dispatcher object and never released it. Alternatively, use !runaway to find the thread that consumes the most CPU and then see what that thread is doing. Threads waiting on each other in a cycle indicate a deadlock; a thread that keeps running in one place is probably in an infinite loop.

Offline deadlock detection can generally be implemented by having a thread send a message to the main thread.
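
The bookkeeping behind such a detector can be sketched in portable C++. HangDetector and its methods are hypothetical names; actually posting the message (for example with PostMessage to the main window) and running the watchdog thread are left to the caller. Time points are passed in explicitly so the logic is easy to test.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>

// Tracks whether a ping posted to the main thread has been handled in time.
class HangDetector {
public:
    using Clock = std::chrono::steady_clock;

    explicit HangDetector(Clock::duration timeout) : timeout_(timeout) {}

    // Watchdog thread: call right after posting the ping message.
    void OnPingPosted(Clock::time_point now) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!pending_) {
            pending_ = true;
            posted_at_ = now;
        }
    }

    // Main thread: call from the handler of the ping message.
    void OnPingHandled() {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_ = false;
    }

    // Watchdog thread: true if a ping has been outstanding for too long.
    bool IsHung(Clock::time_point now) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return pending_ && (now - posted_at_) > timeout_;
    }

private:
    mutable std::mutex mutex_;
    Clock::duration timeout_;
    Clock::time_point posted_at_{};
    bool pending_ = false;
};
```

An unanswered ping only shows that the main thread is not pumping messages; whether that is a deadlock or a long-running loop still has to be decided from the thread stacks.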

Warning: a thread may have acquired the dispatcher object and then died without releasing it.