What makes my process crash?

Source: Internet
Author: User
Original article address: What on earth caused my process to crash?
Release date: Monday, November 28, 2005
Prepared by: Tess

 

The event viewer shows that w3wp.exe has stopped unexpectedly 1000 times, or your process has exited in some other undefined way, and you do not know why.

When a process crashes or exits, a special event called Exit Process (EPR) is triggered. A debugger such as windbg.exe can therefore attach to the process, wait for the EPR or for an exception to be thrown, and take a memory dump. The Debugging Tools for Windows include a VBScript called adplus (http://support.microsoft.com/default.aspx?scid=kb;en-us;286350) that automates this and logs most exceptions that occur during the life of the process.

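Debugging tip!: When you open a dump taken in crash mode, the debugger automatically puts you on the active thread, at the location of the crash. If you have switched to another thread, enter ~ to list all threads; a marker in the list identifies the faulting thread (in WinDbg the current thread is marked with a period and the thread that raised the exception with a number sign).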

If the dump shows only one active thread in the process, and that thread is the main thread, the process was probably killed by something external, such as health monitoring, low system memory, or an IIS restart.

I will go into some of these scenarios in more depth in later posts. I like to work top-down, so this post starts with the common situations and the things to look for in the dump file when you see a managed process exit.

These are the most common scenarios we see in support, in no particular order:


Stack Overflow exception

A stack overflow occurs when the stack memory allocated to a thread is exhausted. 1 MB is allocated by default, so the call stack can get quite deep before that happens. In most cases a stack overflow is caused by infinite recursion: for example, function A calls function B, and function B calls function A, and so on forever with no condition that stops it.
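The A-calls-B-calls-A pattern can be sketched in a few lines. This is only an illustration in Python, not the .NET behavior itself: the Python interpreter enforces its own recursion limit and raises a catchable RecursionError, whereas a real stack overflow in a .NET process typically takes the process down. The function names are made up for the example.

```python
def function_a():
    # No stopping condition: A always calls B ...
    return function_b()


def function_b():
    # ... and B always calls A, so the chain never ends on its own.
    return function_a()


def run_until_stack_is_exhausted():
    """Start the mutual recursion and report how it ended."""
    try:
        function_a()
    except RecursionError as error:
        return type(error).__name__
    return "finished"


result = run_until_stack_is_exhausted()
print(result)
```

In a dump, this pattern shows up as the same pair of frames repeated over and over down the stack.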

Improper use of the exception handling application block is a common source of hidden infinite recursion. Imagine this situation: an exception occurs in the application, and the exception handler logs it to a log file. While logging, another exception occurs (access denied, for example), and you use the same exception handler to handle that one too. Handling that exception throws another exception, and handling that one throws yet another... you get the idea. The point is that you should not use your exception handler to handle exceptions raised inside the exception handling code itself :)

Run kb 2000 (to see the native stack) and !clrstack (from sos.dll, to see the managed stack) to track down where the recursion occurs and why.


Out of memory exception

In most cases, running out of memory is caused by a design problem: too much is stored in the cache or in the session. Used properly, the cache can improve performance a great deal, because looking data up in the cache avoids fetching it from the database. For example, you can set an expiration time on cached data that suits how often it is used. Back in classic ASP, storing objects in the session could sometimes cause problems, and developers should still store only the most necessary things in the session. Storing a large dataset in the session, for example, hurts the application because it reduces the number of concurrent users the site can handle. Once memory usage gets high enough, the process also spends more and more of its time collecting garbage.
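The idea of a cache with an expiration time can be sketched as follows. This is a toy written in Python for illustration, not the ASP.NET cache API; ExpiringCache, load_from_database, and the injected clock are assumptions made up for the example.

```python
import time


class ExpiringCache:
    """Tiny cache whose entries expire after a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, load):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive load
        value = load(key)  # e.g. a database query
        self._entries[key] = (now + self._ttl, value)
        return value


fake_now = [0.0]  # controllable clock so the example does not need to sleep
loads = []


def load_from_database(key):
    # Hypothetical expensive lookup; records each call so we can count them.
    loads.append(key)
    return key.upper()


cache = ExpiringCache(ttl_seconds=60, clock=lambda: fake_now[0])
first = cache.get("products", load_from_database)
second = cache.get("products", load_from_database)  # served from the cache
fake_now[0] = 61.0
third = cache.get("products", load_from_database)  # expired, loaded again
print(len(loads))
```

Note that this sketch only replaces an expired entry when the same key is requested again; a real cache would also evict stale entries to give the memory back.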

Whether to store data in the session or cache has no single answer that fits every situation. The best approach is to make an early estimate of the number of users the application will have, and to decide from that how much each user is allowed to store. Then run a stress test at the maximum number of users to make sure there is no problem. Stress test both with and without the objects stored in the session to see which performs better. The results will differ for different user loads.

Memory problems in a shipped product are very hard to fix because they often require a redesign, so planning ahead saves a lot of work later.
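The early estimate described above is simple arithmetic. A rough sketch in Python, where the 200 MB session budget, the 1000 users, and the 5 MB dataset are all made-up numbers chosen only to show the calculation:

```python
MB = 1024 * 1024


def per_user_session_budget(memory_for_sessions_bytes, expected_concurrent_users):
    """How many bytes each user may store if the budget is shared evenly."""
    if expected_concurrent_users <= 0:
        raise ValueError("expected_concurrent_users must be positive")
    return memory_for_sessions_bytes // expected_concurrent_users


def max_concurrent_users(memory_for_sessions_bytes, session_size_bytes):
    """How many users fit in the budget at a given per-session size."""
    if session_size_bytes <= 0:
        raise ValueError("session_size_bytes must be positive")
    return memory_for_sessions_bytes // session_size_bytes


# Hypothetical figures: 200 MB set aside for session data, 1000 expected users.
budget = per_user_session_budget(200 * MB, 1000)

# The same 200 MB with a 5 MB dataset stored in every session.
users_with_large_dataset = max_concurrent_users(200 * MB, 5 * MB)

print(budget, users_with_large_dataset)
```

With these assumed numbers each user gets roughly 205 KB, while a 5 MB dataset per session caps the site at 40 concurrent users, which is the trade-off the stress test should then confirm or refute.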

Debugging tip!: First run !dumpheap -type System.Web.Caching.Cache to find the root of the cache, then run !objsize on the corresponding address to see how much is stored in the cache. (Note: InProc session state is also stored in the cache.)

For more details about the causes of out of memory exceptions, see my earlier posts.


Unhandled exceptions in COM components

If the application calls a native COM component, an unhandled exception in that component will crash the application. For example, the component may reference memory that has already been freed.


Native Heap Corruption
 

Along with GC holes, this is the most annoying kind of issue. Native heap corruption occurs when code writes to an address it was not supposed to write to. No error is raised when the bad write is executed, and the developer has no idea that the address being written is wrong. In other words, there is already a thief in the house. The stray write may land in a heap, but worse, it may land where code instructions are stored and overwrite them, so the damage goes unnoticed until that code is executed. This happens most often when writing past the boundary of a buffer (or a similar storage area).
 
Read Geoff Gray's article about heap corruption. When a heap is corrupted, the crash usually shows up in the ntdll heap functions. You need to run with GFlags or PageHeap enabled to catch the "thief". Even so, a write to a wrong address is hard to capture, because it happens at arbitrary times and is hard to reproduce.
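A write past the end of a buffer can be demonstrated safely if the "buffer" and its "neighbor" live inside one larger allocation. The sketch below uses Python's ctypes for illustration; the 16-byte block, the 8-byte split, and the NEIGHBOR contents are assumptions invented for the example, and a real heap has allocator metadata between blocks that this toy ignores.

```python
import ctypes

# One 16-byte native allocation standing in for a stretch of heap:
# bytes 0-7 are "our" buffer, bytes 8-15 belong to a neighboring object.
BUFFER_SIZE = 8
heap = (ctypes.c_char * 16).from_buffer_copy(b"........NEIGHBOR")

neighbor_before = heap.raw[BUFFER_SIZE:]

# Unchecked native copy of 12 bytes into the 8-byte buffer. Nothing fails
# here; the write simply runs past the end of the buffer.
ctypes.memmove(heap, b"A" * 12, 12)

neighbor_after = heap.raw[BUFFER_SIZE:]

print(neighbor_before, neighbor_after)
```

The copy itself completes without any complaint; the neighbor is silently damaged, and whoever uses it next is the one that fails.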


Managed Heap Corruption

Managed heap corruption is heap corruption that occurs on the managed heap, and it is hard to catch. The managed heap is corrupted when something writes to memory on the managed heap that it should not write to. Buffer overruns do not normally happen in managed code: if you assign to a byte[] array beyond its bounds, you get an IndexOutOfRange exception. One of the most common causes of managed heap corruption is a function called through PInvoke that is passed a buffer, marshaled in one way or another, that is too small. When the PInvoked function writes to the buffer and goes past its boundary, it writes into the next object on the managed heap. Later the garbage collector runs and tries to walk the managed heap, and the process crashes.

If the crash occurs on a thread whose active stack contains garbage collector functions, look through the PInvoke calls in your code and check whether any of them passes a buffer that is too small.
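Why the crash appears in the garbage collector rather than at the bad write can be shown with a toy heap. This Python simulation is an assumption-laden sketch, not the CLR's real object layout: each object is a made-up 2-byte header (a marker byte plus a length) followed by its payload, and unchecked_write stands in for native code reached through PInvoke.

```python
MAGIC = 0xC1  # hypothetical marker byte that starts every object header


def allocate(heap, payload_size):
    """Append an object (2-byte header + zeroed payload); return payload offset."""
    heap.extend(bytes([MAGIC, payload_size]))
    heap.extend(bytes(payload_size))
    return len(heap) - payload_size


def unchecked_write(heap, offset, data):
    """Stand-in for native code: writes with no per-object bounds check."""
    heap[offset:offset + len(data)] = data


def walk(heap):
    """Visit every object the way a collector might; fail on a bad header."""
    offset, count = 0, 0
    while offset < len(heap):
        if heap[offset] != MAGIC:
            raise RuntimeError("corrupt object header at offset %d" % offset)
        offset += 2 + heap[offset + 1]
        count += 1
    return count


heap = bytearray()
first = allocate(heap, 4)   # 4-byte buffer handed to the "native" function
second = allocate(heap, 4)  # the next object on the heap
objects_before = walk(heap)

# The native side writes 6 bytes into the 4-byte buffer: no error is raised,
# but the last 2 bytes land on the next object's header.
unchecked_write(heap, first, b"\x00" * 6)

try:
    walk(heap)
    outcome = "walk succeeded"
except RuntimeError as error:
    outcome = str(error)

print(objects_before, outcome)
```

The overrun and the failure are separated in time and in code, which is why the stack at the crash points at the collector while the real culprit is the earlier call with the undersized buffer.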


Fatal execution engine exception

Fatal execution engine exceptions are rare, and when one occurs it is usually a bug. It means that, for some reason, execution has ended up in an unexpected code path inside the CLR. The CLR throws a fatal execution engine exception and the process crashes, because the CLR cannot recover from that point. The crash is recorded in the event log as a fatal execution engine exception, and the address listed is the address where the crash occurred. If you hit a fatal execution engine exception and cannot find any technical documentation that covers it, contact technical support. Attaching a crash dump is recommended, because it makes it much easier for the support staff to solve the problem.


GC Hole

This is also very rare. An unmanaged part of the CLR holds a pointer into managed memory but "forgets" to tell the garbage collector about it. The garbage collector therefore does not know that it has to preserve the target or keep track of the pointer. As a result, garbage collection can clean things up at the wrong time, after which the pointer may point anywhere and cause a lot of damage. Yun Jin discusses this briefly here: http://blogs.msdn.com/yunjin/archive/2004/02/08/69906.aspx.

 
