Break Free of Code Deadlocks in Critical Sections Under Windows [from MSDN]

Last Update:2018-12-05 Source: Internet

Released on: 1/13/2005 | Updated on: 1/13/2005

Matt Pietrek and Russ Osterlund

This article assumes that you are familiar with Win32, C++, and multithreading.

Download the code for this article: CriticalSections.exe

Summary

Critical sections, a mechanism that prevents more than one thread at a time from executing a particular section of code, have not received much attention and so tend not to be well understood. A solid understanding of critical sections in Windows comes in handy when you need to track down multithreading performance issues in your code. This article goes under the hood to show how critical sections work and reveals information that is useful for finding deadlocks and identifying performance problems. It also includes a handy utility that shows all of your critical sections and their current state.

In our many years of programming, we have been surprised that Win32 critical sections have not received much "under the hood" attention. Sure, you probably know the basics of initializing and using critical sections, but have you ever taken the time to dig into the CRITICAL_SECTION structure defined in WINNT.H? There are some interesting goodies in this structure that have long been neglected. We will remedy that and show you some tricks that are useful for tracking down subtle multithreading bugs. More importantly, with our MyCriticalSections utility, we will show how a trivial extension to CRITICAL_SECTION can provide very useful features for debugging and performance tuning (to download the complete code, see the link at the top of this article).

To be honest, the authors have often ignored the CRITICAL_SECTION structure because its implementation is so different in the two major Win32 code bases: Microsoft Windows 95 and Windows NT. Both code bases have spawned numerous successors (the latest being Windows Me and Windows XP, respectively), but there is no need to list them all here. The point is that Windows XP is now well established, and developers may soon drop support for the Windows 95 line of operating systems. That is what we have done in this article.

It is true that the Microsoft .NET Framework gets most of the attention these days, but good old Win32 programming is not going away any time soon. If you have existing Win32 code that uses critical sections, you may find our tool and our description of critical sections useful. Be aware, however, that we discuss only Windows NT and its successors, and that nothing here is specific to .NET.

Critical Sections: A Brief Review

If you are intimately familiar with critical sections and can use them without thinking, feel free to skip this section. Otherwise, read on for a quick refresher. The later sections will not make much sense unless you are comfortable with these basics.

A critical section is a lightweight mechanism that allows only one thread at a time to execute a given piece of code. Critical sections are typically used when modifying global data such as collection classes. Events, mutexes, and semaphores are also used for multithreaded synchronization, but unlike them, critical sections do not always perform an expensive control transition into kernel mode. As you will see later, acquiring an unheld critical section requires just a few memory modifications and is very fast. Only an attempt to acquire an already held critical section jumps into kernel mode. The downside of this lightweight behavior is that critical sections can only synchronize threads within the same process.

A critical section is represented by the RTL_CRITICAL_SECTION structure defined in WINNT.H. Because your C++ code typically declares a variable of type CRITICAL_SECTION, you may not be aware of this. Dig around in WINBASE.H and you will find:

typedef RTL_CRITICAL_SECTION CRITICAL_SECTION;

We will get to the guts of the RTL_CRITICAL_SECTION structure shortly. For now, the important point is that a CRITICAL_SECTION (also known as RTL_CRITICAL_SECTION) is just a structure with easily accessible fields that are manipulated by the KERNEL32 APIs.

A critical section begins its life when it is passed to InitializeCriticalSection (or, more accurately, when its address is passed). Once it is initialized, your code passes the critical section to the EnterCriticalSection and LeaveCriticalSection APIs. After one thread returns from EnterCriticalSection, all other threads that call EnterCriticalSection block until the first thread calls LeaveCriticalSection. Finally, when the critical section is no longer needed, good coding practice is to pass it to DeleteCriticalSection.
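The lifecycle above can be sketched as a small C++ wrapper. This is only an illustration, not code from the article's download: the class names CriticalSection and GuardedCounter are hypothetical, and the non-Windows branch substitutes std::recursive_mutex, an assumption made purely so the sketch compiles outside Windows.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
#include <mutex>  // stand-in only; not how Windows implements critical sections
#endif

// Hypothetical wrapper that ties Initialize/Delete to the object's lifetime.
class CriticalSection {
public:
#ifdef _WIN32
    CriticalSection()  { InitializeCriticalSection(&cs_); }
    ~CriticalSection() { DeleteCriticalSection(&cs_); }
    void enter() { EnterCriticalSection(&cs_); }
    void leave() { LeaveCriticalSection(&cs_); }
#else
    CriticalSection() = default;
    void enter() { standIn_.lock(); }
    void leave() { standIn_.unlock(); }
#endif
    CriticalSection(const CriticalSection&) = delete;
    CriticalSection& operator=(const CriticalSection&) = delete;

private:
#ifdef _WIN32
    CRITICAL_SECTION cs_;
#else
    std::recursive_mutex standIn_;
#endif
};

// Hypothetical shared resource guarded by the critical section.
class GuardedCounter {
public:
    void increment() {
        lock_.enter();  // other threads block here until leave() is called
        ++value_;
        lock_.leave();  // every enter must be balanced by a leave
    }

    long value() {
        lock_.enter();
        long copy = value_;
        lock_.leave();
        return copy;
    }

private:
    CriticalSection lock_;
    long value_ = 0;
};
```

A caller would create one GuardedCounter, share it among its threads, and call increment() from each of them; the count stays consistent because only one thread at a time gets past enter().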

In the ideal case of an unheld critical section, a call to EnterCriticalSection is very fast because it only reads and modifies memory locations in user-mode memory. Otherwise (with one exception, described later), threads that block on a critical section do so efficiently, without burning extra CPU cycles. Blocked threads wait in kernel mode and cannot be scheduled until the owner of the critical section releases it. If multiple threads are blocked on a critical section, only one of them acquires it when the owning thread releases it.

Digging Deep: The RTL_CRITICAL_SECTION Structure

Even if you use critical sections in your day-to-day work, it is likely that you have not really looked beyond what is in the documentation. It turns out there is quite a bit more that is easily learned. For instance, it is a little-known fact that a process's critical sections are kept in a linked list and can be enumerated. In fact, WINDBG supports the !locks command, which lists all the critical sections in the target process. The utility we present later also takes advantage of this little-known feature. To really understand how the utility works, you need to come to grips with the internals of critical sections. With that in mind, let's examine the RTL_CRITICAL_SECTION structure. For convenience, the structure is listed here:

struct RTL_CRITICAL_SECTION
{
    PRTL_CRITICAL_SECTION_DEBUG DebugInfo;
    LONG LockCount;
    LONG RecursionCount;
    HANDLE OwningThread;
    HANDLE LockSemaphore;
    ULONG_PTR SpinCount;
};

The following paragraphs describe each field.

DebugInfo: This field contains a pointer to a system-allocated companion structure of type RTL_CRITICAL_SECTION_DEBUG. That structure contains additional valuable information and is also defined in WINNT.H. We will look at it in more detail later.

LockCount: This is the most important field in a critical section. It is initialized to -1; a value of 0 or greater means that the critical section is held. When it is not equal to -1, the OwningThread field (which is incorrectly defined in WINNT.H; it should be a DWORD rather than a HANDLE) contains the ID of the thread that owns the critical section. The difference between this field and the value (RecursionCount - 1) indicates how many additional threads are waiting to acquire the critical section.

RecursionCount: This field contains the number of times that the owning thread has acquired the critical section. If the value is zero, the next thread that attempts to acquire the critical section will succeed.

OwningThread: This field contains the thread identifier of the thread that currently holds the critical section. This is the same thread ID that APIs such as GetCurrentThreadId return.

LockSemaphore: This field is misnamed. It is actually an auto-reset event, not a semaphore. It is a kernel object handle used to signal the operating system that the critical section is now free. The operating system creates the event automatically the first time a thread tries to acquire the critical section but is blocked by another thread that already holds it. You should call DeleteCriticalSection (which issues a CloseHandle call for the event and frees the debug structure, if necessary); otherwise a resource leak occurs.

SpinCount: Used only on multiprocessor systems. The MSDN documentation describes this field as follows: "On multiprocessor systems, if the critical section is unavailable, the calling thread will spin dwSpinCount times before performing a wait operation on a semaphore associated with the critical section. If the critical section becomes free during the spin operation, the calling thread avoids the wait operation." Spin counts can provide better performance on multiprocessor machines because spinning in a loop is often faster than entering a kernel-mode wait. This field defaults to zero, but it can be set to a different value with the InitializeCriticalSectionAndSpinCount API.
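The LockCount arithmetic described above is easy to get wrong by one, so a small model may help. The sketch below assumes the field meanings given in this article; CriticalSectionCounters, isHeld, and waitingThreads are hypothetical names, and the struct is a simplified stand-in rather than the real RTL_CRITICAL_SECTION.

```cpp
// Simplified stand-in for the two counters discussed above. This is NOT the
// real RTL_CRITICAL_SECTION from WINNT.H, only a model for illustration.
struct CriticalSectionCounters {
    long lockCount = -1;      // -1 means "not held", per the article
    long recursionCount = 0;  // number of times the owner has entered
};

// A value of 0 or greater means some thread holds the critical section.
bool isHeld(const CriticalSectionCounters& cs) {
    return cs.lockCount >= 0;
}

// Other threads waiting to acquire: LockCount - (RecursionCount - 1).
long waitingThreads(const CriticalSectionCounters& cs) {
    if (!isHeld(cs)) {
        return 0;  // nobody can be waiting on a free critical section
    }
    return cs.lockCount - (cs.recursionCount - 1);
}
```

With LockCount 0 and RecursionCount 1 the model reports no waiters; with LockCount 3 and RecursionCount 3 it reports one.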

The RTL_CRITICAL_SECTION_DEBUG Structure

Earlier we noted that the DebugInfo field in the RTL_CRITICAL_SECTION structure points to an RTL_CRITICAL_SECTION_DEBUG structure, which is shown here:

struct _RTL_CRITICAL_SECTION_DEBUG
{
    WORD   Type;
    WORD   CreatorBackTraceIndex;
    RTL_CRITICAL_SECTION *CriticalSection;
    LIST_ENTRY ProcessLocksList;
    DWORD EntryCount;
    DWORD ContentionCount;
    DWORD Spare[ 2 ];
};

This structure is allocated and initialized by InitializeCriticalSection. It can be allocated either from a preallocated array in NTDLL or from the process heap. This companion structure to RTL_CRITICAL_SECTION contains a set of paired fields with very different roles: two are hard to figure out, the next two provide the key to understanding the chain of critical section structures, two are set redundantly, and the last two are unused.

The fields of RTL_CRITICAL_SECTION_DEBUG are described below.

Type: This field is unused and is initialized to 0.

CreatorBackTraceIndex: This field is used only in diagnostic situations. It depends on two values, GlobalFlag and StackTraceDatabaseSizeInMb, under the registry key HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\YourProgram. Note that these values appear only after you have run the Gflags commands described later. When the registry values are set properly, the CreatorBackTraceIndex field is filled in with an index value used in a stack trace. Searching the GFlags documentation on MSDN for the phrases "create user mode stack trace database" and "enlarging the user-mode stack trace database" turns up more information on this.

CriticalSection: Points to the RTL_CRITICAL_SECTION associated with this structure. Figure 1 illustrates the basic layout and the relationship between RTL_CRITICAL_SECTION, RTL_CRITICAL_SECTION_DEBUG, and the other players in the chain.

Figure 1 Critical section process flow

ProcessLocksList: LIST_ENTRY is a standard Windows data structure that represents a node in a doubly linked list. RTL_CRITICAL_SECTION_DEBUG contains one link of a list that allows the critical sections to be traversed forward and backward. The utility presented later in this article shows how to use the Flink (forward link) and Blink (backward link) fields to move among the members of the list. Anyone who has worked on device drivers or explored the Windows kernel will find this data structure familiar.

EntryCount/ContentionCount: These fields are incremented at the same time and for the same reason. They hold the number of threads that have entered a wait state because they could not acquire the critical section immediately. Unlike the LockCount and RecursionCount fields, these fields are never decremented.

Spares: These two fields are unused and are not even initialized (although they are zeroed out when the critical section structure is deleted). As shown later, these unused fields can be used to store useful diagnostic values.
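To make the ProcessLocksList traversal concrete, here is a minimal in-process model of walking a LIST_ENTRY chain and recovering the enclosing debug structure from each link. ListEntry and DebugInfoModel are local stand-ins modeled on the definitions above, and fromLink, initHead, insertTail, and walkForward are hypothetical helper names. The sketch assumes everything lives in the current process; a tool that inspects another process would have to read that process's memory instead.

```cpp
#include <cstddef>
#include <vector>

// Local stand-ins modeled on the Windows definitions, declared here so the
// sketch compiles without <windows.h>.
struct ListEntry {
    ListEntry* Flink;  // forward link
    ListEntry* Blink;  // backward link
};

struct DebugInfoModel {
    unsigned short Type;
    unsigned short CreatorBackTraceIndex;
    void*          CriticalSection;
    ListEntry      ProcessLocksList;
    unsigned long  EntryCount;
    unsigned long  ContentionCount;
    unsigned long  Spare[2];
};

// Same idea as the CONTAINING_RECORD macro: step back from the embedded
// link to the start of the structure that contains it.
DebugInfoModel* fromLink(ListEntry* link) {
    return reinterpret_cast<DebugInfoModel*>(
        reinterpret_cast<char*>(link) -
        offsetof(DebugInfoModel, ProcessLocksList));
}

// An empty circular list has a head that points at itself.
void initHead(ListEntry* head) {
    head->Flink = head;
    head->Blink = head;
}

// Append an entry just before the head, that is, at the tail of the list.
void insertTail(ListEntry* head, ListEntry* entry) {
    entry->Flink = head;
    entry->Blink = head->Blink;
    head->Blink->Flink = entry;
    head->Blink = entry;
}

// Follow the Flink pointers until the walk arrives back at the head.
std::vector<DebugInfoModel*> walkForward(ListEntry* head) {
    std::vector<DebugInfoModel*> found;
    for (ListEntry* p = head->Flink; p != head; p = p->Flink) {
        found.push_back(fromLink(p));
    }
    return found;
}
```

Walking backward works the same way with Blink in place of Flink; in either direction the walk ends when it returns to the list head.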

Although RTL_CRITICAL_SECTION_DEBUG contains many fields, it is a necessary component of a regular critical section. In fact, if the system cannot obtain storage for this structure from the process heap, InitializeCriticalSection sets a LastError result of STATUS_NO_MEMORY and returns with the critical section structure in an incomplete state.

Critical Section States

As your program runs, entering and leaving critical sections, the fields in the RTL_CRITICAL_SECTION and RTL_CRITICAL_SECTION_DEBUG structures change according to the state of the critical section. These fields are updated by bookkeeping code in the critical section APIs, as you will see later. The states become more interesting when the program is multithreaded and its threads access a common resource guarded by a critical section.

However, two states appear regardless of how your code uses threads. In the first, the LockCount field has a value other than -1: the critical section is held, and the OwningThread field contains the identifier of the thread that owns it. In a multithreaded program, LockCount in combination with RecursionCount indicates how many threads are currently blocked on the critical section. In the second, RecursionCount is greater than 1, which tells you how many times the owning thread has reacquired the critical section (perhaps needlessly) through calls to EnterCriticalSection or TryEnterCriticalSection. Any value greater than 1 points to possible inefficiency or a latent bug. For example, any of your C++ class methods that access a common resource could be reentering the critical section needlessly.

Note that most of the time the LockCount and RecursionCount fields contain their initial values of -1 and 0, respectively. This is an important point. In fact, for a single-threaded program, you cannot tell whether a critical section was ever acquired just by examining these fields. Multithreaded programs, however, leave behind telltale signs that show whether two or more threads have tried to own the same critical section at the same time.

One such sign is a LockSemaphore field that contains a nonzero value even when the critical section is not held. This indicates that, at some point, one or more threads blocked on this critical section. The event handle is used to signal that the critical section has been freed and that one of the waiting threads can now acquire it and continue executing. Because the OS allocates the event handle automatically when a thread blocks on the critical section, your program may leak resources if you forget to delete the critical section when you no longer need it.

Another state you may encounter in a multithreaded program is one where the EntryCount and ContentionCount fields contain a value greater than zero. These two fields hold the number of times a thread has blocked on the critical section. They are incremented each time this happens and are never decremented during the lifetime of the critical section. The fields can be used to infer the execution path and characteristics of your program. For instance, a very high EntryCount means that the critical section experiences heavy contention and may be a bottleneck in your code.

While investigating a deadlocked program, we also found a state that seemed to defy logical explanation. A heavily used critical section had a LockCount greater than -1, meaning that it was owned by a thread, yet its OwningThread field was zero (so there was no way to find out which thread caused the problem). The test program was multithreaded, and the condition appeared on both uniprocessor and multiprocessor machines. Although LockCount and the other values varied from run to run, the program always deadlocked on the same critical section. We would like to know whether any other developers have run into a sequence of API calls that leads to this state.
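The telltale signs described in this section can be collected into one small check. The sketch below is a model only: Snapshot is a hypothetical copy of the relevant fields taken at one moment, and the flag names and the telltaleSigns function are invented for illustration.

```cpp
#include <cstdint>

// Hypothetical snapshot of the fields discussed above, copied out of a
// critical section and its debug structure at one moment in time.
struct Snapshot {
    long           lockCount;
    long           recursionCount;
    unsigned long  owningThread;   // thread ID, 0 when there is no owner
    std::uintptr_t lockSemaphore;  // event handle value, 0 if never created
    unsigned long  entryCount;
};

enum Sign : unsigned {
    kHeld             = 1u << 0,  // LockCount is not -1
    kReentered        = 1u << 1,  // owner entered more than once
    kEventAllocated   = 1u << 2,  // some thread has blocked at least once
    kPastContention   = 1u << 3,  // EntryCount is greater than zero
    kHeldWithoutOwner = 1u << 4   // the puzzling state: held, but owner is 0
};

// Returns a bitmask of the signs present in the snapshot.
unsigned telltaleSigns(const Snapshot& s) {
    unsigned signs = 0;
    if (s.lockCount != -1)    signs |= kHeld;
    if (s.recursionCount > 1) signs |= kReentered;
    if (s.lockSemaphore != 0) signs |= kEventAllocated;
    if (s.entryCount > 0)     signs |= kPastContention;
    if (s.lockCount != -1 && s.owningThread == 0) signs |= kHeldWithoutOwner;
    return signs;
}
```

A critical section that has never been touched yields 0, while one that is free but has kEventAllocated set was contended at some point in the past.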

Building a Better Mousetrap

As we learned how critical sections work, several key discoveries came together to give us a very useful tool. The first was the presence of the ProcessLocksList LIST_ENTRY field, which suggested that a process's critical sections could be enumerated. Another major discovery was how to find the head of the critical section list. A further insight was that the Spare fields of RTL_CRITICAL_SECTION_DEBUG can be written to without harm (at least in all of our testing). We also found that some of the system's critical section routines can easily be overridden without modifying any source files.

We started with a simple program that examined all the critical sections in a process and listed their current state to see whether any were held and, if so, which thread was the owner and how many threads were blocked on the critical section. This approach was fine for OS enthusiasts, but it was not very useful to the typical programmer who just wants help in understanding their own program.

Even the simplest console-mode "Hello World" program has many critical sections. Most of them are created by system DLLs such as USER32 or GDI32, and they rarely contribute to deadlocks or performance problems. We wanted a way to filter these out and leave only the critical sections that your code cares about. The Spare fields in the RTL_CRITICAL_SECTION_DEBUG structure do the trick nicely. One or both of them can be used to indicate that a critical section comes from user-written code rather than from the OS.

The next logical question is how to determine which critical sections come from code that you wrote. Some readers may remember LIBCTINY.LIB from Matt Pietrek's January 2001 Under the Hood column. One of the tricks LIBCTINY used was a LIB file that overrode the standard implementation of key Visual C++ runtime routines. When LIBCTINY.LIB is placed before the other LIBs on the linker command line, the linker uses its implementation rather than the later version of the same name in the import libraries supplied by Microsoft.

To apply a similar trick to critical sections, we created a replacement version of InitializeCriticalSection along with its associated import library. When this LIB file is placed before KERNEL32.LIB, the linker links against our version rather than the version in KERNEL32. Our implementation of InitializeCriticalSection is shown in Figure 2. The code is conceptually very simple. It first calls the real InitializeCriticalSection in KERNEL32.DLL. Next, it obtains the address of the code that called InitializeCriticalSection and stores it in one of the Spare fields of the RTL_CRITICAL_SECTION_DEBUG structure. How do we determine the address of the calling code? The x86 CALL instruction places the return address on the stack, and the CriticalSectionHelper code knows that the return address sits at a known, fixed location in the stack frame.

The net effect is that any EXE or DLL that links properly against CriticalSectionHelper.lib imports our DLL (CriticalSectionHelper.dll) and has the Spare field of its critical sections put to use. This makes things much easier. Our utility can now simply walk all the critical sections in the process and show information only for those whose Spare field is filled in properly. So what does this utility cost? But wait, there's more!
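The tagging idea can be modeled in a few lines. The sketch below is not the Figure 2 code: it uses compiler intrinsics (_ReturnAddress on Visual C++, __builtin_return_address on GCC and Clang) instead of reading the stack frame directly, and the structures and function names are hypothetical stand-ins. The Spare fields are widened to pointer size here, an assumption made so the model also works in 64-bit builds.

```cpp
#include <cstdint>

#if defined(_MSC_VER)
#include <intrin.h>
#define NOINLINE __declspec(noinline)
#define CALLER_ADDRESS() _ReturnAddress()
#else
#define NOINLINE __attribute__((noinline))
#define CALLER_ADDRESS() __builtin_return_address(0)
#endif

// Stand-ins for the real structures; only the fields this sketch needs.
struct ModelDebugInfo {
    std::uintptr_t Spare[2];  // pointer-sized here (assumption, see above)
};

struct ModelCriticalSection {
    ModelDebugInfo* DebugInfo;
    long            LockCount;
};

// Plays the role of the real InitializeCriticalSection in this model.
void realInitialize(ModelCriticalSection* cs, ModelDebugInfo* dbg) {
    dbg->Spare[0] = 0;
    dbg->Spare[1] = 0;
    cs->DebugInfo = dbg;
    cs->LockCount = -1;
}

// Replacement initializer: call the real one, then record who called us.
// It must not be inlined, or the recorded address would not be the caller's.
NOINLINE void taggedInitialize(ModelCriticalSection* cs, ModelDebugInfo* dbg) {
    realInitialize(cs, dbg);
    cs->DebugInfo->Spare[0] =
        reinterpret_cast<std::uintptr_t>(CALLER_ADDRESS());
}

// A viewer can keep only critical sections that carry a recorded address.
bool isTagged(const ModelCriticalSection& cs) {
    return cs.DebugInfo != nullptr && cs.DebugInfo->Spare[0] != 0;
}
```

A critical section set up through realInitialize alone is skipped by isTagged, which mirrors how system-created critical sections would be filtered out of the listing.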

Because all of your critical sections now contain the address at which they were initialized, a utility can identify each critical section by its initialization address. A raw code address is not very helpful by itself. Fortunately, DBGHELP.DLL makes it easy to translate a code address into a source file, line number, and function name. Even if a critical section does not carry your signature, you can submit its address to DBGHELP.DLL. If the critical section was declared as a global variable and symbols are available, you can determine its name as it appears in the original source code. As an aside, if you set the _NT_SYMBOL_PATH environment variable so that DbgHelp uses its symbol server download capability, DbgHelp will work its magic.

The MyCriticalSections Utility

We put all of these ideas together in the MyCriticalSections program. MyCriticalSections is a command-line program; running it without arguments shows the available options:

Syntax: MyCriticalSections <PID> [options]
        Options:
        /a = all critical sections
        /e = show only entered critical sections
        /v = verbose

The only required argument is a process ID, or PID (in decimal form). There are many ways to get a PID, but the easiest is probably Task Manager. With no other options, MyCriticalSections lists the state of all critical sections from code modules that you have linked with CriticalSectionHelper.DLL. If symbols are available for the module or modules, the program attempts to show the name of each critical section and the location where it was initialized.

To see MyCriticalSections in action, run the Demo.EXE program included in the download. Demo.EXE simply initializes two critical sections and has its threads enter them. Figure 3 shows the result of running "MyCriticalSections 2040" (where 2040 is the PID of Demo.EXE).

The figure lists two critical sections, named csMain and yetAnotherCriticalSection in this case. Each "Address:" line shows the address and name of the CRITICAL_SECTION. The "Initialized in" line contains the name of the function in which the CRITICAL_SECTION was initialized. The "Initialized at" line shows the source file and line number within the initializing function.

For the csMain critical section, the lock count is 0 and the recursion count is 1, which indicates a critical section that has been acquired by one thread and that has no other threads waiting on it. Because no thread has ever blocked on this critical section, the Entry Count field is 0.

Now look at yetAnotherCriticalSection, which has a recursion count of 3. A quick look at the Demo code shows that the main thread calls EnterCriticalSection three times, so this is as expected. However, a second thread also tries to acquire the critical section and is blocked. Accordingly, the LockCount field is also 3, and the output shows one waiting thread.

MyCriticalSections has a few options that make it useful to the more adventurous explorer. The /v switch shows more information for each critical section. The spin count and lock semaphore fields are of particular interest. You will often see that NTDLL and other DLLs have critical sections with a nonzero spin count. The lock semaphore field is nonzero if a thread has ever blocked while acquiring the critical section. The /v switch also shows the contents of the Spare fields in the RTL_CRITICAL_SECTION_DEBUG structure.

The /a switch shows all critical sections in the process, even those without the CriticalSectionHelper.DLL signature. If you use /a, be prepared for a lot of output. True hackers will want to use /a and /v together to show the maximum detail for everything in the process. One minor benefit of /a is that you will see the LdrpLoaderLock critical section in NTDLL. This critical section is held during DllMain calls and at other important times. LdrpLoaderLock contributes to many seemingly inexplicable deadlocks. (For MyCriticalSections to label the LdrpLoaderLock instance correctly, the PDB file for NTDLL must be available.)

The /e switch makes the program show only the critical sections that are currently held. Without the /a switch, only held critical sections from your code (as indicated by the signature in the Spare field) are shown. With the /a switch, all held critical sections in the process are shown, regardless of their origin.

So when would you want to run MyCriticalSections? One obvious time is when your program has deadlocked. Check the held critical sections to see whether anything surprises you. You can use MyCriticalSections even when the deadlocked program is running under the control of a debugger.

Another use for MyCriticalSections is performance tuning of heavily multithreaded programs. While stopped in the debugger inside a heavily used, non-reentrant function, run MyCriticalSections to see which critical sections are held at that moment. If many threads perform the same task, it is easy to end up with threads that spend most of their time waiting for a heavily used critical section. If there are several heavily used critical sections, the effect resembles a kink in a garden hose: fixing one contention problem merely moves the problem to the next critical section that tends to block.

A good way to see which critical sections cause the most contention is to set a breakpoint near the very end of your program. When the breakpoint is hit, run MyCriticalSections and look for the critical sections with the largest Entry Count values. These are the critical sections that cause the most blocking and thread transitions.

Although MyCriticalSections runs on Windows 2000 and later, it needs a fairly recent version of DbgHelp.DLL: version 5.1 or later. This version ships with Windows XP. You can also get it from other tools that use DbgHelp. For instance, the Debugging Tools for Windows download usually includes the latest DbgHelp.DLL.

Drilling into the Important Critical Section Routines

This final section is for brave readers who want to understand the guts of the critical section implementation. A careful study of NTDLL allowed us to create pseudocode for these routines and their supporting subroutines (see NTDLL(CriticalSections).cpp in the download). The following KERNEL32 APIs make up the public interface to critical sections:

InitializeCriticalSection
InitializeCriticalSectionAndSpinCount
DeleteCriticalSection
TryEnterCriticalSection
EnterCriticalSection
LeaveCriticalSection

The first two APIs are merely thin wrappers around the NTDLL APIs RtlInitializeCriticalSection and RtlInitializeCriticalSectionAndSpinCount, respectively. All of the remaining routines are forwarded to functions in NTDLL. Furthermore, RtlInitializeCriticalSection is itself a thin wrapper around a call to RtlInitializeCriticalSectionAndSpinCount with a spin count of 0. So when you use critical sections, you are really using the following NTDLL APIs behind the scenes:

RtlInitializeCriticalSectionAndSpinCount
RtlEnterCriticalSection
RtlTryEnterCriticalSection
RtlLeaveCriticalSection
RtlDeleteCriticalSection

In this discussion we use the Kernel32 names, because most Win32 programmers are more familiar with them.

Initializing a critical section with InitializeCriticalSectionAndSpinCount is straightforward. The fields in the RTL_CRITICAL_SECTION structure are given their starting values. Likewise, the RTL_CRITICAL_SECTION_DEBUG structure is allocated and initialized, CreatorBackTraceIndex is assigned the return value of a call to RtlLogStackBackTraces, and the links to the preceding critical sections are established.

By the way, CreatorBackTraceIndex normally receives a value of 0. However, if you have the Gflags and Umdh utilities, you can enter the following commands:

Gflags /i MyProgram.exe +ust
Gflags /i MyProgram.exe /tracedb 24

These commands add registry entries under "Image File Execution Options" for MyProgram. The next time MyProgram runs, you will see this field receive a nonzero value. For more information, see Knowledge Base article Q268343, "Umdhtools.exe: How to Use Umdh.exe to Find Memory Leaks." One other thing to note about critical section initialization is that the first 64 RTL_CRITICAL_SECTION_DEBUG structures are not allocated from the process heap. Instead, they come from an array in the .data section of NTDLL.

When you have finished with a critical section, the call to DeleteCriticalSection (something of a misnomer, because it deletes only the RTL_CRITICAL_SECTION_DEBUG) follows an equally understandable path. If an event was created because a thread blocked while trying to acquire the critical section, it is destroyed with a call to ZwClose. Next, after gaining protection through RtlCriticalSectionLock (NTDLL protects its own internal list of critical sections with, you guessed it, a critical section), the debug information is removed from the chain, the linked list of critical sections is updated to reflect the removal, and the memory is filled with zeros. If the storage came from the process heap, it is freed with a call to RtlFreeHeap. Finally, the RTL_CRITICAL_SECTION itself is filled with zeros.

There are two APIs for acquiring a resource guarded by a critical section: TryEnterCriticalSection and EnterCriticalSection. If a thread needs to enter a critical section but can do useful work while waiting for the blocked resource to become available, TryEnterCriticalSection is the API to use. This routine tests whether the critical section is available. If the critical section is held, the routine returns FALSE, giving the thread a chance to continue with another task. Otherwise, it behaves just like EnterCriticalSection.
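The try-enter behavior can be illustrated with a small model built on std::atomic. This is a sketch under stated assumptions, not the NTDLL implementation: ModelCriticalSection and tryEnter are hypothetical names, thread IDs are passed in explicitly so the example is deterministic, and a compare-and-swap stands in for whatever instruction sequence the real routine uses.

```cpp
#include <atomic>

// Simplified model of the fields involved; not the real structure.
struct ModelCriticalSection {
    std::atomic<long>          lockCount{-1};    // -1 means free
    std::atomic<unsigned long> owningThread{0};  // 0 means no owner
    long                       recursionCount = 0;  // touched only by owner
};

// Returns true if the caller now owns the section, false if another thread
// holds it. The caller never waits.
bool tryEnter(ModelCriticalSection& cs, unsigned long threadId) {
    long expected = -1;
    // Free: claim it by moving LockCount from -1 to 0 in one atomic step.
    if (cs.lockCount.compare_exchange_strong(expected, 0)) {
        cs.owningThread.store(threadId);
        cs.recursionCount = 1;
        return true;
    }
    // Already ours: a nested acquisition just bumps both counters.
    if (cs.owningThread.load() == threadId) {
        cs.lockCount.fetch_add(1);
        ++cs.recursionCount;
        return true;
    }
    // Held by someone else: report failure so the caller can do other work.
    return false;
}
```

A caller that gets false back can do something useful and retry later, which is the whole point of the try variant.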

If the thread really must own the resource before continuing, use EnterCriticalSection. For the moment, set aside the spin count test used on multiprocessor machines. Like TryEnterCriticalSection, this routine adjusts the bookkeeping for the critical section if it is free or already owned by the calling thread. Note that the all-important increment of LockCount is performed with the x86 "lock" prefix. This ensures that only one CPU at a time can modify the LockCount field. (In fact, the Win32 InterlockedIncrement API is just an ADD instruction with the same lock prefix.)

If the calling thread cannot acquire the critical section immediately, RtlpWaitForCriticalSection is called to put the thread into a wait state. On multiprocessor systems, EnterCriticalSection first spins the number of times given by SpinCount, testing the availability of the critical section on each pass through the loop. If the critical section becomes free during the loop, the thread acquires it and continues.
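The spin-then-wait decision can be sketched independently of any real lock. In the model below, EnterOutcome and enterWithSpin are hypothetical names, and the availability test is passed in as a callable so the behavior is easy to observe; the real routine would go on to block in kernel mode where this sketch returns MustWait.

```cpp
// Possible results of an attempt to enter, in this simplified model.
enum class EnterOutcome { AcquiredAtOnce, AcquiredWhileSpinning, MustWait };

// tryAcquire is any callable that returns true once the section is obtained.
template <typename TryAcquire>
EnterOutcome enterWithSpin(unsigned long spinCount, TryAcquire tryAcquire) {
    // Fast path: the section was free on the first attempt.
    if (tryAcquire()) {
        return EnterOutcome::AcquiredAtOnce;
    }
    // Spin: retry up to spinCount times before giving up the CPU.
    for (unsigned long i = 0; i < spinCount; ++i) {
        if (tryAcquire()) {
            return EnterOutcome::AcquiredWhileSpinning;
        }
    }
    // Still held: the real code would now wait in kernel mode.
    return EnterOutcome::MustWait;
}
```

With a spin count of 0, the default, the loop body never runs and a held critical section leads straight to the wait.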

RtlpWaitForCriticalSection is perhaps the most complex and most important of the routines presented here. This is not surprising: if there is a deadlock and it involves critical sections, breaking into the process with a debugger will likely show at least one thread inside the ZwWaitForSingleObject call in RtlpWaitForCriticalSection.

As the pseudocode shows, RtlpWaitForCriticalSection does a little bookkeeping, such as incrementing the EntryCount and ContentionCount fields. More importantly, it issues the wait on LockSemaphore and handles the outcome of that wait. By default, a null pointer is passed as the third parameter to ZwWaitForSingleObject, which requests that the wait never time out. If a timeout is allowed and occurs, a debug message string is generated and the wait starts again. If the wait does not return successfully, an error is raised that terminates the process. Finally, on a successful return from ZwWaitForSingleObject, execution returns from RtlpWaitForCriticalSection and the thread now owns the critical section.

One critical condition that RtlpWaitForCriticalSection must recognize is that the process is shutting down and the critical section being waited on is the loader lock (LdrpLoaderLock). In that case RtlpWaitForCriticalSection must not allow the thread to block; it must skip the wait and let the shutdown continue.

LeaveCriticalSection is not as complex as EnterCriticalSection. If the result of decrementing RecursionCount is not 0 (meaning that the thread still owns the critical section), the routine returns with ERROR_SUCCESS status. This is why you need to balance every Enter call with a matching Leave call. If the count is 0, the OwningThread field is cleared and LockCount is decremented. If other threads are waiting, that is, if LockCount is still greater than or equal to 0, RtlpUnWaitCriticalSection is called. This helper routine creates the LockSemaphore (if it does not already exist) and releases it to signal the operating system that the thread has released the critical section. As part of this notification, one of the waiting threads leaves the wait state and is made ready to run.
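The release path can be modeled the same way. The sketch below is a hypothetical, simplified model, not the NTDLL code: `LeaveModelCs` and `ModelLeave` are illustrative names, and the `SemaphoreReleases` counter stands in for signaling LockSemaphore. It also assumes that a nested leave undoes the LockCount increment made by its matching enter, so that recursive enters and leaves balance.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Simplified model; field names follow the article.
// SemaphoreReleases is a hypothetical stand-in for signaling LockSemaphore.
struct LeaveModelCs {
    std::atomic<long> LockCount{-1};  // -1 means free, no waiters
    long RecursionCount = 0;
    std::uintptr_t OwningThread = 0;  // 0 means no owner
    int SemaphoreReleases = 0;
};

// Must be called by the owning thread.
// Returns true when this call fully releases the section.
bool ModelLeave(LeaveModelCs& cs) {
    assert(cs.RecursionCount > 0);
    if (--cs.RecursionCount != 0) {
        // Still owned. Assumption: undo the increment made by the nested enter.
        cs.LockCount.fetch_sub(1);
        return false;
    }
    cs.OwningThread = 0;
    long remaining = cs.LockCount.fetch_sub(1) - 1;
    if (remaining >= 0) {
        // Someone is waiting; the real code calls RtlpUnWaitCriticalSection.
        ++cs.SemaphoreReleases;
    }
    return true;
}
```

An unbalanced Enter leaves RecursionCount above 0, so the section is never handed to a waiter; this is one of the ways a deadlock on a critical section can arise.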

Finally, how does the MyCriticalSections program determine where the chain of critical sections starts? If you have access to the correct debugging symbols for NTDLL, finding and walking the list is easy. First, locate the RtlCriticalSectionList symbol, dereference its contents (it points to the first RTL_CRITICAL_SECTION_DEBUG structure), and start walking. However, not all systems have debugging symbols, and the address of the RtlCriticalSectionList variable varies between versions of Windows. To provide a solution that works on all versions, we devised the following heuristic. If you observe the steps taken to start a process, you will see that the critical sections in NTDLL are initialized in the following order (these names are taken from the NTDLL debug symbols):

RtlCriticalSectionLock
DeferedCriticalSection (this is the actual spelling!)
LoaderLock
FastPebLock
RtlpCalloutEntryLock
PMCritSect
UMLogCritSect
RtlpProcessHeapsListLock

Because the loader lock can be found by examining the address at offset 0xA0 in the process environment block (PEB), locating the start of the chain becomes relatively easy. We read the debug information for the loader lock and walk back two links along the chain to reach the RtlCriticalSectionLock item, which is the first critical section in the chain. For an illustration of the method, see Figure 4.

Figure 4 initialization sequence
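The walk back from the loader lock can be sketched on an in-memory list. This is a hypothetical model: `ListEntry` and `DebugInfo` only imitate the doubly linked layout of the debug structures, `Link`, `FromLink`, and `WalkBack` are illustrative helpers, and reading the target process's memory (which the real utility must do) is not shown.

```cpp
#include <cassert>
#include <cstddef>

// Minimal doubly linked list node, in the style of LIST_ENTRY.
struct ListEntry {
    ListEntry* Flink;  // next entry
    ListEntry* Blink;  // previous entry
};

// Hypothetical stand-in for the per-critical-section debug record.
struct DebugInfo {
    const char* Name;
    ListEntry ProcessLocksList;  // links this record into the chain
};

// Connect two records so that next follows prev in the chain.
void Link(DebugInfo& prev, DebugInfo& next) {
    prev.ProcessLocksList.Flink = &next.ProcessLocksList;
    next.ProcessLocksList.Blink = &prev.ProcessLocksList;
}

// Recover the record that contains a given list entry.
DebugInfo* FromLink(ListEntry* entry) {
    char* base = reinterpret_cast<char*>(entry)
                 - offsetof(DebugInfo, ProcessLocksList);
    return reinterpret_cast<DebugInfo*>(base);
}

// Step backwards the given number of links, stopping early at the list head.
DebugInfo* WalkBack(DebugInfo* start, int links) {
    ListEntry* entry = &start->ProcessLocksList;
    for (int i = 0; i < links && entry->Blink != nullptr; ++i) {
        entry = entry->Blink;
    }
    return FromLink(entry);
}
```

Starting from the record for LoaderLock, the third section initialized, two backward steps land on the record for RtlCriticalSectionLock.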

Summary

Almost all multithreaded programs use critical sections. Sooner or later you will run into a critical section that deadlocks your code, leaving you wondering how it got into that state. If you understand how critical sections work, this situation will not be as frustrating as it first appears. You can examine a seemingly opaque critical section and determine who owns it, along with other useful details. If you are willing to add our library to your linker line, you can easily obtain a wealth of information about your program's use of critical sections. By making use of an unused field in the critical section structure, our code can isolate and name just the critical sections used by your modules and report their exact state.

Adventurous readers can easily extend our code to do even more. For example, you could intercept EnterCriticalSection and LeaveCriticalSection in the same way as the InitializeCriticalSection hook, and record where each critical section was last successfully acquired and released. Likewise, the CritSect DLL has an easy-to-call API for enumerating critical sections from your own code. Using Windows Forms in the .NET Framework, you could fairly easily create a GUI version of MyCriticalSections. There are many ways to extend our code, and we would be delighted to see the innovative approaches that others discover and create.

For more information, see:
Global Flag Reference: Create Kernel Mode Stack Trace Database
GFlags Examples: Enlarging the User-Mode Stack Trace Database
Under the Hood: Reduce EXE and DLL Size with LIBCTINY.LIB

Matt Pietrek is a software architect and author. He works at Compuware/NuMega Labs, where he is the lead architect of the BoundsChecker and Distributed Analyzer products. He has written three books on Windows programming and is a contributing editor to MSDN Magazine. His Web site (http://www.wheaty.net) has FAQs and information about previous articles and columns.

Russ Osterlund is a software engineer in the BoundsChecker group at Compuware/NuMega Labs. He, his wife, and their cats are new residents of the state of New York. He can be reached at RussOsterlund@adelphia.net or through his Web site, http://www.smidgeonsoft.com.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
