A bug caused by loader lock

Source: Internet
Author: User

In Windows, one way to modularize a program is to implement parts of it as dynamic link libraries (DLLs) and load them implicitly or explicitly when the main program starts. However, if a DLL's DllMain function is written carelessly, unexpected bugs can occur, the typical one being a loader lock deadlock. Sure enough, we ran into a bug in our product caused by the loader lock.


1. Background

When the main program starts and loads a DLL, implicitly or explicitly, the system calls the DLL's DllMain. When a thread is created, DllMain is also called implicitly during thread startup. To serialize these DllMain calls across threads, Windows holds a lock, called the loader lock, while it calls DllMain. This lock is process-wide.

For example, when the program calls LoadLibrary to load a DLL for the first time, the loader takes the loader lock, calls the DLL's DllMain with DLL_PROCESS_ATTACH, and releases the lock only after DllMain returns.

Because of this hidden loader lock, you need to be very careful when writing DllMain. For example, here is the deadlock from section 2.5 of the Windows core programming book:

    BOOL WINAPI DllMain(HINSTANCE hInstDll, DWORD fdwReason, PVOID fImpLoad)
    {
        HANDLE hThread;
        DWORD dwThreadId;

        switch (fdwReason)
        {
        case DLL_PROCESS_ATTACH:
            // The DLL is being mapped into the process' address space
            hThread = CreateThread(NULL, 0, SomeFunction, NULL, 0, &dwThreadId);
            // Waits while holding the loader lock: this is where the deadlock occurs
            WaitForSingleObject(hThread, INFINITE);
            CloseHandle(hThread);
            break;

        case DLL_THREAD_ATTACH:
            // A thread is being created
            break;

        case DLL_THREAD_DETACH:
            // A thread is exiting cleanly
            break;

        case DLL_PROCESS_DETACH:
            // The DLL is being unmapped from the process' address space
            break;
        }
        return TRUE;
    }

In this example, when DllMain receives the DLL_PROCESS_ATTACH notification, it creates a new thread. Before the new thread runs its thread function, the system has it call DllMain with DLL_THREAD_ATTACH, which requires the loader lock. The first thread is still inside DllMain, waiting for the new thread to finish, and it holds the loader lock. The new thread therefore waits for the loader lock forever, and the result is a deadlock.
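The deadlock can be reproduced without Windows by modeling the loader lock as an ordinary mutex. The following is only a sketch under stated assumptions: it uses POSIX threads instead of the Win32 API, `loader_lock`, `worker`, and `simulate_load` are hypothetical names, and the INFINITE wait is replaced by a short timeout so that the program terminates. It is not how the Windows loader is implemented.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Stand-in for the process-wide loader lock (an assumption of this model). */
static pthread_mutex_t loader_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_cond = PTHREAD_COND_INITIALIZER;
static bool worker_done = false;

/* Models a new thread: it must pass through the loader lock
 * (the DLL_THREAD_ATTACH step) before its real work can start. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&loader_lock);
    pthread_mutex_unlock(&loader_lock);

    pthread_mutex_lock(&state_lock);
    worker_done = true;                 /* the thread's work is finished */
    pthread_cond_signal(&state_cond);
    pthread_mutex_unlock(&state_lock);
    return NULL;
}

/* Waits up to timeout_ms for the worker; returns true if it finished. */
static bool wait_for_worker(int timeout_ms)
{
    struct timespec deadline;
    int rc = 0;
    bool done;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&state_lock);
    while (!worker_done && rc == 0)
        rc = pthread_cond_timedwait(&state_cond, &state_lock, &deadline);
    done = worker_done;
    pthread_mutex_unlock(&state_lock);
    return done;
}

/* Models one DLL load. If wait_inside_dllmain is true, the "DllMain"
 * waits for the worker while still holding the loader lock. */
bool simulate_load(bool wait_inside_dllmain, int timeout_ms)
{
    pthread_t thread;
    bool finished;

    worker_done = false;
    pthread_mutex_lock(&loader_lock);            /* loader enters DllMain */
    if (pthread_create(&thread, NULL, worker, NULL) != 0) {
        pthread_mutex_unlock(&loader_lock);
        return false;
    }
    if (wait_inside_dllmain) {
        finished = wait_for_worker(timeout_ms);  /* worker is blocked */
        pthread_mutex_unlock(&loader_lock);
    } else {
        pthread_mutex_unlock(&loader_lock);      /* DllMain returns first */
        finished = wait_for_worker(timeout_ms);
    }
    pthread_join(thread, NULL);
    return finished;
}
```

In this model, `simulate_load(true, 200)` times out because the worker cannot get past the lock while the caller holds it. `simulate_load(false, 2000)` succeeds because the wait happens only after the lock has been released.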

2. Analyzing the problem with WinDbg

The background above shows that the loader lock can cause hidden bugs, so DllMain must be written carefully. In a real product the problem is more complex than the example above. Below is a simplified version of the logic that caused the problem in our product:


The product runs as a Windows service. At startup it loads A.DLL, and A.DLL's DllMain creates a thread, thread2, which cleans up the log when it receives a log cleanup event. The service then loads B.DLL. B.DLL's DllMain checks the log file, and if the file is larger than 10 MB, it notifies thread2 to clean up the log and waits for thread2 to finish (for up to 5 minutes). As a result, when the log is larger than 10 MB, the service may time out during startup.

So I attached WinDbg to the hung main process, set up the product's symbols, and first checked which locks were held:

    0:019> !locks

    CritSec ntdll!LdrpLoaderLock+0 at 0000000077d17490
    WaiterWoken        No
    LockCount          12
    RecursionCount     1
    OwningThread       cb0
    EntryCount         0
    ContentionCount    d
    *** Locked

The lock is held by thread cb0 (hexadecimal), and LockCount shows that many threads are requesting the loader lock. First, use the !threads command to find the index of thread cb0, the thread that holds the loader lock. It is 5 (only six threads are listed below; there are actually dozens):

    0:019> !threads
    Index  TID               TEB                 StackBase           StackLimit          DeAlloc             StackSize           ThreadProc
    0      0000000000000d4c  0x000007fffffdd000  0x0000000000130000  0x0000000000126000  0x0000000000030000  0x000000000000a000  0x0
    1      0000000000000fc0  0x000007fffffdb000  0x0000000002490000  0x000000000248e000  0x0000000002390000  0x0000000000002000  0x0
    2      0000000000000968  0x000007fffffae000  0x0000000002cc0000  0x0000000002cbe000  0x0000000002bc0000  0x0000000000002000  0x0
    3      0000000000000914  0x000007fffffac000  0x0000000002dc0000  0x0000000002dbe000  0x0000000002cc0000  0x0000000000002000  0x0
    4      0000000000000de4  0x000007fffffaa000  0x0000000002ec0000  0x0000000002ebc000  0x0000000002dc0000  0x0000000000004000  0x0
    5      0000000000000cb0  0x000007fffffa8000  0x0000000002fc0000  0x0000000002f9a000  0x0000000002ec0000  0x0000000000026000  0x0

Then check the call stack of thread cb0. It hangs in the DB_xxxxxxxxx function of the xmodule3 module. This is the function mentioned earlier that notifies the log thread to clean up and waits for the cleanup to complete (up to 5 minutes). The thread is waiting there, and the ntdll!LdrpRunInitializeRoutines frame further down shows that the wait happens during DLL initialization, that is, while the loader lock is held.

    0:019> ~5kv
    Child-SP          RetAddr           : Args to Child                                                           : Call Site
    00000000`02fbd558 000007fe`fdd81203 : 00000000`02fbd618 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!NtDelayExecution+0xa
    00000000`02fbd560 00000000`63151a35 : 00000000`00000008 00000000`00000000 00000000`00000000 00000000`00000000 : KERNELBASE!SleepEx+0xab
    00000000`02fbd600 00000000`6327299d : 00000000`00000000 00000000`00000000 00000000`00000010 00000000`002e2770 : xmodule3!DB_xxxxxxxxx+0x105
    00000000`02fbd650 00000000`007fab85 : 00000000`00000001 00000000`00000004 00000000`00000268 00000000`02fbe3a8 : xmodule2!LM_xxxxx+0x18d
    00000000`02fbe3e0 00000000`0082848d : 00000000`00000001 00000000`00000001 00000000`00000000 000012eb`e9b70b34 : xmodule1!ENG_xx+0x605
    00000000`02fbee10 00000000`77c1b108 : 00000000`002cbb00 00000000`00000000 00000000`00000000 00000000`00297bf4 : xmodule1!ENG_xxx+0x2065d
    00000000`02fbee50 00000000`77c0787a : 00000000`00000000 00000000`002cbb00 00000000`02fbef60 00000000`00000000 : ntdll!LdrpRunInitializeRoutines+0x1fe
    00000000`02fbf020 00000000`77c07b5e : 00000000`00000000 00000000`0012fc38 00000000`02fbf2c0 000007fe`fdd8da2d : ntdll!LdrpLoadDll+0x231
    00000000`02fbf230 000007fe`fdd89059 : 00000000`00000000 00000000`00000000 00000000`0012fc38 00000000`00000046 : ntdll!LdrLoadDll+0x9a
    00000000`02fbf2a0 00000001`40003b05 : 00000000`00000000 00000000`0012fc38 00000001`4000e3d8 00000000`00000000 : KERNELBASE!LoadLibraryExW+0x22e
    00000000`02fbf310 00000000`757237d7 : 00000000`0096d840 00000000`0096d840 00000000`00000000 00000000`00000000 : SpntSvc+0x3b05
    00000000`02fbff00 00000000`75723894 : 00000000`757d95c0 00000000`0096d840 00000000`00000000 00000000`00000000 : MSVCR80!endthreadex+0x47
    00000000`02fbff30 00000000`779d652d : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : MSVCR80!endthreadex+0x104
    00000000`02fbff60 00000000`77c0c541 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0xd
    00000000`02fbff90 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x1d

The stack shows that thread cb0 is waiting for the log cleanup thread to finish. So what is the log cleanup thread doing? The product log recorded the handle of the log cleanup thread as 17c (hexadecimal). Looking up the handle gives the thread ID 5fc.890:

    0:019> !handle 17c f
    Handle 17c
      Type          Thread
      Attributes    0
      GrantedAccess 0x1fffff:
             Delete,ReadControl,WriteDac,WriteOwner,Synch
             Terminate,Suspend,Alert,GetContext,SetContext,SetInfo,QueryInfo,SetToken,Impersonate,DirectImpersonate
      HandleCount   4
      PointerCount  6
      Name          <none>
      Object Specific Information
        Thread Id   5fc.890
        Priority    10
        Base Priority 0
        Start Address 75723810 MSVCR80!endthreadex

View the call stack of the log cleanup thread in the same way as before. The parameter 00000000`77d17490 passed to ntdll!RtlpWaitOnCriticalSection is exactly the address of the loader lock. The log cleanup thread is stuck in ntdll!LdrpInitializeThread waiting for the loader lock held by thread cb0, while thread cb0 is waiting for the log cleanup thread. At last, the cause is clear.

    0:019> ~6kv
    Child-SP          RetAddr           : Args to Child                                                           : Call Site
    00000000`0321f858 00000000`77c2e518 : 00000000`00000000 00000000`00000194 000007ff`fffa62c8 00000000`77c0c4fa : ntdll!ZwWaitForSingleObject+0xa
    00000000`0321f860 00000000`77c2e40b : 00000000`00000001 000007ff`fffdf000 00000000`77be0000 00000000`77d17490 : ntdll!RtlpWaitOnCriticalSection+0xe8
    00000000`0321f910 00000000`77c0c5dd : 00000000`00000000 000007ff`fffa6000 000007ff`fffa62c8 00000000`00000000 : ntdll!RtlEnterCriticalSection+0xd1
    00000000`0321f940 00000000`77c0c44f : 000007ff`fffdf000 00000000`00000000 000007ff`fffa6000 00000000`00000000 : ntdll!LdrpInitializeThread+0x8d
    00000000`0321fa40 00000000`77c0c34e : 00000000`0321fb00 00000000`00000000 000007ff`fffdf000 00000000`00000000 : ntdll!LdrpInitialize+0x9f
    00000000`0321fab0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!LdrInitializeThunk+0xe

Once the root cause is known, the fix is not particularly difficult, so I will not go into it here. The lesson for me is a deep one: try not to put much logic in DllMain. Instead, move the work into a separate exported initialization function and call it manually after the DLL has been loaded.
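As an illustration of that advice, here is a minimal sketch of a DLL whose DllMain does nothing and whose real setup lives in an exported function. The names `XModule_Initialize`, `XModule_Shutdown`, `XModule_CleanupRequested`, and `XMODULE_API` are hypothetical and not taken from the product. The log check is reduced to a size comparison, and the Windows-specific parts are guarded by `_WIN32` so that the logic can also be compiled elsewhere.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef _WIN32
#include <windows.h>
#define XMODULE_API __declspec(dllexport)
#else
#define XMODULE_API /* empty: lets the sketch compile off Windows */
#endif

/* The 10 MB threshold mentioned in the text. */
#define LOG_LIMIT_BYTES (10ULL * 1024ULL * 1024ULL)

static bool g_initialized = false;
static bool g_cleanup_requested = false;

/* Hypothetical exported initializer. The host calls it after the DLL
 * has been loaded, so the loader lock is not held here and blocking
 * work (such as waiting for a log cleanup thread) cannot deadlock
 * against thread startup. Returns 0 on success. */
XMODULE_API int XModule_Initialize(unsigned long long log_size_bytes)
{
    if (g_initialized)
        return 0; /* a second call is a no-op */

    if (log_size_bytes > LOG_LIMIT_BYTES) {
        /* Real code would notify the cleanup thread and wait for it
         * here; this sketch only records that cleanup is needed. */
        g_cleanup_requested = true;
    }
    g_initialized = true;
    return 0;
}

/* Hypothetical counterpart that the host calls before unloading. */
XMODULE_API void XModule_Shutdown(void)
{
    g_initialized = false;
    g_cleanup_requested = false;
}

/* Hypothetical accessor, used here to observe the result. */
XMODULE_API bool XModule_CleanupRequested(void)
{
    return g_cleanup_requested;
}

#ifdef _WIN32
/* DllMain stays trivial: no thread creation, no waiting, no loading
 * of other libraries while the loader lock is held. */
BOOL WINAPI DllMain(HINSTANCE hInstDll, DWORD fdwReason, PVOID fImpLoad)
{
    (void)hInstDll;
    (void)fdwReason;
    (void)fImpLoad;
    return TRUE;
}
#endif
```

On the host side, the service would load the DLL with LoadLibrary, look up `XModule_Initialize` with GetProcAddress, and call it once LoadLibrary has returned. The cost of this design is that the host must remember to make the call, which is why the initializer is written to tolerate being called more than once.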

Finally, I recommend reading the Microsoft document "Dynamic-Link Library Best Practices".

