Windows 2000 DLL Loading Process
By Jefong, 2005/03/30
This article summarizes what I learned from reading the Under the Hood column in the September 1999 issue of MSJ (Microsoft Systems Journal).
In Windows, an EXE program calls into DLLs such as kernel32.dll and user32.dll. But how is a DLL loaded? Anyone who has written a DLL knows that it has a DllMain entry function, but calling that function is not the first thing that happens to the DLL. The system must first load the DLL and prepare it for initialization, and only then is DllMain entered. One of your DLLs may also call another DLL. So how are DLLs loaded and initialized? Let's start with the "Dynamic-Link Library Entry-Point Function" topic in the Platform SDK.
The entry function should perform only simple initialization tasks, such as setting up TLS, creating synchronization objects, or opening files. It must not call the LoadLibrary function, because doing so may create a dependency loop in the DLL load order, which can cause a DLL to be used before the system has executed its initialization code. Likewise, the entry function must not call the FreeLibrary function, because this can cause a DLL to be used after the system has already run its termination code, which is a serious error.
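One common way to live with these restrictions is to keep the entry function trivial and defer the real work until the first call into the DLL. The following is a minimal sketch of that idea in portable C rather than real Win32 code. The names on_process_attach, ensure_initialized, and dll_do_work are hypothetical, and the "heavy" work is represented only by a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from the Platform SDK. */
bool g_attached = false;      /* set by the entry-function stand-in */
bool g_initialized = false;   /* set once the deferred work has run */
int  g_heavy_init_runs = 0;   /* counts how often the heavy init executed */

/* Stand-in for the DLL_PROCESS_ATTACH branch of DllMain: record state only,
   and do nothing that could load or call into another DLL. */
bool on_process_attach(void)
{
    g_attached = true;
    return true;   /* nonzero tells the loader that initialization succeeded */
}

/* Deferred initialization: runs on first use, outside the entry function,
   where loading other DLLs or reading the registry is no longer a problem. */
void ensure_initialized(void)
{
    assert(g_attached);
    if (!g_initialized) {
        g_heavy_init_runs++;   /* placeholder for the real work */
        g_initialized = true;
    }
}

/* Hypothetical exported function: every export goes through the check. */
int dll_do_work(int x)
{
    ensure_initialized();
    return x * 2;
}
```

The sketch is not thread-safe. A real DLL would have to protect the g_initialized flag with a synchronization object created in the entry function.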
Calling other Win32 functions during initialization may also cause errors. For example, calling User, Shell, and COM functions may cause access violations, because some functions in those DLLs call LoadLibrary to load other system components.
Reading a registry value in your DllMain function is restricted for the same reason: under normal circumstances advapi32.dll may not yet be initialized when your DllMain code runs, so the registry function you call can fail.
The documentation strictly forbids calling LoadLibrary during initialization. There is, however, a special case in which user32.dll on Windows NT ignores this restriction, which seems to contradict what was said above. The initialization code of user32.dll calls LoadLibrary to load DLLs, and it does so without problems. The reason is AppInit_DLLs, a registry value that lists DLLs to be loaded into every process. Because this list is processed by user32.dll, AppInit_DLLs has no effect in a process that does not load user32.dll.
Next, let's look at how DLL loading and initialization are carried out. The operating system has a loader, and loading a module usually involves two steps:
1. After the EXE or DLL image is mapped into memory, the loader examines the module's Import Address Table (IAT) to see which other DLLs the module depends on. If one of those DLLs has not yet been loaded into the process, the loader maps it into memory as well, and repeats this until every required module has been mapped.
2. Initialize all of the DLLs.
In Windows NT, the routine that invokes the EXE and DLL entry functions is called LdrpRunInitializeRoutines. It also runs whenever you call LoadLibrary. When LdrpRunInitializeRoutines is called, it first checks whether each DLL mapped into memory has already been initialized. Let's look at the following code (Matt Pietrek's pseudocode for LdrpRunInitializeRoutines):
// ============================================================================
// Matt Pietrek, September 1999 Microsoft Systems Journal
// Comments translated by Jefong.
//
// Pseudocode for LdrpRunInitializeRoutines in NTDLL.DLL (NT 4, SP3)
//
// When LdrpRunInitializeRoutines is called for the first time in a process
// (to initialize the implicitly linked modules), bImplicitLoad is nonzero.
// When it is called because of LoadLibrary, bImplicitLoad is zero.
// ============================================================================
#include <ntexapi.h>    // For HardError defines near the end
// Global symbols (name is accurate, and comes from NTDLL.DBG)
//  _NtdllBaseTag
//  _ShowSnaps
//  _SaveSp
//  _CurSp
//  _LdrpInLdrInit
//  _LdrpFatalHardErrorCount
//  _LdrpImageHasTls
NTSTATUS
LdrpRunInitializeRoutines(DWORD bImplicitLoad)
{
    // Get the number of modules that might need to be initialized.
    // Some of them may already have been initialized.
    unsigned nRoutinesToRun = _LdrpClearLoadInProgress();
    if (nRoutinesToRun)
    {
        // If there are modules to initialize, allocate an array that holds
        // a pointer to the information about each module.
        pInitNodeArray = _RtlAllocateHeap(GetProcessHeap(),
                                          _NtdllBaseTag + 0x60000,
                                          nRoutinesToRun * 4);
        if (0 == pInitNodeArray)    // Make sure allocation worked
            return STATUS_NO_MEMORY;
    }
    else
        pInitNodeArray = 0;
    // Part 2:
    // The Process Environment Block (PEB) keeps a pointer to the linked list
    // of the modules that have just been loaded.
    pCurrNode = *(pCurrentPeb->ModuleLoaderInfoHead);
    ModuleLoaderInfoHead = pCurrentPeb->ModuleLoaderInfoHead;
    if (_ShowSnaps)
    {
        _DbgPrint("LDR: Real INIT LIST\n");
    }
    nModulesInitedSoFar = 0;
    if (pCurrNode != ModuleLoaderInfoHead)  // Are there any newly loaded modules?
    {
        while (pCurrNode != ModuleLoaderInfoHead)   // Walk all newly loaded modules
        {
            ModuleLoaderInfo pModuleLoaderInfo;
            //
            // The linked-list node pointers are 0x10 bytes into the
            // ModuleLoaderInfo structure.
            pModuleLoaderInfo = &NextNode - 0x10;
            localVar3C = pModuleLoaderInfo;
            //
            // If the module has already been initialized, skip it.
            // X_LOADER_SAW_MODULE = 0x40
            if (!(pModuleLoaderInfo->Flags35 & X_LOADER_SAW_MODULE))
            {
                //
                // The module is not initialized yet. Check whether it has
                // an entry function.
                //
                if (pModuleLoaderInfo->EntryPoint)
                {
                    //
                    // It has an init routine. Add it to the array of modules
                    // waiting to be initialized.
                    pInitNodeArray[nModulesInitedSoFar] = pModuleLoaderInfo;
                    // If ShowSnaps is nonzero, print the module path and
                    // the entry function address. Example:
                    // C:\WINNT\system32\KERNEL32.dll init routine 77f01000
                    if (_ShowSnaps)
                    {
                        _DbgPrint("%wZ init routine %x\n",
                                  &pModuleLoaderInfo->24,
                                  pModuleLoaderInfo->EntryPoint);
                    }
                    nModulesInitedSoFar++;
                }
            }
            // Set the X_LOADER_SAW_MODULE flag of the module. Note that the
            // module has not actually been initialized at this point.
            pModuleLoaderInfo->Flags35 |= X_LOADER_SAW_MODULE;
            // Move on to the next module node
            pCurrNode = pCurrNode->pNext;
        }
    }
    else
    {
        pModuleLoaderInfo = localVar3C;     // May not be initialized???
    }
    if (0 == pInitNodeArray)
        return STATUS_SUCCESS;
    //
    // Part 3: call the initialization routines.
    // pInitNodeArray now holds pointers to the modules that have not yet
    // received the DLL_PROCESS_ATTACH notification.
    __try   // Wrap all this in a try block, in case the init routine faults
    {
        nModulesInitedSoFar = 0;    // Start at array element 0
        //
        // Walk the module array
        //
        while (nModulesInitedSoFar < nRoutinesToRun)
        {
            // Get the module pointer out of the array
            pModuleLoaderInfo = pInitNodeArray[nModulesInitedSoFar];
            // This doesn't seem to do anything...
            localVar3C = pModuleLoaderInfo;
            nModulesInitedSoFar++;
            // Save the init routine address in a local variable
            pfnInitRoutine = pModuleLoaderInfo->EntryPoint;
            fBreakOnDllLoad = 0;    // Default is to not break on load
            // For debugging:
            // If this process is a debuggee, check to see if the loader
            // should break into a debugger before calling the initialization.
            //
            // DebuggerPresent (offset 2 in PEB) is what IsDebuggerPresent()
            // returns. IsDebuggerPresent is an NT-only API.
            //
            if (pCurrentPeb->DebuggerPresent || pCurrentPeb->1)
            {
                LONG retCode;
                //
                // Query the "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\
                // Windows NT\CurrentVersion\Image File Execution Options"
                // registry key. If a subkey entry with the name of
                // the executable exists, check for the BreakOnDllLoad value.
                //
                retCode =
                    _LdrQueryImageFileExecutionOptions(
                        pModuleLoaderInfo->pwszDllName,
                        "BreakOnDllLoad",
                        REG_DWORD,
                        &fBreakOnDllLoad,
                        sizeof(DWORD),
                        0);
                // If reg value not found (usually the case), then don't
                // break on this DLL init
                if (retCode <= STATUS_SUCCESS)
                    fBreakOnDllLoad = 0;
            }
            if (fBreakOnDllLoad)
            {
                if (_ShowSnaps)
                {
                    // Inform the debug output stream of the module name
                    // and the init routine address before actually breaking
                    // into the debugger
                    _DbgPrint("LDR: %wZ loaded.",
                              &pModuleLoaderInfo->pModuleLoaderInfo);
                    _DbgPrint("- About to call init routine at %lx\n",
                              pfnInitRoutine);
                }
                // Break into the debugger
                _DbgBreakPoint();   // An INT 3, followed by a RET
            }
            else if (_ShowSnaps && pfnInitRoutine)
            {
                // Inform the debug output stream of the module name
                // and the init routine address before calling it
                _DbgPrint("LDR: %wZ loaded.",
                          pModuleLoaderInfo->pModuleLoaderInfo);
                _DbgPrint("- Calling init routine at %lx\n", pfnInitRoutine);
            }
            if (pfnInitRoutine)
            {
                // Set the flag indicating that the DLL_PROCESS_ATTACH
                // notification has been sent to this DLL
                //
                // (Shouldn't this come *after* the actual call?)
                //
                // X_LOADER_CALLED_PROCESS_ATTACH = 0x8
                pModuleLoaderInfo->Flags36 |= X_LOADER_CALLED_PROCESS_ATTACH;
                //
                // If there's Thread Local Storage (TLS) for this module,
                // call the TLS init functions. *** NOTE *** This only
                // occurs during the first time this code is called (when
                // implicitly loaded DLLs are initialized, bImplicitLoad != 0).
                // Dynamically loaded DLLs (bImplicitLoad == 0) shouldn't use
                // TLS declared vars, as per the SDK documentation
                //
                if (pModuleLoaderInfo->bHasTLS && bImplicitLoad)
                {
                    _LdrpCallTlsInitializers(pModuleLoaderInfo->hModDLL,
                                             DLL_PROCESS_ATTACH);
                }
                hModDLL = pModuleLoaderInfo->hModDLL;
                MOV     ESI, ESP    // Save off the ESP register into ESI
                // Load EDI with the entry function pointer
                MOV     EDI, DWORD PTR [pfnInitRoutine]
                // In C++ code, the following ASM would look like:
                //
                // initRetValue =
                //     pfnInitRoutine(hModDLL, DLL_PROCESS_ATTACH, bImplicitLoad);
                //
                PUSH    DWORD PTR [bImplicitLoad]
                PUSH    DLL_PROCESS_ATTACH
                PUSH    DWORD PTR [hModDLL]
                CALL    EDI         // Call the entry function
                MOV     BYTE PTR [initRetValue], AL     // Save the entry function's return value
                MOV     DWORD PTR [_SaveSp], ESI    // Save stack values after the
                MOV     DWORD PTR [_CurSp], ESP     // entry point code returns
                MOV     ESP, ESI    // Restore ESP to value before the call
                //
                // Check whether ESP is the same before and after the call
                //
                if (_CurSp != _SaveSp)
                {
                    hardErrorParam = pModuleLoaderInfo->FullDllPath;
                    hardErrorRetCode =
                        _NtRaiseHardError(
                            STATUS_BAD_DLL_ENTRYPOINT | 0x10000000,
                            1,  // Number of parameters
                            1,  // UnicodeStringParametersMask
                            &hardErrorParam,
                            OptionYesNo,    // Let user decide
                            &hardErrorResponse);
                    if (_LdrpInLdrInit)
                        _LdrpFatalHardErrorCount++;
                    if ((hardErrorRetCode >= STATUS_SUCCESS)
                        && (ResponseYes == hardErrorResponse))
                    {
                        return STATUS_DLL_INIT_FAILED;
                    }
                }
                //
                // If the entry function returned 0 (failure), tell the user
                //
                if (0 == initRetValue)
                {
                    DWORD hardErrorParam2;
                    DWORD hardErrorResponse2;
                    hardErrorParam2 = pModuleLoaderInfo->FullDllPath;
                    _NtRaiseHardError(STATUS_DLL_INIT_FAILED,
                                      1,  // Number of parameters
                                      1,  // UnicodeStringParametersMask
                                      &hardErrorParam2,
                                      OptionOk,   // OK is only response
                                      &hardErrorResponse2);
                    if (_LdrpInLdrInit)
                        _LdrpFatalHardErrorCount++;
                    return STATUS_DLL_INIT_FAILED;
                }
            }
        }
        //
        // If the EXE itself has TLS, call its TLS init functions as well.
        // As above, this happens only the first time through, when the
        // implicitly loaded DLLs are initialized.
        //
        if (_LdrpImageHasTls && bImplicitLoad)
        {
            _LdrpCallTlsInitializers(pCurrentPeb->ProcessImageBase,
                                     DLL_PROCESS_ATTACH);
        }
    }
    __finally
    {
        //
        // Part 4:
        // Free the array allocated at the start of this function
        _RtlFreeHeap(GetProcessHeap(), 0, pInitNodeArray);
    }
    return STATUS_SUCCESS;
}
This function is divided into four main parts:
Part 1. The first part calls the _LdrpClearLoadInProgress function. This NTDLL function returns the number of DLLs that have just been mapped into memory. For example, if your process loads exm.dll, and exm.dll in turn uses exm1.dll and exm2.dll, _LdrpClearLoadInProgress returns 3. With the count in hand, the code calls _RtlAllocateHeap to allocate an array of pointers, named pInitNodeArray in the pseudocode. Each element of the array will point to the information structure of one newly loaded DLL.
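To see where a count of 3 comes from, the mapping step can be simulated with a small table of modules. This is only a sketch in portable C and not the loader's actual data structures: the Module type and the functions find_module, map_with_imports, and initialize_mapped are hypothetical names. A real loader reads the import list from the image instead of from a hard-coded array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_IMPORTS 4

/* Hypothetical module description used only by this sketch. */
typedef struct Module {
    const char *name;
    const char *imports[MAX_IMPORTS]; /* NULL-terminated list of DLL names */
    int mapped;                       /* step 1 done for this module */
    int initialized;                  /* step 2 done for this module */
} Module;

/* Look a module up by name in the table of available images. */
Module *find_module(Module *table, size_t count, const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/* Step 1: map the module, then every import that is not mapped yet.
   Returns the number of modules newly mapped by this call. */
int map_with_imports(Module *table, size_t count, Module *m)
{
    assert(m != NULL);
    if (m->mapped)
        return 0;                 /* already in the process: nothing to do */
    m->mapped = 1;                /* mark first, so import cycles terminate */
    int newly_mapped = 1;
    for (size_t i = 0; i < MAX_IMPORTS && m->imports[i] != NULL; i++) {
        Module *dep = find_module(table, count, m->imports[i]);
        if (dep != NULL)
            newly_mapped += map_with_imports(table, count, dep);
    }
    return newly_mapped;
}

/* Step 2: initialize every mapped module that is not initialized yet.
   Returns the number of modules initialized by this call. */
int initialize_mapped(Module *table, size_t count)
{
    int ran = 0;
    for (size_t i = 0; i < count; i++) {
        if (table[i].mapped && !table[i].initialized) {
            table[i].initialized = 1;   /* stands in for calling the entry point */
            ran++;
        }
    }
    return ran;
}
```

With a table in which exm.dll imports exm1.dll and exm2.dll, calling map_with_imports on exm.dll reports three newly mapped modules, and a second call reports none because everything is already mapped.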
Part 2. The second part walks the linked list of newly loaded DLLs, which it reaches through an internal data structure of the process (the PEB). For each DLL it checks whether there is an entry point. If there is, it adds the module information pointer (pModuleLoaderInfo in the pseudocode) to pInitNodeArray. Some DLLs are resource-only and have no entry function, so the number of entries in pInitNodeArray may be smaller than the value returned by _LdrpClearLoadInProgress.
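The filtering described here can be sketched in portable C as follows. The ModuleInfo structure is a simplified, hypothetical stand-in for the loader's record, and the list is NULL-terminated instead of circular as in the pseudocode. The names count_candidates, collect_init_list, and dummy_entry are likewise assumptions of this sketch. Only the flag value 0x40 is taken from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SAW_MODULE 0x40u   /* mirrors X_LOADER_SAW_MODULE in the pseudocode */

/* Hypothetical, simplified per-module record. */
typedef struct ModuleInfo {
    const char *name;
    void (*entry_point)(void);   /* NULL for a resource-only DLL */
    unsigned flags;
    struct ModuleInfo *next;
} ModuleInfo;

/* Part 1 analogue: count the modules that have not been seen yet. */
size_t count_candidates(const ModuleInfo *head)
{
    size_t n = 0;
    for (const ModuleInfo *m = head; m != NULL; m = m->next)
        if (!(m->flags & SAW_MODULE))
            n++;
    return n;
}

/* Part 2 analogue: build an array of the unseen modules that have an entry
   point and mark every visited module as seen. The caller frees the array.
   Returns NULL when there is nothing to do or the allocation fails. */
ModuleInfo **collect_init_list(ModuleInfo *head, size_t *out_count)
{
    assert(out_count != NULL);
    *out_count = 0;
    size_t candidates = count_candidates(head);
    if (candidates == 0)
        return NULL;
    ModuleInfo **list = malloc(candidates * sizeof *list);
    if (list == NULL)
        return NULL;   /* the pseudocode returns STATUS_NO_MEMORY here */
    for (ModuleInfo *m = head; m != NULL; m = m->next) {
        if (!(m->flags & SAW_MODULE) && m->entry_point != NULL)
            list[(*out_count)++] = m;
        m->flags |= SAW_MODULE;   /* set the flag, keep the other bits */
    }
    return list;
}

/* Placeholder entry point for experimenting with the sketch. */
void dummy_entry(void) {}
```

If exm2.dll is treated as a resource-only DLL (an assumption made only for this example), three modules are candidates but only two end up in the array, which is the difference between the two counts that the text describes.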
Part 3. The third part walks the entries in pInitNodeArray and calls each entry function. Because the initialization code it calls may fault, the whole section is wrapped in a __try block. This is why an error in DllMain need not terminate the entire process.
In addition, TLS is initialized around the time the entry function is called. When you declare TLS variables with __declspec(thread), the linker includes data that triggers this step. Before calling a DLL's entry function, LdrpRunInitializeRoutines checks whether the module has TLS that needs to be initialized and, if so, calls _LdrpCallTlsInitializers.
The last portion of the Part 3 pseudocode uses assembly language to call the DLL entry function. CALL EDI is the key instruction, with EDI holding the pointer to the entry function. When this instruction returns, the DLL's initialization is complete. For a DLL written in C++, this means DllMain has executed its DLL_PROCESS_ATTACH code. Note that the third parameter of the entry function, pvReserved, is nonzero when the DLL is loaded implicitly by an EXE or another DLL, and zero when it is loaded with LoadLibrary. After the entry function returns, the loader compares the ESP values from before and after the call. If they differ, the loader reports an error about the DLL's entry function. After the ESP check, the return value of the entry function is checked as well. If it is zero, something went wrong during initialization, so the system reports an error and stops loading the DLL. At the end of Part 3, once the DLLs have been initialized, _LdrpCallTlsInitializers is called for the EXE itself if the EXE has TLS and this is the pass that initializes the implicitly linked DLLs.
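The two checks and the meaning of the third parameter can be illustrated with a short sketch in portable C. It does not read the real stack pointer, which cannot be done portably. Instead the two ESP values are passed in as plain numbers. The EntryFn signature, the InitOutcome enum, and all function names are hypothetical, and the sketch assumes that raising the hard error itself succeeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PROCESS_ATTACH 1u   /* same numeric value as DLL_PROCESS_ATTACH */

typedef enum { INIT_OK, INIT_FAILED } InitOutcome;

/* Hypothetical portable stand-in for a DLL entry function signature. */
typedef int (*EntryFn)(void *module, unsigned reason, void *reserved);

/* ESP check: a mismatch raises a Yes/No hard error in the pseudocode, and
   the load is aborted only if the user answers Yes. */
bool stack_check_aborts(uintptr_t sp_before, uintptr_t sp_after,
                        bool user_chose_yes)
{
    return sp_before != sp_after && user_chose_yes;
}

/* Calls the entry function with the same three arguments as the pseudocode
   and turns a zero return value into a failure. */
InitOutcome call_entry(EntryFn entry, void *module, bool implicit_load)
{
    assert(entry != NULL);
    /* Third argument: nonzero for implicit loads, NULL for LoadLibrary. */
    void *reserved = implicit_load ? (void *)(uintptr_t)1 : NULL;
    int ret = entry(module, PROCESS_ATTACH, reserved);
    return ret == 0 ? INIT_FAILED : INIT_OK;
}

/* Sample entry function that always reports success. */
int entry_succeeds(void *module, unsigned reason, void *reserved)
{
    (void)module; (void)reason; (void)reserved;
    return 1;
}

/* Sample entry function that refuses to initialize when loaded dynamically. */
int entry_rejects_dynamic_load(void *module, unsigned reason, void *reserved)
{
    (void)module; (void)reason;
    return reserved != NULL;
}
```

The second sample entry function shows how a DLL could use the third parameter to tell an implicit load from a LoadLibrary call. Whether refusing a dynamic load is sensible depends on the DLL, and it is used here only to make the difference visible.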
Part 4. The fourth part is cleanup code. The memory that _RtlAllocateHeap allocated for pInitNodeArray must be released. The release code sits in the __finally block and calls _RtlFreeHeap.
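Standard C has no __finally, but the same guarantee for ordinary early exits can be approximated with a single cleanup label. The sketch below assumes this substitute technique and uses hypothetical names throughout (tracked_alloc, tracked_free, run_init_routines). The fail_at parameter merely simulates an entry function that reports failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

int g_live_allocations = 0;   /* hypothetical counter, only for the sketch */

void *tracked_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        g_live_allocations++;
    return p;
}

void tracked_free(void *p)
{
    if (p != NULL) {
        g_live_allocations--;
        free(p);
    }
}

/* Returns 0 on success and -1 on failure. The array is released on every
   path, which is the job the __finally block does in the pseudocode. */
int run_init_routines(size_t count, size_t fail_at)
{
    assert(count > 0);
    int status = 0;
    void **init_array = tracked_alloc(count * sizeof *init_array);
    if (init_array == NULL)
        return -1;
    for (size_t i = 0; i < count; i++) {
        if (i == fail_at) {     /* simulated entry function failure */
            status = -1;
            goto cleanup;       /* the early exit still reaches the free */
        }
    }
cleanup:
    tracked_free(init_array);
    return status;
}
```

Unlike __finally, a cleanup label runs only on exits that the code takes explicitly. It does not run when a fault unwinds the stack, so it is a weaker substitute for the structured exception handling used in the pseudocode.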