Viewing the EXE loading process in .NET from the source code

Source: Internet
Author: User
I came across a good article on the pediy.com forum and am sharing it with everyone.
http://bbs.pediy.com/showthread.php?threadid=31799

The source code discussed here is not that of the .NET Framework itself. However, Microsoft has published the source of its Shared Source CLI (SSCLI), code-named Rotor. You can regard it as a lightweight .NET Framework; most importantly, the two share the same operating mechanism. Today we will read the Rotor source to see how the most basic step of dynamic program debugging, loading an EXE file, works. As before, I list my references first so that no one can say I copied them: one is Inside the Rotor CLI; the other is Shared Source CLI, which I could not find online. You will also need to download the sscli 2.0 archive from the MSDN website.
As in Win32, where the system provides a loader to read the EXE, the SSCLI provides its own example loader: clix.exe. For now, we will treat it as the system's default loader. Let's look at its source code (clix.cpp), paying particular attention to the _CorExeMain2 call.

Code:
DWORD Launch(WCHAR* pFileName, WCHAR* pCmdLine)
{
    WCHAR exeFileName[MAX_PATH + 1];
    DWORD dwAttrs;
    DWORD dwError;
    DWORD nExitCode;

    ... // check the attributes of a series of files ...

    if (dwError != ERROR_SUCCESS) {
        // We can't find the file, or there's some other problem. Exit with an error.
        fwprintf(stderr, L"%s: ", pFileName);
        DisplayMessageFromSystem(dwError);
        return 1; // error
    }

    nExitCode = _CorExeMain2(NULL, 0, pFileName, NULL, pCmdLine);

    // _CorExeMain2 never returns with success
    _ASSERTE(nExitCode != 0);

    DisplayMessageFromSystem(::GetLastError());

    return nExitCode;
}

Here we see the famous CorExeMain. Remember that when you open a .NET PE file with a PE editor, it imports only one function: mscoree.dll!_CorExeMain. Strange, why not _CorExeMain2? This is simply a difference between Rotor and the commercial framework. If you disassemble mscoree.dll with IDA Pro, you can see that _CorExeMain() is just a relay. The code is as follows:

Code:
.text:79011B47                 push    offset a_corexemain ; "_CorExeMain"
.text:79011B4C                 push    [ebp+hModule]   ; hModule
.text:79011B4F                 call    ds:__imp__GetProcAddress@8 ; GetProcAddress(x,x)
.text:79011B55                 test    eax, eax
.text:79011B57                 jz      loc_79019B46
.text:79011B5D                 call    eax
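In C++ terms, the disassembly above amounts to a lookup by name followed by an indirect call. The following is a minimal, portable sketch of that pattern, not the actual mscoree.dll code: FakeModule, LookupExport, RuntimeCorExeMain and ShimCorExeMain are hypothetical names, LookupExport merely stands in for GetProcAddress, and the -1 failure value is an assumption because the failure branch is not shown in the listing.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins: a "module" is just a table of exported functions,
// and LookupExport plays the role GetProcAddress plays in the disassembly.
using EntryPoint = int (*)();
using FakeModule = std::map<std::string, EntryPoint>;

// Returns the export with the given name, or nullptr if it is missing.
EntryPoint LookupExport(const FakeModule& module, const std::string& name)
{
    auto it = module.find(name);
    return it == module.end() ? nullptr : it->second;
}

// Stand-in for the real entry point that lives in the runtime DLL.
// The value 42 is arbitrary and only makes the forwarding observable.
int RuntimeCorExeMain()
{
    return 42;
}

// The relay: look the export up by name, bail out if it is absent
// (the "jz" branch), otherwise make the indirect call ("call eax").
int ShimCorExeMain(const FakeModule& runtime)
{
    EntryPoint target = LookupExport(runtime, "_CorExeMain");
    if (target == nullptr)
        return -1; // assumed error value; the real failure path is not shown
    return target();
}
```

Calling ShimCorExeMain with a table that contains "_CorExeMain" forwards to that function and returns its result; with an empty table it takes the failure path instead.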

The final call eax immediately invokes the _CorExeMain exported by mscorwks.dll. That function does much the same job as the _CorExeMain2 we just saw in Rotor: it starts the initialization needed to load the EXE. You can confirm this by comparing the disassembly with the source code. Now back to the SSCLI, to look at the code of _CorExeMain2() (ceemain.cpp):

Code:
__int32 STDMETHODCALLTYPE _CorExeMain2( // Executable exit code.
    PBYTE   pUnmappedPE,                // -> memory mapped code
    DWORD   cUnmappedPE,                // Size of memory mapped code
    __in LPWSTR  pImageNameIn,          // -> Executable Name
    __in LPWSTR  pLoadersFileName,      // -> Loaders Name
    __in LPWSTR  pCmdLine)              // -> Command Line
{
    // This entry point is used by clix
    BOOL bRetVal = 0;

    //BEGIN_ENTRYPOINT_VOIDRET;

    // Before we initialize the EE, make sure we've snooped for all EE-specific
    // command line arguments that might guide our startup.
    HRESULT result = CorCommandLine::SetArgvW(pCmdLine);

    if (!CacheCommandLine(pCmdLine, CorCommandLine::GetArgvW(NULL))) {
        LOG((LF_STARTUP, LL_INFO10, "Program exiting - CacheCommandLine failed\n"));
        bRetVal = -1;
        goto exit;
    }

    if (SUCCEEDED(result))
        result = CoInitializeEE(COINITEE_DEFAULT | COINITEE_MAIN);

    if (FAILED(result)) {
        VMDumpCOMErrors(result);
        SetLatchedExitCode (-1);
        goto exit;
    }

    // This is here to get the ZAPMONITOR working correctly
    INSTALL_UNWIND_AND_CONTINUE_HANDLER;

    // Load the executable
    bRetVal = ExecuteEXE(pImageNameIn);
    ......


Most of this code can be skipped. There are two key steps: first the EE (execution engine) is initialized, and once that succeeds, ExecuteEXE is called with the file name as its argument. Here we can clearly see what the input to _CorExeMain is. ExecuteEXE() is short and is itself only a stepping stone (the version listed here takes a module handle, hMod):

Code:

BOOL STDMETHODCALLTYPE ExecuteEXE(HMODULE hMod)
{
    STATIC_CONTRACT_GC_TRIGGERS;

    _ASSERTE(hMod);
    if (!hMod)
        return FALSE;

    ETWTraceStartup::TraceEvent(ETW_TYPE_STARTUP_EXEC_EXE);
    TIMELINE_START(STARTUP, ("ExecuteExe"));

    EX_TRY_NOCATCH
    {
        // Executables are part of the system domain
        SystemDomain::ExecuteMainMethod(hMod);
    }
    EX_END_NOCATCH;

    ETWTraceStartup::TraceEvent(ETW_TYPE_STARTUP_EXEC_EXE+1);
    TIMELINE_END(STARTUP, ("ExecuteExe"));

    return TRUE;
}

 

Similarly, there is only one key line: SystemDomain::ExecuteMainMethod(hMod). ExecuteMainMethod treats the file passed in as a module. In .NET, the containment hierarchy is assembly > module > class > method. That is, each assembly may contain multiple modules, and in an executable one of those modules contains the single main method, which is the entry method.
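The containment relation can be made concrete with a small C++ sketch. This is a heavily simplified, hypothetical model and not how the runtime actually locates the entry method: the struct names, the isEntryPoint flag and FindEntryPoint are all invented for illustration, and the real runtime uses its own Assembly, Module and MethodDesc types.

```cpp
#include <string>
#include <vector>

// Hypothetical model of assembly > module > class > method.
struct Method   { std::string name; bool isEntryPoint; };
struct Class    { std::string name; std::vector<Method> methods; };
struct Module   { std::string name; std::vector<Class> classes; };
struct Assembly { std::string name; std::vector<Module> modules; };

// Walks the hierarchy from the outside in and returns "Class::Method"
// for the first method flagged as the entry point, or an empty string
// if the assembly has none (for example, a library).
std::string FindEntryPoint(const Assembly& assembly)
{
    for (const Module& module : assembly.modules)
        for (const Class& cls : module.classes)
            for (const Method& method : cls.methods)
                if (method.isEntryPoint)
                    return cls.name + "::" + method.name;
    return std::string();
}
```

For an assembly whose only module contains a class Program with a flagged method Main, FindEntryPoint returns "Program::Main". An empty result corresponds to the case where no main method is found.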

Following SystemDomain::ExecuteMainMethod() leads to Assembly::ExecuteMainMethod() (assembly.cpp):
Code:

INT32 Assembly::ExecuteMainMethod(PTRARRAYREF *stringArgs)
{
    CONTRACTL
    {
        INSTANCE_CHECK;
        THROWS;
        GC_TRIGGERS;
        MODE_ANY;
        ENTRY_POINT;
        INJECT_FAULT(COMPlusThrowOM());
    }
    CONTRACTL_END;

    HRESULT hr = S_OK;
    INT32   iRetVal = 0;

    BEGIN_ENTRYPOINT_THROWS;

    Thread *pThread = GetThread();
    MethodDesc *pMeth;
    {
        // This thread looks like it wandered in -- but actually we rely on it to keep the process alive.
        pThread->SetBackground(FALSE);

        GCX_COOP();

        pMeth = GetEntryPoint();
        if (pMeth) {
            RunMainPre();
            hr = ClassLoader::RunMain(pMeth, 1, &iRetVal, stringArgs);
        }
    }

    //RunMainPost is supposed to be called on the main thread of an EXE,
    //after that thread has finished doing useful work.  It contains logic
    //to decide when the process should get torn down.  So, don't call it from
    // AppDomain.ExecuteAssembly()
    if (pMeth) {
        if (stringArgs == NULL)
            RunMainPost();
    }
    else {
        StackSString displayName;
        GetDisplayName(displayName);
        COMPlusThrowHR(COR_E_MISSINGMETHOD, IDS_EE_FAILED_TO_FIND_MAIN, displayName);
    }

    if (FAILED(hr))
        ThrowHR(hr);

    END_ENTRYPOINT_THROWS;
    return iRetVal;
}

 

There are two key steps: preparing the thread environment and running the main method. Next, let's look at ClassLoader::RunMain in clsload.cpp, which is our last stop this time.
Code:

HRESULT ClassLoader::RunMain(MethodDesc *pFD ,
                             short numSkipArgs,
                             INT32 *piRetVal,
                             PTRARRAYREF *stringArgs /*=NULL*/)
{
    STATIC_CONTRACT_THROWS;
    _ASSERTE(piRetVal);

    DWORD       cCommandArgs = 0;  // count of args on command line
    DWORD       arg = 0;
    LPWSTR      *wzArgs = NULL; // command line args
    HRESULT     hr = S_OK;

    *piRetVal = -1;

    // The exit code for the process is communicated in one of two ways.  If the
    // entrypoint returns an 'int' we take that.  Otherwise we take a latched
    // process exit code.  This can be modified by the app via setting
    // Environment's ExitCode property.
    if (stringArgs == NULL)
        SetLatchedExitCode(0);

    if (!pFD) {
        _ASSERTE(!"Must have a function to call!");
        return E_FAIL;
    }

    CorEntryPointType EntryType = EntryManagedMain;
    ValidateMainMethod(pFD, &EntryType);

    if ((EntryType == EntryManagedMain) &&
        (stringArgs == NULL)) {
        // If you look at the DIFF on this code then you will see a major change which is that we
        // no longer accept all the different types of data arguments to main.  We now only accept
        // an array of strings.
        wzArgs = CorCommandLine::GetArgvW(&cCommandArgs);
        // In the WindowsCE case where the app has additional args the count will come back zero.
        if (cCommandArgs > 0) {
            if (!wzArgs)
                return E_INVALIDARG;
        }
    }

    ETWTraceStartup::TraceEvent(ETW_TYPE_STARTUP_MAIN);
    TIMELINE_START(STARTUP, ("RunMain"));

    EX_TRY_NOCATCH
    {
        MethodDescCallSite  threadStart(pFD);

        PTRARRAYREF StrArgArray = NULL;
        GCPROTECT_BEGIN(StrArgArray);

        // Build the parameter array and invoke the method.
        if (EntryType == EntryManagedMain) {
            if (stringArgs == NULL) {
                // Allocate a COM Array object with enough slots for cCommandArgs - 1
                StrArgArray = (PTRARRAYREF) AllocateObjectArray((cCommandArgs - numSkipArgs), g_pStringClass);
                // Create Stringrefs for each of the args
                for( arg = numSkipArgs; arg < cCommandArgs; arg++) {
                    STRINGREF sref = COMString::NewString(wzArgs[arg]);
                    StrArgArray->SetAt(arg-numSkipArgs, (OBJECTREF) sref);
                }
            }
            else
                StrArgArray = *stringArgs;
        }

#ifdef STRESS_THREAD
        OBJECTHANDLE argHandle = (StrArgArray != NULL) ? CreateGlobalStrongHandle (StrArgArray) : NULL;
        Stress_Thread_Param Param = {pFD, argHandle, numSkipArgs, EntryType, 0};
        Stress_Thread_Start (&Param);
#endif

        ARG_SLOT stackVar = ObjToArgSlot(StrArgArray);

        if (pFD->IsVoid())
        {
            // Set the return value to 0 instead of returning random junk
            *piRetVal = 0;
            threadStart.Call(&stackVar);
        }
        else
        {
            *piRetVal = (INT32)threadStart.Call_RetArgSlot(&stackVar);
            if (stringArgs == NULL)
            {
                SetLatchedExitCode(*piRetVal);
            }
        }

        GCPROTECT_END();

        fflush(stdout);
        fflush(stderr);
    }
    EX_END_NOCATCH

    ETWTraceStartup::TraceEvent(ETW_TYPE_STARTUP_MAIN+1);
    TIMELINE_END(STARTUP, ("RunMain"));

    return hr;
}


This code mainly prepares the arguments and then invokes the entry method. There are two cases: a Main that returns void and a Main that returns a value. What happens after that goes deep into the core of the framework, and we can explore it another time. The code uses many COM definitions, which shows how closely .NET and COM are related. The debugger and profiler in .NET, for example, are even written directly against COM interfaces. I am not familiar with COM, so I cannot go deeper on this point.
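The two return-type cases, and the way RunMain skips the leading command-line entries, can be illustrated with a minimal C++ sketch. It is not SSCLI code: Args, BuildManagedArgs, RunVoidMain, RunIntMain and g_latchedExitCode are hypothetical names, and std::vector<std::string> stands in for the managed string array. It only mirrors what the RunMain listing shows: the first numSkipArgs entries are dropped, a void entry method yields 0, and an int entry method's return value is also latched as the exit code.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Args = std::vector<std::string>;

// Hypothetical stand-in for the process-wide latched exit code.
static int g_latchedExitCode = 0;

// Copies commandLine[numSkipArgs..] into a new array, like the loop that
// fills StrArgArray. Unlike the original, this sketch also guards against
// numSkipArgs being larger than the argument count.
Args BuildManagedArgs(const Args& commandLine, std::size_t numSkipArgs)
{
    if (numSkipArgs >= commandLine.size())
        return Args();
    return Args(commandLine.begin() + static_cast<std::ptrdiff_t>(numSkipArgs),
                commandLine.end());
}

// Case 1: the entry method returns void, so the result is forced to 0
// instead of being left as an arbitrary value.
int RunVoidMain(const std::function<void(const Args&)>& entry,
                const Args& commandLine, std::size_t numSkipArgs)
{
    entry(BuildManagedArgs(commandLine, numSkipArgs));
    return 0;
}

// Case 2: the entry method returns int; that value becomes both the
// result and the latched exit code.
int RunIntMain(const std::function<int(const Args&)>& entry,
               const Args& commandLine, std::size_t numSkipArgs)
{
    int ret = entry(BuildManagedArgs(commandLine, numSkipArgs));
    g_latchedExitCode = ret;
    return ret;
}
```

With a command line of "app.exe one two" and numSkipArgs set to 1, as in the call from Assembly::ExecuteMainMethod, the entry method receives only "one" and "two".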
