Mechanisms behind .NET applications
This article discusses how the CLR operates when a compiled application runs.
1. Preparations
Program listing (Program.cs):
public sealed class Program
{
    public static void Main()
    {
        System.Console.WriteLine("Hi");
    }
}
Compile the code above in the Developer Command Prompt for VS2013 to produce an application (assembly):
csc.exe Program.cs
Run the program.
Program.exe
Use ILDasm to open the compiled assembly
ILDasm.exe Program.exe
The program compiles and runs as expected.
2. CLR load and initialize itself
When Program.exe is double-clicked or run from CMD, Windows checks the EXE file header to decide whether to create a 32-bit or 64-bit process. After the process is created, Windows loads the x86, x64, or ARM version of MSCorEE.dll into the process address space.
If the operating system is x86 or ARM, the x86 or ARM version of MSCorEE.dll is in the following directory:
%SystemRoot%\System32
If the operating system is x64, the x86 version of MSCorEE.dll is in the following directory:
%SystemRoot%\SysWow64
The x64 version of MSCorEE.dll is in the following directory:
%SystemRoot%\System32
Then the main thread of the process calls a method defined in MSCorEE.dll, which initializes the CLR. Control is then handed over to the CLR.
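To see which kind of process Windows actually created for a managed program, the program can ask the runtime at run time. The following is a minimal sketch that uses only members of System.Environment and System.IntPtr; the class name BitnessDemo and the method Describe are hypothetical names chosen for illustration.

```csharp
using System;

public static class BitnessDemo
{
    // Describes the current process using documented System.Environment members.
    public static string Describe()
    {
        string process = Environment.Is64BitProcess ? "64-bit" : "32-bit";
        string os = Environment.Is64BitOperatingSystem ? "64-bit" : "32-bit";
        // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process.
        return process + " process on a " + os + " OS, pointer size " + IntPtr.Size + " bytes";
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

The result depends on the machine and on the platform target chosen at compile time (for example the /platform option of csc.exe), so no fixed output is shown here.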
3. Read the entry point from the CLR Header
After CLR initialization is complete, the CLR reads the CLR header and looks for the application's entry point token.
We can use ILDasm (View → Headers) to see the program's entry point token.
In ILDasm, View → MetaInfo → Show! opens the metadata information window. The program's entry point token is 0x06000001: 06 indicates that the token type is MethodDef, and 000001 indicates the first row of the MethodDef table. The CLR uses this token to look up the method in the MethodDef metadata table.
The RVA stored in that row gives the offset of the method's IL code.
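The split of a token such as 0x06000001 into a table number and a row number can be reproduced with two bit operations. The sketch below is illustrative only: TokenDemo, TableOf, and RowOf are hypothetical names, while MemberInfo.MetadataToken is the reflection property that exposes a member's token.

```csharp
using System;
using System.Reflection;

public static class TokenDemo
{
    // High byte of a metadata token: the metadata table number (0x06 = MethodDef).
    public static int TableOf(int token)
    {
        return (int)((uint)token >> 24);
    }

    // Low three bytes: the 1-based row number within that table.
    public static int RowOf(int token)
    {
        return token & 0x00FFFFFF;
    }

    public static void Main()
    {
        int token = 0x06000001;
        // Prints: table=0x06 row=1
        Console.WriteLine("table=0x{0:X2} row={1}", TableOf(token), RowOf(token));

        // The token of this program's own Main method, read through reflection.
        MethodInfo main = typeof(TokenDemo).GetMethod("Main");
        Console.WriteLine("Main token=0x{0:X8}", main.MetadataToken);
    }
}
```

The row number printed for Main depends on how many methods precede it in the compiled assembly, so it will generally differ from the 000001 seen for Program.exe.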
Next, let's look at the IL code of the Main method in ILDasm.
4. The CLR detects the types and methods referenced by the entry point method
Before running the Main method, the CLR examines the types and members that the method references and loads the assemblies that define them (if they are not already loaded). For example, the IL code of Main contains a reference to System.Console.WriteLine. Specifically, the IL call instruction references metadata token 0A000003, which identifies entry 3 in the MemberRef metadata table (table 0A). The CLR checks this MemberRef entry and finds that one of its fields references entry 01000004 in the TypeRef table. From this TypeRef entry, the CLR is directed to an AssemblyRef entry (23000001).
At this point the CLR knows which assembly it needs, and it then locates and loads that assembly.
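The assembly references recorded in an assembly's metadata can also be listed from code. The sketch below uses Assembly.GetReferencedAssemblies from System.Reflection; AssemblyRefDemo and ReferencedNames are hypothetical names introduced for illustration.

```csharp
using System;
using System.Reflection;

public static class AssemblyRefDemo
{
    // Returns the simple names of the assemblies recorded as references of the given assembly.
    public static string[] ReferencedNames(Assembly assembly)
    {
        AssemblyName[] refs = assembly.GetReferencedAssemblies();
        string[] names = new string[refs.Length];
        for (int i = 0; i < refs.Length; i++)
        {
            names[i] = refs[i].Name;
        }
        return names;
    }

    public static void Main()
    {
        // Using Console guarantees a reference to the assembly that defines it.
        Console.WriteLine("Hi");

        foreach (string name in ReferencedNames(typeof(AssemblyRefDemo).Assembly))
        {
            Console.WriteLine("references: " + name);
        }

        // The assembly that actually defines System.Console on this runtime.
        Console.WriteLine("Console lives in: " + typeof(Console).Assembly.GetName().Name);
    }
}
```

When compiled with csc.exe against the .NET Framework, as in this article, the list is expected to contain mscorlib. On other runtimes the names may differ.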
5. Load the referenced type's assembly and create a data structure in memory
The CLR loads the mscorlib.dll file and scans its metadata to locate the Console type. The CLR then creates an internal data structure to represent the type.
In this internal data structure, each method defined by the Console type has a corresponding entry. Each entry holds an address at which the method's implementation can be found. When the structure is initialized, every entry is set to point to a function inside the CLR, referred to here as JITCompiler (the JIT compiler).
6. The JIT compiler compiles IL into native code
After the CLR creates the internal data structure for the referenced type, the JIT compiler finishes compiling the Main method and Main starts to execute. When Main calls WriteLine for the first time, the JITCompiler function is called (because the WriteLine entry points to JITCompiler). JITCompiler knows which method is being called and which type defines it. It finds the IL code of the called method in the metadata of the assembly that defines the type, verifies the IL code, and compiles it into native CPU instructions for the local machine. The native CPU instructions are saved in dynamically allocated memory. JITCompiler then goes back to the internal data structure that the CLR created for the type, finds the entry for the called method, and replaces the reference to JITCompiler with the address of the memory block that contains the compiled native instructions. Finally, JITCompiler jumps to the code in that memory block, which is the implementation of the WriteLine method. When that code returns, execution continues in Main as usual.
When WriteLine is called a second time, its entry already points to the memory block. Because the WriteLine code has already been verified and compiled, the code in the memory block runs directly and the JITCompiler function is skipped. After WriteLine finishes, it returns to Main, which continues executing.
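The "compile on first call, then patch the entry" behavior described above can be imitated in ordinary C#. This is only a simulation under stated assumptions: the real CLR patches native code addresses, whereas this sketch uses a dictionary of delegates. The names JitStubDemo, Slots, Register, Call, and CompileCount are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class JitStubDemo
{
    // Hypothetical stand-in for the per-type table: method name -> code to run.
    static readonly Dictionary<string, Action> Slots = new Dictionary<string, Action>();

    // Counts how many times the simulated compiler ran.
    public static int CompileCount;

    // Registers a method whose slot initially points at a compile-on-first-call stub.
    public static void Register(string name, Action body)
    {
        Slots[name] = delegate
        {
            CompileCount++;      // "compile" the method, which happens once per slot
            Slots[name] = body;  // patch the slot so it points at the "native code"
            body();              // jump to the freshly compiled code
        };
    }

    // Calls whatever the slot currently points at.
    public static void Call(string name)
    {
        Slots[name]();
    }

    public static void Main()
    {
        Register("WriteLine", delegate { Console.WriteLine("Hi"); });
        Call("WriteLine"); // first call goes through the stub
        Call("WriteLine"); // second call runs the body directly
        Console.WriteLine("compiled " + CompileCount + " time(s)");
    }
}
```

Running Main prints Hi twice, followed by "compiled 1 time(s)", because the stub replaces itself during the first call.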
7. Program exit
The JIT compiler stores the native CPU instructions in dynamic memory, so the compiled code is discarded as soon as the application terminates. If the application is run again later, or if two instances of it are started at the same time, the JIT compiler must compile the IL code into native instructions again in each process. This may significantly increase memory consumption. In general, however, the performance loss caused by the JIT compiler is not significant, because most applications call the same methods repeatedly. While the program is running, each method incurs the compilation cost only once, on its first call.
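One rough way to observe the one-time cost is to time the first and second calls of the same method. The sketch below uses System.Diagnostics.Stopwatch; FirstCallDemo, Work, and TimeOneCall are hypothetical names. The numbers are machine dependent and noisy, and a precompiled or tiered runtime may show little or no difference.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class FirstCallDemo
{
    // NoInlining keeps Work a separate method, so its first call can trigger its own compilation.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static int Work(int n)
    {
        int sum = 0;
        for (int i = 0; i < n; i++)
        {
            sum += i;
        }
        return sum;
    }

    // Returns the elapsed Stopwatch ticks for a single call to Work.
    public static long TimeOneCall()
    {
        Stopwatch watch = Stopwatch.StartNew();
        Work(1000);
        watch.Stop();
        return watch.ElapsedTicks;
    }

    public static void Main()
    {
        long first = TimeOneCall();  // includes any one-time compilation of Work
        long second = TimeOneCall(); // Work is already compiled
        Console.WriteLine("first call:  " + first + " ticks");
        Console.WriteLine("second call: " + second + " ticks");
    }
}
```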