Understanding metadata and IL (I) <Article 3>

Source: Internet
Author: User
Tags mscorlib hosting

Preface

I have been meaning to write about metadata and IL (Intermediate Language) for a long time. Starting with this article, I will settle down and devote myself to these two brothers. It may not be as quick a read as the first article of the series, on "is" and "as", but metadata and IL are heavyweight topics that deserve our attention at any time. This article is the beginning.

1 Introduction

Have you ever wondered what exactly happens to our C# code after compilation? Have you ever wondered what path an executable follows when we run it? This article gradually answers these questions by exploring metadata and IL (Intermediate Language). Along the way we will meet a series of mysterious guests: metadata, IL, assemblies, application domains, JIT compilation, virtual dispatch, method tables, and the managed heap. As part of version 2.0 of the "You Must Know .NET" series, this article starts with the two heavyweights, metadata and IL; the other guests will make their debut soon.

2 First Contact

In fact, compiled C# code is organized into two basic elements: metadata and IL. The simplest way to uncover the metadata and IL inside an assembly (*.dll) or an executable file (*.exe) is to decompile it. Open ildasm and load the assembly prepared for this experiment to see the contents of the managed PE file:


For the detailed structure and an analysis of the IL code, see Chapter 3 of "You Must Know .NET", "Everything Starts with IL". We will not analyze it further here. In addition, you can choose View > MetaInfo > Show! from the menu, or press Ctrl+M, to list the metadata used by the assembly:


The assembly uses metadata tables such as Module, TypeRef, TypeDef, Method, Param, MemberRef, CustomAttribute, Assembly, and AssemblyRef, as well as the #Strings, #GUID, #Blob, and #US heaps.
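The link between these tables and ordinary code can be seen through reflection. The following is a minimal sketch, not taken from the original article: the Sample class and the MetadataTokenDemo helper are hypothetical names used only for illustration. It relies on the fact that the top byte of a metadata token is the index of the table it refers to (0x02 for TypeDef, 0x06 for MethodDef, 0x17 for Property), while the low three bytes are the row number in that table.

```csharp
using System;
using System.Reflection;

// Hypothetical sample type, loosely mirroring the classes used in this article.
public class Sample
{
    public int ID { get; set; }
    public string SayHello() { return "Hello, world."; }
}

public static class MetadataTokenDemo
{
    // The top byte of a metadata token is the index of the metadata table.
    public static int TableOf(MemberInfo member)
    {
        return (int)((uint)member.MetadataToken >> 24);
    }

    // The low three bytes are the row number (RID) inside that table.
    public static int RowOf(MemberInfo member)
    {
        return member.MetadataToken & 0x00FFFFFF;
    }

    public static void Main()
    {
        Type type = typeof(Sample);
        MethodInfo method = type.GetMethod("SayHello");
        PropertyInfo property = type.GetProperty("ID");

        // Expected table indexes: TypeDef 0x02, MethodDef 0x06, Property 0x17.
        Console.WriteLine("type:     table 0x{0:X2}, row {1}", TableOf(type), RowOf(type));
        Console.WriteLine("method:   table 0x{0:X2}, row {1}", TableOf(method), RowOf(method));
        Console.WriteLine("property: table 0x{0:X2}, row {1}", TableOf(property), RowOf(property));
    }
}
```

The row numbers depend on how the compiler lays out the assembly, so only the table indexes should be treated as stable.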

Of course, there are many other interesting ways to use ildasm to satisfy our curiosity about IL code, such as:

ildasm anytao.insidenet.metadatail.exe /output:my.il exports the decompilation result as IL code, generating my.il, which contains all the IL code, and my.res, which contains all the resources.

ildasm anytao.insidenet.metadatail.exe /text prints the decompilation result to the console.

That said, we recommend viewing the IL details in the GUI, with its well-organized class view:

ildasm anytao.insidenet.metadatail.exe

 

The following are the code files involved in the compilation; after that we discuss metadata and IL:

// Release : code01, 2009/02/12
// Author  : Anytao, http://www.anytao.com
// List    : One.cs
public class One
{
     public int ID { get; set; }
}

// Release : code02, 2009/02/12
// Author  : Anytao, http://www.anytao.com
// List    : Two.cs
public class Two
{
     public string SayHello()
     {
         return "Hello, world.";
     }
}

// Release : code03, 2009/02/12
// Author  : Anytao, http://www.anytao.com
// List    : Program.cs
using System; // needed for Console

class Program
{
     static void Main(string[] args)
     {
         int id = 1;
         One one = new One();
         one.ID = id;
         Two two = new Two();
         Console.WriteLine(two.SayHello());
     }
}

Next, we walk through compiling and running the above program, using the command-line compiler to reproduce the general compilation process step by step and to better understand the relationship between managed modules, assemblies, and executable files:

Open the Visual Studio 2008 Command Prompt, change to the folder containing the C# source files, and compile One.cs into a managed module:

csc /t:module One.cs

This generates a file named One.netmodule.

Next, package the module together with Two.cs into an assembly:

csc /t:library /addmodule:One.netmodule Two.cs

This generates Two.dll.

Finally, compile Program.cs, referencing Two.dll, into an executable file:

csc /out:anytao.insidenet.metadatail.exe /t:exe /r:Two.dll /r:mscorlib.dll Program.cs

This produces anytao.insidenet.metadatail.exe, the program file we decompiled at the beginning of this article. The options in this command mean the following:

/out:anytao.insidenet.metadatail.exe specifies the name of the output executable file.

/t:exe specifies that the output is a CUI (console) program; /t:winexe would specify a GUI program instead.

/r:Two.dll references the Two.dll assembly we have just produced.

/r:mscorlib.dll references mscorlib.dll as an external assembly, because our program calls a static method of the Console class, which is defined in mscorlib.dll. mscorlib.dll is so important that we will meet it again some time after this article and analyze it in detail, so stay tuned.
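The effect of the /r options can be observed from inside a running program. The sketch below is an illustration only (ReferenceDemo is a hypothetical name, not part of the original article): it asks reflection which assembly defines Console and lists the assembly references recorded for the running program.

```csharp
using System;
using System.Reflection;

public static class ReferenceDemo
{
    // Returns the simple name of the assembly that defines the given type.
    public static string DefiningAssembly(Type type)
    {
        return type.Assembly.GetName().Name;
    }

    public static void Main()
    {
        // On the .NET Framework, Console is defined in mscorlib. Newer runtimes
        // split the base library, so the name printed here may differ.
        Console.WriteLine("Console is defined in: " + DefiningAssembly(typeof(Console)));

        // These names come from the AssemblyRef metadata of the running assembly.
        foreach (AssemblyName reference in Assembly.GetExecutingAssembly().GetReferencedAssemblies())
        {
            Console.WriteLine("references: " + reference.FullName);
        }
    }
}
```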

The execution process in cmd is shown below:


 

By executing these steps one by one, we gain a basic understanding of how the compiler works, and we can see in miniature what happens each time we run Build or Rebuild in Visual Studio. From the above analysis, we can see the following:


Note: Visual Studio also compiles in units of modules. The intermediate results are saved in the obj directory and then merged into the executable file in the bin directory. By default, compilation is incremental: only modules that have been modified are recompiled. I will describe the process in detail later.

At the same time, we can draw the following basic conclusions:

After C# code is compiled, metadata and IL are generated, and together they form the basic unit of a module.

An assembly is composed of one or more managed modules and may also contain resource files, although none appear in this example.

An assembly or executable file is the basic unit of logical organization. It conforms to the standard Windows PE file format and can be loaded and executed directly by x86 or x64 Windows.
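These conclusions can be checked at run time as well. The following is a small sketch under the assumption that reflection is an acceptable way to inspect the layout (AssemblyLayoutDemo is a hypothetical name): it lists the modules and manifest resources that make up the running assembly.

```csharp
using System;
using System.Reflection;

public static class AssemblyLayoutDemo
{
    // Number of modules that make up the given assembly.
    public static int CountModules(Assembly assembly)
    {
        return assembly.GetModules().Length;
    }

    public static void Main()
    {
        Assembly assembly = Assembly.GetExecutingAssembly();

        // Every assembly has at least the module that holds its manifest.
        Console.WriteLine("manifest module: " + assembly.ManifestModule.Name);

        foreach (Module module in assembly.GetModules())
        {
            Console.WriteLine("module: " + module.Name);
        }

        // Embedded resources, if any, are listed in the manifest.
        foreach (string resource in assembly.GetManifestResourceNames())
        {
            Console.WriteLine("resource: " + resource);
        }
    }
}
```

For an assembly built with /addmodule, as in the steps above, the module list would be expected to include the added .netmodule file; a typical single-file assembly reports just one module.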

3 Continue

One or more modules, together with resource files, form an assembly, the basic unit of logical organization.


 

This diagram only gives a coarse-grained view of the basic components of an assembly. In reality, an assembly contains more complex structures and elements, such as the PE signature, managed resources, and the strong-name signature hash. Its core elements are the following:

Assembly manifest: contains the assembly's self-describing information, including AssemblyDef, FileDef, ManifestResourceDef, and ExportedTypeDef. In the ildasm view, the MANIFEST node shows the details. This is covered in section 3.1 of "You Must Know .NET", "Getting to Know IL Through Hello, World", so I will not go into details here.

PE file header: the standard Windows PE header (PE32 or PE32+), which holds basic information about the PE file, such as the file type, creation time, and CPU information.

CLR header: includes the CLR version, module metadata, and resources.

Resource files.

Choose View > Statistics from the menu to view the relevant statistics:
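To make the PE file header less abstract, here is a minimal sketch that reads a few of its fields by hand. It is an illustration only: PeHeaderDemo and CoffInfo are hypothetical names, and the code covers just the DOS signature, the PE signature, and the first three fields of the COFF header (machine type, number of sections, and time stamp), not the optional header or the CLR header.

```csharp
using System;
using System.IO;

public static class PeHeaderDemo
{
    // First three fields of the COFF file header that follows the PE signature.
    public struct CoffInfo
    {
        public ushort Machine;
        public ushort NumberOfSections;
        public uint TimeDateStamp;
    }

    public static CoffInfo ReadCoffHeader(Stream stream)
    {
        BinaryReader reader = new BinaryReader(stream);

        // The DOS header starts with the bytes "MZ" (0x5A4D as a little-endian word).
        stream.Position = 0;
        if (reader.ReadUInt16() != 0x5A4D)
            throw new InvalidDataException("Missing MZ signature.");

        // Offset 0x3C holds the file offset of the PE signature.
        stream.Position = 0x3C;
        uint peOffset = reader.ReadUInt32();

        // The PE signature is the four bytes "PE\0\0".
        stream.Position = peOffset;
        if (reader.ReadUInt32() != 0x00004550)
            throw new InvalidDataException("Missing PE signature.");

        CoffInfo info;
        info.Machine = reader.ReadUInt16();          // e.g. 0x014C for x86
        info.NumberOfSections = reader.ReadUInt16(); // .text, .rsrc, .reloc, ...
        info.TimeDateStamp = reader.ReadUInt32();    // raw value as stored in the file
        return info;
    }

    public static void Main(string[] args)
    {
        // Pass a path to any .exe or .dll; by default inspect the core library.
        string path = args.Length > 0 ? args[0] : typeof(object).Assembly.Location;
        using (FileStream file = File.OpenRead(path))
        {
            CoffInfo info = ReadCoffHeader(file);
            Console.WriteLine("Machine:          0x{0:X4}", info.Machine);
            Console.WriteLine("NumberOfSections: {0}", info.NumberOfSections);
            Console.WriteLine("TimeDateStamp:    0x{0:X8}", info.TimeDateStamp);
        }
    }
}
```

Because ReadCoffHeader takes a Stream rather than a path, it can be tried against an in-memory buffer as easily as against a real file.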

File size            : 5632
  PE header size       : 512 (496 used)    ( 9.09%)
  PE additional info   : 1691              (30.02%)
  Num.of PE sections   : 3
  CLR header size     : 72                 ( 1.28%)
  CLR meta-data size  : 2212               (39.28%)
  CLR additional info : 0                  ( 0.00%)
  CLR method headers  : 52                 ( 0.92%)
  Managed code         : 287               ( 5.10%)
  Data                 : 2048              (36.36%)
  Unaccounted          : -1242             (-22.05%)

  Num.of PE sections   : 3
    .text    - 3072
    .rsrc    - 1536
    .reloc   - 512

  CLR meta-data size  : 2212
    Module        -    1 (10 bytes)
    TypeDef       -    4 (56 bytes)      0 interfaces, 0 explicit layout
    TypeRef       -   25 (150 bytes)
    MethodDef     -    8 (112 bytes)     0 abstract, 0 native, 8 bodies
    FieldDef      -    1 (6 bytes)       0 constant
    MemberRef     -   29 (174 bytes)
    ParamDef      -    2 (12 bytes)
    CustomAttribute-   16 (96 bytes)
    StandAloneSig -    4 (8 bytes)
    PropertyMap   -    1 (4 bytes)
    Property      -    1 (6 bytes)
    MethodSemantic-    2 (12 bytes)
    Assembly      -    1 (22 bytes)
    AssemblyRef   -    1 (20 bytes)
    Strings       -   920 bytes
    Blobs         -   328 bytes
    UserStrings   -    68 bytes
    Guids         -    16 bytes
    Uncategorized -   192 bytes

  CLR method headers : 52
    Num.of method bodies  - 8
    Num.of fat headers    - 4
    Num.of tiny headers   - 4

  Managed code : 287
    Ave method size - 35

 

The PE header, CLR header, and resource files will be discussed in detail in a later article that looks at assemblies and modules in depth.

The IL code is organized as follows:

.class public auto ansi beforefieldinit Anytao.Insidenet.MetadataIL.Two
        extends [mscorlib]System.Object
     {
       .method public hidebysig instance string
               SayHello() cil managed
       {
         // Code size       11 (0xb)
         .maxstack  1
         .locals init ([0] string CS$1$0000)
         IL_0000:  nop
         IL_0001:  ldstr      "Hello, world."
         IL_0006:  stloc.0
         IL_0007:  br.s       IL_0009

         IL_0009:  ldloc.0
         IL_000a:  ret
       } // end of method Two::SayHello

       .method public hidebysig specialname rtspecialname 
               instance void  .ctor() cil managed
       {
         // Code size       7 (0x7)
         .maxstack  8
         IL_0000:  ldarg.0
         IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
         IL_0006:  ret
       } // end of method Two::.ctor

     } // end of class Anytao.Insidenet.MetadataIL.Two

Although it is wrapped in an assembly-like style, we can still see familiar faces from object-oriented high-level languages: class, System.Object, method, public, and string. The difference is that there are also many unfamiliar directives and instructions, such as beforefieldinit (see article 23 of the "You Must Know .NET" series, on the .NET type constructor), ret, maxstack, ldstr, and stloc. IL is no freak, however: it is precisely this object-oriented, assembly-like style that makes IL a true "intermediate language". At run time, the CLR's JIT compiler converts the IL code into native code. We will continue to analyze the ins and outs of this process in the next section.
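One way to get a feel for these instructions is to emit them by hand. The sketch below is not from the original article (EmitDemo and BuildSayHello are hypothetical names): it uses System.Reflection.Emit to build a method whose body is just ldstr followed by ret, the core of the SayHello listing above without the debug-build nop, stloc, and ldloc steps, and lets the JIT compiler turn it into native code when the delegate is first invoked.

```csharp
using System;
using System.Reflection.Emit;

public static class EmitDemo
{
    // Builds a parameterless method that returns a string, from raw IL.
    public static Func<string> BuildSayHello()
    {
        DynamicMethod method = new DynamicMethod("SayHello", typeof(string), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Ldstr, "Hello, world."); // push the string reference onto the stack
        il.Emit(OpCodes.Ret);                    // return the value on top of the stack

        return (Func<string>)method.CreateDelegate(typeof(Func<string>));
    }

    public static void Main()
    {
        Func<string> sayHello = BuildSayHello();
        Console.WriteLine(sayHello()); // prints "Hello, world."
    }
}
```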
