Metadata and CLR (1)

Source: Internet
Author: User
Analysis tools: UltraEdit is used for the metadata; the SOS debugging extension is used for the memory layout, together with the Memory, Registers, and Disassembly windows of VS2005.

Step: Use UE (UltraEdit) to open any .NET DLL or EXE file and analyze the static metadata.

Then start a debugging session and analyze how the CLR executes, based on the SOS and debugger output. (The MethodTable layout changed greatly between 1.1 and 2.0, and I could not make sense of it. Is there any relevant documentation?)

The internal implementation of .NET statements can be seen through IL and metadata, while the implementation of the IL itself can only be seen through the disassembly.

Metadata and metadata table

Generally, the prefix "meta" is added to the name of something created to describe other things of the same kind. Metadata is data that describes other data. In .NET, the "other data" are .NET objects, the objects they reference, and the relationships among them.

Meta-metadata is data that describes metadata: it describes how the metadata is composed. With it, we can locate each record in a metadata table.

The logical structure of metadata is similar to a table in a database, so it is called a metadata table. A metadata table has columns and rows. Each row is unique and has a unique RID (row index). The RID is like the primary key of a database table. A database has a schema that gives, for example, the type and size of each column, and metadata is similar. Its schema is the "meta-metadata", which contains the record size of each table and the size and offset of each column.

Relatively fixed part of PE (32-bit System)

IMAGE_DOS_HEADER occupies 40h bytes, from 00h to 3Fh. Its last four bytes, starting at 3Ch, are e_lfanew, the file pointer to the PE signature (80h in this file). The PE signature is the DWORD 0x00004550, stored in the file as the bytes 50 45 00 00 ("PE\0\0"), and occupies four bytes. The real-mode stub program lies between the DOS header and the PE signature, from 40h to 7Fh. 80h-83h is the four-byte PE signature.

It is followed by IMAGE_FILE_HEADER (the COFF header) from 84h to 97h, 14h (20) bytes.

The PE header (optional header) runs from 98h to 177h, E0h (224) bytes.

In the PE header, at an offset of 96 bytes from its start at 98h (for a 32-bit PE header), is the array of IMAGE_DATA_DIRECTORY entries. There are 16 entries in total, each occupying eight bytes.
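The header positions listed here all follow from e_lfanew, so they can be computed rather than read off a hex dump. The following is a minimal sketch, not a complete PE parser. The names `read_u32_le`, `pe_offsets`, and `locate_pe_headers` are hypothetical, and the sketch assumes a PE32 image whose optional header is E0h bytes long, as in the file analyzed here. A more careful reader would take that size from SizeOfOptionalHeader in the COFF header.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* File offsets of the fixed PE32 headers (hypothetical struct for illustration). */
typedef struct {
    uint32_t pe_signature;    /* value of e_lfanew */
    uint32_t coff_header;     /* IMAGE_FILE_HEADER, 14h bytes */
    uint32_t optional_header; /* PE header, assumed to be E0h bytes (PE32) */
    uint32_t section_headers; /* first section header */
} pe_offsets;

/* Returns 0 on success, -1 if the buffer is too small or a signature is missing. */
static int locate_pe_headers(const uint8_t *image, size_t size, pe_offsets *out)
{
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;

    uint32_t lfanew = read_u32_le(image + 0x3C); /* e_lfanew lives at 3Ch */
    if (lfanew > size || size - lfanew < 4 ||
        memcmp(image + lfanew, "PE\0\0", 4) != 0)
        return -1;

    out->pe_signature = lfanew;
    out->coff_header = lfanew + 4;                      /* after "PE\0\0" */
    out->optional_header = out->coff_header + 0x14;     /* after COFF header */
    out->section_headers = out->optional_header + 0xE0; /* PE32 assumption */
    return 0;
}
```

With e_lfanew equal to 80h, this gives 84h for the COFF header, 98h for the PE header, and 178h for the first section header, matching the layout described above.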

It is followed by the section headers, each of which occupies 28h (40) bytes. There are three in total (the number is given by NumberOfSections in the COFF header):

178h-19Fh is .text

1A0h-1C7h is .rsrc

1C8h-1EFh is .reloc

 

File pointer and RVA

A file pointer is the offset of an item in the file before the file is loaded into memory. An RVA and a VA are, respectively, the relative address (offset from the image base) and the address of the item after the file is loaded into memory.

RID and Token

A RID is a row index into a metadata table and can only be used for references between metadata tables. For example, a TypeDef row stores the RID, in the Field table, of the first field that the type contains.

The column type code (defined by the meta-metadata) determines, implicitly, which table a column refers to. A RID together with its column therefore identifies the table as well, but a RID cannot be used outside the metadata.

A token is a RID combined with a table index: a four-byte value whose most significant byte is the table index and whose lower three bytes are the RID. A token states explicitly which table the RID belongs to, and can therefore be referenced from outside the metadata. In IL, items such as types, methods, fields, and string constants are referenced by token. There are 24 tables with a token type; the other 20 tables have none. Those tables cannot be referenced externally and can only be referenced from within the metadata.

There is one special token type, 0x70000000. The RID part of such a token is not a real RID, because the token refers to the #US stream. That stream contains user-defined data without any metadata information, and the metadata itself never references #US. The RID part is the offset of the user string in #US, so it locates the user-defined string directly. For an ordinary token, you must first find the row by its RID and then locate the data using the column types and other information about the metadata table.
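Splitting a token into its two parts is a matter of shifts and masks, since a token is a four-byte value with the table index in the most significant byte and the RID in the lower three bytes. The sketch below illustrates this; the function names are hypothetical and are not part of any metadata API.

```c
#include <stdint.h>

/* Table index: the most significant byte of the token. */
static uint8_t token_table(uint32_t token)
{
    return (uint8_t)(token >> 24);
}

/* RID: the low three bytes. For a 0x70 token this is an offset into #US. */
static uint32_t token_rid(uint32_t token)
{
    return token & 0x00FFFFFFu;
}

/* Combine a table index and a RID back into a token. */
static uint32_t make_token(uint8_t table, uint32_t rid)
{
    return ((uint32_t)table << 24) | (rid & 0x00FFFFFFu);
}
```

For example, the token 0x06000001 splits into table 0x06 (MethodDef) and RID 1, that is, the first row of the MethodDef table.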

Stream and heap

The heap here has nothing to do with the heap data structure; they are two separate concepts. Heaps can be divided into three kinds: string, GUID, and blob.

Metadata defines six named streams (heaps):

#Strings: the names referenced by the metadata, such as class names, method names, and field names.

#US: the user-defined blob heap (not a string heap). It holds string constants, which are addressed directly by the ldstr instruction. The metadata cannot reference it, but IL and external APIs can. For example, given a field declared as string s = "dd"; the literal dd is stored in #US, while the name s is stored in #Strings.

#Blob: binary objects referenced by the metadata; it does not contain user-defined objects.

#GUID: unique identifiers, such as the Mvid column of the Module metadata table.

#~ and #- (an image file can contain only one of the two): the metadata stream, which holds the metadata table header and the metadata tables. It is the most complex stream, and it references the #Strings, #Blob, and #GUID streams.

Flag and Signature

A flag field encodes visibility, layout, type semantics, implementation, string formatting, and other attributes. From the flags we can determine these properties of a type.

Signature: stored in the #Blob stream.

Locate Token

Take the User-Defined string as an example (# US ):

User strings are stored in the #US stream. To find one, start from the address of the metadata header (the stream headers are part of the metadata header), add the offset of the #US stream recorded in its stream header, and then add the offset of the string, which is the RID part of the token and gives the position of the string within the #US stream.
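The calculation is a sum of three offsets, which the sketch below writes out. The function name and parameters are hypothetical; the two stream-related inputs are assumed to have been read already from the file (the metadata header position and the #US offset from its stream header). Note that the entry found at the resulting position begins with a length prefix, followed by the characters in UTF-16.

```c
#include <stdint.h>

/*
 * Returns the file offset of the #US entry named by a user-string token,
 * or 0 if the token does not have the 0x70 token type.
 */
static uint32_t user_string_file_offset(uint32_t metadata_header_offset,
                                        uint32_t us_stream_offset,
                                        uint32_t token)
{
    if ((token >> 24) != 0x70)
        return 0;
    /* header position + stream offset + offset inside the stream */
    return metadata_header_offset + us_stream_offset + (token & 0x00FFFFFFu);
}
```

With a metadata header at file offset 2A0h and a #US stream offset of 400h (both made-up values), the token 0x70000001 resolves to file offset 6A1h.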

Locate the metadata table

First, find the metadata header (metaDataHeader). The method is the same as for locating the Cor2.0Header, except that the RVA used is that of the MetaDataDir entry in the Cor2.0Header. The metadata header consists of STORAGESIGNATURE and STORAGEHEADER, followed by the stream headers (STORAGESTREAM), one each for the #Strings, #Blob, #GUID, #US (user string), and #~ streams.

The metadata tables are stored in the #~ stream. Locate the metadata header, then add the offset of the #~ stream to it to locate the tables. For how to find the header, see Locate Cor2.0Header.
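Walking the stream headers to find the offset of a given stream can be sketched as follows. This assumes the layout given in ECMA-335 for the metadata header: a four-byte signature "BSJB", two 16-bit version numbers, a reserved DWORD, a DWORD version-string length followed by the version string, then a 16-bit flags field and a 16-bit stream count, then one header per stream (offset, size, and a null-terminated name padded to a four-byte boundary). The names `find_stream`, `rd16`, and `rd32` are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Looks up a stream by name in the metadata header that starts at `root`.
 * On success stores the stream's offset (relative to the metadata header)
 * and size, and returns 0. Returns -1 if not found or the data is malformed.
 */
static int find_stream(const uint8_t *root, size_t size, const char *name,
                       uint32_t *offset, uint32_t *stream_size)
{
    if (size < 16 || rd32(root) != 0x424A5342u) /* "BSJB" */
        return -1;

    uint32_t version_len = rd32(root + 12);
    if (version_len > size - 16 || size - 16 - version_len < 4)
        return -1;

    size_t pos = 16 + (size_t)version_len; /* flags (2 bytes), count (2 bytes) */
    uint16_t count = rd16(root + pos + 2);
    pos += 4;

    for (uint16_t i = 0; i < count; i++) {
        if (size - pos < 9)
            return -1;
        const uint8_t *nm = root + pos + 8;
        const uint8_t *end = memchr(nm, 0, size - pos - 8);
        if (end == NULL)
            return -1;
        size_t name_len = (size_t)(end - nm);
        if (strlen(name) == name_len && memcmp(nm, name, name_len) == 0) {
            *offset = rd32(root + pos);
            *stream_size = rd32(root + pos + 4);
            return 0;
        }
        /* skip offset, size, and the name padded to a multiple of 4 */
        pos += 8 + ((name_len + 1 + 3) & ~(size_t)3);
        if (pos > size)
            return -1;
    }
    return -1;
}
```

Calling `find_stream` with the name "#~" yields the offset that is added to the metadata header position to reach the metadata tables.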

Locate the section headers (Section Header)

There are three section headers, each of which occupies 28h (40) bytes (the number of headers is given by NumberOfSections in the COFF header):

178h-19Fh is .text

1A0h-1C7h is .rsrc

1C8h-1EFh is .reloc

Locate PE Header (data directory table)

The data directory table is located in the PE header. At an offset of 96 bytes from the start of the PE header at 98h (for a 32-bit PE header) is the array of IMAGE_DATA_DIRECTORY entries. There are 16 entries in total, each occupying eight bytes.

typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD   VirtualAddress;
    DWORD   Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

 

Among the three sections above, the one that contains the data pointed to by a data directory entry can be determined from the VirtualAddress (RVA) in IMAGE_DATA_DIRECTORY: the RVA falls in the section for which VirtualAddress <= rva < VirtualAddress + SizeOfRawData.

Of the sixteen entries, the fifteenth, the CLI header, is the one most closely related to the CLI. The method above can be used to determine the section in which the CLI header is located.
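The file offset of any data directory entry follows from the numbers already given: the PE header start, the 96-byte offset to the directory array, and eight bytes per entry. A minimal sketch, assuming a 32-bit PE header; the constant and function names are hypothetical.

```c
#include <stdint.h>

enum {
    DATA_DIRECTORY_OFFSET_PE32 = 96, /* offset of the array in the PE header */
    DATA_DIRECTORY_ENTRY_SIZE = 8,   /* VirtualAddress + Size */
    CLI_HEADER_DIRECTORY_INDEX = 14  /* fifteenth entry, counting from zero */
};

/* File offset of data directory entry `index` for a 32-bit PE header. */
static uint32_t data_directory_entry_offset(uint32_t pe_header_offset,
                                            unsigned index)
{
    return pe_header_offset + DATA_DIRECTORY_OFFSET_PE32 +
           (uint32_t)index * DATA_DIRECTORY_ENTRY_SIZE;
}
```

With the PE header at 98h, the first entry is at F8h and the CLI header entry is at 168h; the DWORD at 168h is the RVA of the CLI header and the DWORD at 16Ch is its size.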

 

Locate Cor2.0Header

Once you know the section, you can locate the Cor2.0Header. According to the calculation, the section is .text, and the address may differ from one compilation to the next. From the RVA of the CLI header and the VirtualAddress and PointerToRawData of that section, the file address of the Cor2.0Header can be calculated as (rva - VirtualAddress) + PointerToRawData.

The Cor header contains seven tables (directory entries), and the last one is reserved (in 2.0 I found only the first six, and 2.0 still does not use this table). The important one is the MetaDataDir table, whose address can be calculated in the same way.
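The section lookup and the address formula can be combined into one small routine. The sketch below uses a hypothetical `section_info` struct that carries only the three section header fields the calculation needs; the sample values in the usage note are made up for illustration and are not taken from a real file.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical subset of a section header: only the fields used here. */
typedef struct {
    uint32_t virtual_address;     /* VirtualAddress (RVA of the section) */
    uint32_t size_of_raw_data;    /* SizeOfRawData */
    uint32_t pointer_to_raw_data; /* PointerToRawData (file offset) */
} section_info;

/*
 * Finds the section that contains `rva` and converts the RVA to a file
 * offset with (rva - VirtualAddress) + PointerToRawData.
 * Returns 0 on success, -1 if no section contains the RVA.
 */
static int rva_to_file_offset(const section_info *sections, size_t count,
                              uint32_t rva, uint32_t *file_offset)
{
    for (size_t i = 0; i < count; i++) {
        const section_info *s = &sections[i];
        if (rva >= s->virtual_address &&
            rva - s->virtual_address < s->size_of_raw_data) {
            *file_offset = rva - s->virtual_address + s->pointer_to_raw_data;
            return 0;
        }
    }
    return -1;
}
```

For a .text section with VirtualAddress 2000h and PointerToRawData 200h, a CLI header RVA of 2008h maps to file offset 208h. The same routine applies to the RVA of the MetaDataDir table.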

 
