- Layout of the entire file
- File header
- Index Area
- String_ids
- Type_ids
- Proto_ids
- Field_ids
- Method_ids
- Data area
- Appendix
- The Java code for test Dex
- Resources
Layout of the entire file
The entire Dex file can be divided into three parts: the file header, the index area, and the data area.
- File header
Records an overview of the Dex file, including the file size, the checksum, and the offset and size of each of the other areas.
- Index Area
Holds the indexes for string constants, types, method prototypes, fields, and methods.
- Data area
Contains class_defs, the class definition area that records information about each class, and data, which holds the actual content (strings, code, and so on). link_data is the link data area, mainly related to dependent libraries.
With 010 Editor and its official Dex template, you can view the contents of a Dex file easily and intuitively.
File header
| Offset (hex) | Field name | Size (bytes) | Description |
|---|---|---|---|
| 0 | magic[8] | 8 | Magic number that identifies a Dex file; its content is "dex\n035\0" |
| 8 | checksum | 4 | File checksum (Adler-32) |
| C | signature[20] | 20 | SHA-1 signature |
| 20 | file_size | 4 | Total length of the Dex file |
| 24 | header_size | 4 | Size of the file header, generally fixed at 0x70 |
| 28 | endian_tag | 4 | Endianness flag; generally fixed at 0x12345678, which marks the file as little-endian |
| 2C | link_size | 4 | Size of the link segment |
| 30 | link_off | 4 | Base address of the link segment |
| 34 | map_off | 4 | Base address of the map item data |
| 38 | string_ids_size | 4 | Number of entries in the string constant list |
| 3C | string_ids_off | 4 | Base address of the string constant list |
| 40 | type_ids_size | 4 | Number of types |
| 44 | type_ids_off | 4 | Base address of the type list |
| 48 | proto_ids_size | 4 | Number of method prototypes |
| 4C | proto_ids_off | 4 | Base address of the method prototype list |
| 50 | field_ids_size | 4 | Number of fields |
| 54 | field_ids_off | 4 | Base address of the field list |
| 58 | method_ids_size | 4 | Number of methods |
| 5C | method_ids_off | 4 | Base address of the method list |
| 60 | class_defs_size | 4 | Number of class_def entries |
| 64 | class_defs_off | 4 | Base address of the class_def list |
| 68 | data_size | 4 | Size of the data segment |
| 6C | data_off | 4 | Base address of the data segment |
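As a rough illustration of how the header fields can be read, the sketch below pulls a few of them out of a byte buffer using the offsets from the table. The names `read_u4_le`, `dex_header_summary`, and `dex_read_header` are hypothetical helpers made up for this example, not part of libdex, and only the `dex\n` prefix of the magic is checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a little-endian u4 at byte offset off. */
static uint32_t read_u4_le(const uint8_t *p, size_t off) {
    return (uint32_t)p[off]
         | ((uint32_t)p[off + 1] << 8)
         | ((uint32_t)p[off + 2] << 16)
         | ((uint32_t)p[off + 3] << 24);
}

/* Hypothetical subset of the header, named after the table rows. */
struct dex_header_summary {
    uint32_t file_size;        /* offset 0x20 */
    uint32_t header_size;      /* offset 0x24 */
    uint32_t endian_tag;       /* offset 0x28 */
    uint32_t string_ids_size;  /* offset 0x38 */
    uint32_t string_ids_off;   /* offset 0x3C */
};

/* Returns 0 on success, -1 if the buffer does not look like a
 * little-endian Dex header. */
static int dex_read_header(const uint8_t *buf, size_t len,
                           struct dex_header_summary *out) {
    assert(buf != NULL && out != NULL);
    if (len < 0x70)
        return -1;
    /* magic is "dex\n", a three-digit version, then '\0' */
    if (memcmp(buf, "dex\n", 4) != 0 || buf[7] != 0)
        return -1;
    out->file_size       = read_u4_le(buf, 0x20);
    out->header_size     = read_u4_le(buf, 0x24);
    out->endian_tag      = read_u4_le(buf, 0x28);
    out->string_ids_size = read_u4_le(buf, 0x38);
    out->string_ids_off  = read_u4_le(buf, 0x3C);
    if (out->endian_tag != 0x12345678u)
        return -1;
    return 0;
}
```

The remaining `*_size` and `*_off` pairs can be read the same way from their offsets in the table.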
Index Area
String_ids
This area stores the index information for string constants. String constants are not only the strings defined in code; they also include all class names, method names, type names, and so on.
Each entry in this area stores a base address, that is, the offset of the string's actual content within the Dex file (in the data area). The Dalvik virtual machine reads each entry into the following data structure:
    // /android4.0.4/dalvik/libdex/DexFile.h
    /*
     * Direct-mapped "string_id_item".
     */
    struct DexStringId {
        u4 stringDataOff;   /* file offset to string_data_item */
    };
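A string constant is read from the offset stringDataOff. The string_data_item stored there has the following layout:
    struct string_data_item {
        uleb128 utf16_size;  /* string length, in UTF-16 code units */
        ubyte   data[];      /* string contents, MUTF-8 encoded and NUL-terminated */
    };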
The string_ids area of a test Dex file can be inspected entry by entry in 010 Editor.
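To make the lookup concrete, here is a minimal sketch of resolving string_ids[idx] to its bytes: read the 4-byte offset from the index area, then decode the ULEB128 length that starts the string_data_item. The helper names `read_u4_le`, `read_uleb128`, and `dex_string_at` are invented for this example, and the code assumes a well-formed file (it does no bounds checking).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a little-endian u4 at byte offset off. */
static uint32_t read_u4_le(const uint8_t *p, size_t off) {
    return (uint32_t)p[off]
         | ((uint32_t)p[off + 1] << 8)
         | ((uint32_t)p[off + 2] << 16)
         | ((uint32_t)p[off + 3] << 24);
}

/* Decode one ULEB128 value starting at *pp and advance *pp past it.
 * Each byte carries 7 payload bits; the high bit means "more follows". */
static uint32_t read_uleb128(const uint8_t **pp) {
    assert(pp != NULL && *pp != NULL);
    const uint8_t *p = *pp;
    uint32_t result = 0;
    int shift = 0;
    uint8_t byte;
    do {
        byte = *p++;
        result |= (uint32_t)(byte & 0x7f) << shift;
        shift += 7;
    } while ((byte & 0x80) && shift < 35);  /* at most 5 bytes for a u4 */
    *pp = p;
    return result;
}

/* Return a pointer to the MUTF-8 bytes of string_ids[idx] and store the
 * length (in UTF-16 code units) in *utf16_len. No bounds checking. */
static const char *dex_string_at(const uint8_t *file, uint32_t string_ids_off,
                                 uint32_t idx, uint32_t *utf16_len) {
    uint32_t data_off = read_u4_le(file, string_ids_off + (size_t)idx * 4);
    const uint8_t *p = file + data_off;
    *utf16_len = read_uleb128(&p);
    return (const char *)p;
}
```

A real parser would also validate every offset against file_size before dereferencing it.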
Type_ids
This area stores index information for all data types in the Dex file, including class types, array types, and primitive types. Its data structure is
    /*
     * Direct-mapped "type_id_item".
     */
    struct DexTypeId {
        u4 descriptorIdx;   /* index into stringIds list for type descriptor */
    };
Unlike the entries in string_ids, descriptorIdx is not an offset address but an index into string_ids. For example, descriptorIdx = 8 refers to string_ids[8] = "V", so the type is void. As another example, the file header gives the type_ids base address as A8h and its size as 7; the first 4 bytes at A8h are 0x0003, that is, string_ids[3], which is "LFoo;" and represents the class type Foo.
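The three kinds of type can be told apart from the first character of the descriptor string. The sketch below is one way to do that; `type_kind` and `classify_descriptor` are hypothetical names for this example, and the check is deliberately shallow (it does not validate the element type of an array or the characters of a class name).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum type_kind { TYPE_PRIMITIVE, TYPE_CLASS, TYPE_ARRAY, TYPE_UNKNOWN };

/* Classify a type descriptor by its leading character:
 *   V Z B S C I J F D  -> primitive (void, boolean, byte, ...)
 *   L...;              -> class type
 *   [...               -> array type */
static enum type_kind classify_descriptor(const char *d) {
    assert(d != NULL);
    size_t n = strlen(d);
    if (n == 0)
        return TYPE_UNKNOWN;
    switch (d[0]) {
    case 'V': case 'Z': case 'B': case 'S': case 'C':
    case 'I': case 'J': case 'F': case 'D':
        return n == 1 ? TYPE_PRIMITIVE : TYPE_UNKNOWN;
    case 'L':
        /* a class descriptor must end with ';' */
        return (n >= 3 && d[n - 1] == ';') ? TYPE_CLASS : TYPE_UNKNOWN;
    case '[':
        return n >= 2 ? TYPE_ARRAY : TYPE_UNKNOWN;
    default:
        return TYPE_UNKNOWN;
    }
}
```

For instance, `classify_descriptor("[Ljava/lang/String;")` reports an array type, which is the parameter type of `main` in the test class.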
Proto_ids
A proto is the prototype of a method, describing its input and output parameters (the parameter types and the return type). Each entry is 12 bytes. Its data structure is
    /*
     * Direct-mapped "proto_id_item".
     */
    struct DexProtoId {
        u4 shortyIdx;       /* index into stringIds for shorty descriptor */
        u4 returnTypeIdx;   /* index into typeIds list for return type */
        u4 parametersOff;   /* file offset to type_list for parameter types */
    };
- shortyIdx
Like the entries in type_ids, its value is an index into string_ids. The string it refers to is a short-form description of the prototype.
- returnTypeIdx
The return type; its value is an index into type_ids.
- parametersOff
Base address (file offset) of the parameter type list; 0 if the method takes no parameters.
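The short-form ("shorty") string puts the return type first and then one character per parameter, with every class or array type collapsed to `L`. A small sketch of reading one, using the hypothetical helpers `shorty_param_count` and `shorty_type_name`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of parameters described by a shorty string: every character
 * after the first (the return type) is one parameter. */
static size_t shorty_param_count(const char *shorty) {
    assert(shorty != NULL && shorty[0] != '\0');
    return strlen(shorty) - 1;
}

/* Human-readable name for one shorty character. */
static const char *shorty_type_name(char c) {
    switch (c) {
    case 'V': return "void";
    case 'Z': return "boolean";
    case 'B': return "byte";
    case 'S': return "short";
    case 'C': return "char";
    case 'I': return "int";
    case 'J': return "long";
    case 'F': return "float";
    case 'D': return "double";
    case 'L': return "reference";  /* any class or array type */
    default:  return "unknown";
    }
}
```

A method such as `void main(String[] args)` has the shorty `VL`: it returns void and takes one reference parameter. The exact parameter types still have to be read from the type_list at parametersOff.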
Field_ids
This area holds the indexes of all fields referenced by the Dex file (that is, class attributes, including static ones). Each entry is 8 bytes. Its data structure is
    struct DexFieldId {
        u2 classIdx;        /* index into typeIds list for defining class */
        u2 typeIdx;         /* index into typeIds for field type */
        u4 nameIdx;         /* index into stringIds for field name */
    };
- classIdx
The class to which the field belongs; its value is an index into type_ids.
- typeIdx
The type of the field; its value is also an index into type_ids.
- nameIdx
The name of the field; its value is an index into string_ids.
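Because the entry is two u2 values followed by one u4, entry idx starts at field_ids_off + 8 * idx. The sketch below reads one entry from a byte buffer; `field_id` and `read_field_id` are hypothetical names for this example, and no bounds checking is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one 8-byte field_id_item. */
struct field_id {
    uint16_t class_idx;  /* index into type_ids */
    uint16_t type_idx;   /* index into type_ids */
    uint32_t name_idx;   /* index into string_ids */
};

/* Read field_ids[idx] from a little-endian Dex image. */
static struct field_id read_field_id(const uint8_t *file,
                                     uint32_t field_ids_off, uint32_t idx) {
    assert(file != NULL);
    const uint8_t *p = file + field_ids_off + (size_t)idx * 8;
    struct field_id f;
    f.class_idx = (uint16_t)(p[0] | (p[1] << 8));
    f.type_idx  = (uint16_t)(p[2] | (p[3] << 8));
    f.name_idx  = (uint32_t)p[4]
                | ((uint32_t)p[5] << 8)
                | ((uint32_t)p[6] << 16)
                | ((uint32_t)p[7] << 24);
    return f;
}
```

A method_id_item has the same u2, u2, u4 shape, so the same arithmetic applies to method_ids with protoIdx in place of typeIdx.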
Method_ids
This area holds the indexes of all methods referenced by the Dex file. Its format is similar to that of field_ids, and each entry is 8 bytes. Its data structure is
    /*
     * Direct-mapped "method_id_item".
     */
    struct DexMethodId {
        u2 classIdx;        /* index into typeIds list for defining class */
        u2 protoIdx;        /* index into protoIds for method prototype */
        u4 nameIdx;         /* index into stringIds for method name */
    };
- classIdx
The class to which the method belongs; its value is an index into type_ids.
- protoIdx
The prototype of the method; its value is an index into proto_ids.
- nameIdx
The name of the method; its value is an index into string_ids.
Data area
Class_def
Stores information about each class in the Dex file. Each entry is 32 bytes, with the following data structure:
    /*
     * Direct-mapped "class_def_item".
     */
    struct DexClassDef {
        u4 classIdx;         /* index into typeIds for this class */
        u4 accessFlags;
        u4 superclassIdx;    /* index into typeIds for superclass */
        u4 interfacesOff;    /* file offset to DexTypeList */
        u4 sourceFileIdx;    /* index into stringIds for source file name */
        u4 annotationsOff;   /* file offset to annotations_directory_item */
        u4 classDataOff;     /* file offset to class_data_item */
        u4 staticValuesOff;  /* file offset to DexEncodedArray */
    };
- classIdx
The type of this class; its value is an index into type_ids.
- accessFlags
Access flags, indicating public, private, and so on.
- superclassIdx
The type of the parent class; its value is an index into type_ids.
- interfacesOff
The offset address of a DexTypeList that lists the interfaces implemented by this class; 0 if there are none.
- sourceFileIdx
The source file name; its value is an index into string_ids.
- annotationsOff
The offset address of the annotations_directory_item, in the data area, which describes the annotations of this class; 0 if there are none.
- classDataOff
The offset address of the class_data_item, in the data area, which holds the details of the class, including its fields, methods, and code. It is described in detail later.
- staticValuesOff
An offset address that points to the initial values of the class's static fields.
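accessFlags is a bit mask, so decoding it is a matter of testing individual bits. The sketch below covers only a handful of the common flag values (public 0x1, private 0x2, protected 0x4, static 0x8, final 0x10, interface 0x200, abstract 0x400); `flag_name`, `kFlagNames`, and `access_flags_to_string` are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct flag_name {
    uint32_t bit;
    const char *name;
};

/* A partial table of access flag bits; not the complete list. */
static const struct flag_name kFlagNames[] = {
    { 0x0001, "public" },    { 0x0002, "private" },
    { 0x0004, "protected" }, { 0x0008, "static" },
    { 0x0010, "final" },     { 0x0200, "interface" },
    { 0x0400, "abstract" },
};

/* Write the names of the set flags into out, separated by spaces. */
static void access_flags_to_string(uint32_t flags, char *out, size_t out_size) {
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof kFlagNames / sizeof kFlagNames[0]; i++) {
        if (flags & kFlagNames[i].bit) {
            size_t used = strlen(out);
            /* snprintf truncates safely if the buffer is too small */
            snprintf(out + used, out_size - used, "%s%s",
                     used ? " " : "", kFlagNames[i].name);
        }
    }
}
```

With this table, a value of 0x9 decodes as `public static`, the combination used by a `public static void main` method.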
Class_data_item
The classDataOff field of DexClassDef gives the address of the class_data_item in the Dex file. The data structure it is read into in memory is DexClassData, declared in DexClass.h:
    /* expanded form of class_data_item. Note: If a particular item is
     * absent (e.g., no static fields), then the corresponding pointer
     * is set to NULL. */
    struct DexClassData {
        DexClassDataHeader header;      /* sizes of staticFields, instanceFields, directMethods, virtualMethods */
        DexField*   staticFields;       /* static fields of the class */
        DexField*   instanceFields;     /* instance fields of the class */
        DexMethod*  directMethods;      /* direct methods of the class */
        DexMethod*  virtualMethods;     /* virtual methods of the class */
    };
The DexClassDataHeader, DexField, and DexMethod data structures are as follows
    /* expanded form of a class_data_item header */
    struct DexClassDataHeader {
        u4 staticFieldsSize;
        u4 instanceFieldsSize;
        u4 directMethodsSize;
        u4 virtualMethodsSize;
    };

    /* expanded form of encoded_field */
    struct DexField {
        u4 fieldIdx;      /* index to a field_id_item */
        u4 accessFlags;
    };

    /* expanded form of encoded_method */
    struct DexMethod {
        u4 methodIdx;     /* index to a method_id_item */
        u4 accessFlags;
        u4 codeOff;       /* file offset to a code_item */
    };
The fieldIdx value is an index into field_ids in the index area, and the methodIdx value is an index into method_ids in the index area. The field to focus on is codeOff, whose value is an offset address that points to a code_item.
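These structs are the expanded, in-memory form. In the file itself, a class_data_item stores every one of these values as a ULEB128 number, and each method index is stored as a difference from the previous entry's index in the same list. The sketch below decodes the header and one encoded_method under those rules; all names in it (`read_uleb128`, `class_data_header`, `encoded_method`, and the two reader functions) are hypothetical helpers for this example, and no bounds checking is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one ULEB128 value starting at *pp and advance *pp past it. */
static uint32_t read_uleb128(const uint8_t **pp) {
    assert(pp != NULL && *pp != NULL);
    const uint8_t *p = *pp;
    uint32_t result = 0;
    int shift = 0;
    uint8_t byte;
    do {
        byte = *p++;
        result |= (uint32_t)(byte & 0x7f) << shift;
        shift += 7;
    } while ((byte & 0x80) && shift < 35);
    *pp = p;
    return result;
}

struct class_data_header {
    uint32_t static_fields_size;
    uint32_t instance_fields_size;
    uint32_t direct_methods_size;
    uint32_t virtual_methods_size;
};

struct encoded_method {
    uint32_t method_idx;    /* absolute index into method_ids */
    uint32_t access_flags;
    uint32_t code_off;      /* file offset to a code_item, 0 if none */
};

/* Read the four ULEB128 counts; returns a pointer just past them. */
static const uint8_t *read_class_data_header(const uint8_t *p,
                                             struct class_data_header *h) {
    h->static_fields_size   = read_uleb128(&p);
    h->instance_fields_size = read_uleb128(&p);
    h->direct_methods_size  = read_uleb128(&p);
    h->virtual_methods_size = read_uleb128(&p);
    return p;
}

/* Read one encoded_method. prev_idx is the index of the previous method
 * in the same list (0 for the first one), because the file stores deltas. */
static const uint8_t *read_encoded_method(const uint8_t *p, uint32_t prev_idx,
                                          struct encoded_method *m) {
    m->method_idx   = prev_idx + read_uleb128(&p);
    m->access_flags = read_uleb128(&p);
    m->code_off     = read_uleb128(&p);
    return p;
}
```

The encoded fields come before the methods in the item and follow the same pattern with two values each (index delta and access flags).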
Code_item
Code_item records the run-time information for a method of a class. Its data structure is declared in DexFile.h:
    struct DexCode {
        u2 registersSize;   /* number of registers used */
        u2 insSize;         /* number of input parameters */
        u2 outsSize;        /* parameter space this code needs to call other methods */
        u2 triesSize;       /* number of try_item structures */
        u4 debugInfoOff;    /* file offset to debug info stream */
        u4 insnsSize;       /* instruction list size, in 16-bit units */
        u2 insns[1];        /* instructions (bytecode) */
        /* followed by optional u2 padding */
        /* followed by try_item[triesSize] */
        /* followed by uleb128 handlersSize */
        /* followed by catch_handler_item[handlersSize] */
    };
- insnsSize and insns[1]
insnsSize gives the length of the instruction list in 16-bit units, and insns holds the actual instructions. Although it is declared as insns[1], its real length is insnsSize, not 1.
- try_item and catch_handler_item
insns may also be followed by try_item and catch_handler_item entries. These two structures implement Java exception handling and appear when the Java code contains try/catch.
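Since insns is variable-length, the position of the first try_item has to be computed. A minimal sketch, assuming the fixed part of the code_item is 16 bytes (four u2 fields plus two u4 fields) and that the optional u2 padding is present only when triesSize is non-zero and insnsSize is odd; `code_item_tries_off` is a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Return the file offset of try_item[0] for the code_item at code_off,
 * or 0 if the method has no try blocks. */
static uint32_t code_item_tries_off(uint32_t code_off, uint32_t insns_size,
                                    uint16_t tries_size) {
    assert(code_off % 4 == 0);  /* code_item is 4-byte aligned */
    if (tries_size == 0)
        return 0;
    /* 16-byte fixed header, then insns_size code units of 2 bytes each */
    uint32_t off = code_off + 16 + insns_size * 2;
    /* an odd number of code units is followed by 2 bytes of padding
     * so that try_item starts on a 4-byte boundary */
    if (insns_size & 1)
        off += 2;
    return off;
}
```

For a code_item at 0x100, both 7 and 8 code units put the first try_item at 0x120, because the odd case gains two bytes of padding.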
Data
Stores the actual contents of string constants, bytecode, map_item, and so on.
Appendix
The Java code for test Dex
Foo.java:
    class Foo {
        public static void main(String[] args) {
            System.out.println("Hello, world");
        }
    }
Resources
http://bbs.pediy.com/showthread.php?t=184761
Dalvik Virtual Machine "2"--dex file format