1 Dalvik Instruction Format
1.1 Bit Description
Dalvik assembly code is made up of Dalvik instructions. Each instruction is defined by its bit description together with its instruction format identifier. The conventions for bit descriptions are as follows:
- Each 16-bit word is separated from the next by a space;
- Each letter represents four bits. Fields are written from the high-order bits to the low-order bits, and a "|" may be placed between fields to separate different content;
- Uppercase letters, assigned in order starting from A, each denote a 4-bit field; "op" denotes the 8-bit opcode;
- "Ø" indicates that all bits in this field must be 0.
For example, consider the following instruction layout:
A|G|op BBBB F|E|D|C
- The two spaces divide the instruction into three groups of 16 bits each;
- A|G|op: the high 8 bits consist of the 4-bit fields A and G, and the low byte is the opcode op;
- BBBB: a 16-bit offset value;
- F|E|D|C: four 4-bit fields, representing the register parameters.
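To make the layout concrete, here is a minimal Python sketch that splits three 16-bit words into the fields named in the example above. The function name `decode_layout` and the sample words are made up for illustration; only the field positions come from the bit description.

```python
def decode_layout(words):
    """Split three 16-bit words laid out as A|G|op BBBB F|E|D|C."""
    w0, w1, w2 = words
    return {
        "A": (w0 >> 12) & 0xF,  # highest 4 bits of the first word
        "G": (w0 >> 8) & 0xF,   # next 4 bits
        "op": w0 & 0xFF,        # low byte holds the 8-bit opcode
        "BBBB": w1 & 0xFFFF,    # the whole second word
        "F": (w2 >> 12) & 0xF,  # four 4-bit register fields, high to low
        "E": (w2 >> 8) & 0xF,
        "D": (w2 >> 4) & 0xF,
        "C": w2 & 0xF,
    }


# Hypothetical sample words, chosen only to exercise every field.
fields = decode_layout([0x206E, 0x0005, 0x0010])
```

With these sample words, `fields["A"]` is 2, `fields["op"]` is 0x6E, `fields["BBBB"]` is 5, and `fields["D"]` is 1.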
1.2 Instruction Format Identification
A bit description alone does not fully determine an instruction; the encoding must also be specified by an instruction format identifier, which follows these conventions:
- Most format identifiers consist of three characters: two digits followed by a letter;
- The first digit indicates how many 16-bit words the instruction consists of;
- The second digit indicates the maximum number of registers used by the instruction; the special mark "r" in this position indicates a range of registers;
- The third character is a type code that indicates the kind of extra data encoded by the instruction. The specific meaning of each type code is shown in the following table:
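Separately from the type-code meanings, the three characters of an identifier can be split apart mechanically. The sketch below is illustrative only: `parse_format_id` is a made-up helper, it handles just the common three-character form described above, and it deliberately does not interpret the type code.

```python
import re


def parse_format_id(fmt):
    """Split a three-character format identifier such as "22x" or "3rc"."""
    match = re.fullmatch(r"(\d)(\d|r)([a-z])", fmt)
    if match is None:
        raise ValueError(f"not a three-character format identifier: {fmt!r}")
    words, registers, type_code = match.groups()
    return {
        "words": int(words),                    # number of 16-bit words
        "register_range": registers == "r",     # "r" marks a register range
        "max_registers": None if registers == "r" else int(registers),
        "type_code": type_code,                 # left uninterpreted here
    }


info = parse_format_id("22x")
range_info = parse_format_id("3rc")
```

For "22x" this yields two words and at most two registers; for "3rc" it yields three words and a register range.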
1.3 Syntax Conventions
- Each instruction begins with the opcode, followed by a variable number of parameters separated by commas;
- Parameters are read starting from the first 16-bit word of the instruction: op occupies the low 8 bits, and the high 8 bits may hold one 8-bit parameter, two 4-bit parameters, or nothing; if the instruction is longer than 16 bits, the remaining words hold further parameters;
- A parameter written as "vX" is a register, such as v0 or v1 (by contrast, register names in the ARM architecture begin with "r");
- A parameter written as "#+X" is a constant number;
- A parameter written as "+X" is an address offset relative to the instruction;
- A parameter written as "kind@X" is a constant pool index, where kind names the constant pool: string (string constant pool index), type (type constant pool index), field (field constant pool index), or meth (method constant pool index).
For example, consider the following instruction:
op vAA, string@BBBB
This instruction takes one register, vAA, and one string constant pool index.
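The parameter notations above are regular enough to tell apart with a few patterns. The following Python sketch classifies a parameter string by its prefix; the function `classify` and the category names are assumptions for illustration, not part of any Dalvik tool.

```python
import re

# Order matters: "#+X" must be tried before the bare "+X" offset form.
PATTERNS = [
    ("register", re.compile(r"v[0-9A-Z]+")),
    ("constant", re.compile(r"#\+[0-9A-Za-z]+")),
    ("offset", re.compile(r"\+[0-9A-Za-z]+")),
    ("pool_index", re.compile(r"(string|type|field|meth)@[0-9A-Za-z]+")),
]


def classify(parameter):
    """Return the kind of a parameter written in the notation above."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(parameter):
            return name
    return "unknown"


kinds = [classify(p) for p in "vAA, string@BBBB".split(", ")]
```

Applied to the example instruction, `kinds` is `["register", "pool_index"]`.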
2 Dalvik Instruction Set
The syntax and mnemonics of Dalvik instructions have the following characteristics:
- Parameters are ordered destination first, then source;
- Some opcodes carry a name suffix to disambiguate the size and type of the data they operate on: ordinary 32-bit opcodes have no suffix, 64-bit opcodes add the -wide suffix, and type-specific opcodes add a suffix naming the type, one of -boolean, -byte, -char, -short, -int, -long, -float, -double, -object, -string, -class, and -void;
- Some opcodes carry an additional suffix, separated from the main name by "/", to distinguish variants that differ in instruction layout or options;
- In the instruction descriptions, each letter of a width value stands for 4 bits.
For example, consider the following instruction:
move-wide/from16 vAA,vBBBB
- move: the base opcode, identifying the basic operation
- -wide: name suffix, indicating that the instruction operates on wide (64-bit) data
- /from16: opcode suffix, indicating that the source is a 16-bit register reference
- vAA: the destination register, which always comes before the source; its range is v0~v255
- vBBBB: the source register; its range is v0~v65535
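The register ranges in this example follow directly from the rule that each letter stands for 4 bits. Here is a small Python sketch of both ideas; the helper names are made up, and `split_mnemonic` is a naive split that only suits names shaped like the example (it would mislabel names such as array-length).

```python
def split_mnemonic(name):
    """Naively split "base-type/layout" into its three parts."""
    main, _, layout = name.partition("/")
    base, _, type_suffix = main.partition("-")
    return base, type_suffix or None, layout or None


def max_register(parameter):
    """Highest register index a parameter like "vAA" can address."""
    letters = parameter[1:]            # drop the leading "v"
    bits = 4 * len(letters)            # each letter stands for 4 bits
    return (1 << bits) - 1


parts = split_mnemonic("move-wide/from16")
limits = [max_register(p) for p in ("vA", "vAA", "vBBBB")]
```

Here `parts` is `("move", "wide", "from16")` and `limits` is `[15, 255, 65535]`, matching the ranges listed above.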
2.1 No-Operation Instruction
Mnemonic: nop
Description: used for code alignment; it performs no actual operation.
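2.2 Data Manipulation Instructions
Mnemonic: move
Description: the move instruction takes different suffixes depending on the size and type of the data being moved. Its general form is:
move destination,source
Examples:
move/from16 vAA,vBBBB ; assign the value of register vBBBB to register vAA; the source register index is 16 bits wide and the destination register index is 8 bits wide
move-wide vA,vB ; assign one register pair to another; the source and destination register indices are both 4 bits wide
move-result vAA ; assign the single-word non-object result of the preceding invoke instruction to register vAA
move-object vA,vB ; assign an object reference; the source and destination register indices are both 4 bits wide
move-exception vAA ; save an exception raised at run time into register vAA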
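2.3 Return Instructions
Mnemonic: return
Description: there are four return instructions.
return-void ; return from a void method
return vAA ; return a 32-bit non-object value held in the 8-bit register vAA
return-wide vAA ; return a 64-bit non-object value held in the 8-bit register vAA
return-object vAA ; return an object reference held in the 8-bit register vAA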
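2.4 Data Definition Instructions
Mnemonic: const
Description: data definition instructions define data used by a program, such as constants, strings, and classes.
const/4 vA,#+B ; sign-extend the value to 32 bits and assign it to register vA
const/16 vAA,#+BBBB ; sign-extend the value to 32 bits and assign it to register vAA
const/high16 vAA,#+BBBB0000 ; zero-extend the value on the right to 32 bits and assign it to register vAA
const-wide/16 vAA,#+BBBB ; sign-extend the value to 64 bits and assign it to register vAA
const-string vAA,string@BBBB ; build a string from the string index and assign it to register vAA
const-class vAA,type@BBBB ; obtain a class reference from the type index and assign it to register vAA
const-class/jumbo vAAAA,type@BBBBBBBB ; obtain a class reference from the given type index and assign it to register vAAAA; the opcode of this instruction occupies two bytes, with the value 0x00ff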
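The difference between sign extension and right zero extension is easy to get wrong, so here is a minimal Python sketch of what ends up in the register for three of the const variants. The function names are made up, and Python integers stand in for 32-bit registers.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def const_4(b):
    """const/4: a 4-bit literal, sign-extended to 32 bits."""
    return sign_extend(b, 4)


def const_16(bbbb):
    """const/16: a 16-bit literal, sign-extended to 32 bits."""
    return sign_extend(bbbb, 16)


def const_high16(bbbb):
    """const/high16: a 16-bit literal placed in the high half, low half zero."""
    return sign_extend((bbbb & 0xFFFF) << 16, 32)


values = (const_4(0xF), const_16(0x8000), const_high16(0x3F80))
```

Here `values` is `(-1, -32768, 0x3F800000)`: the first two literals have their top bit set and so become negative, while the third is simply shifted into the upper 16 bits.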
2.5 Lock Instructions
Mnemonic: monitor
Description: lock instructions are used in multi-threaded programs to synchronize operations on the same object.
monitor-enter vAA ; acquire the lock for the specified object
monitor-exit vAA ; release the lock for the specified object
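2.6 Instance Operation Instructions
Mnemonics: check-cast, instance-of, new-instance
Description: type conversion, type checking, and creation of instances.
check-cast vAA,type@BBBB ; cast the object reference in register vAA to the specified type, throwing ClassCastException on failure; if type B names a primitive type, the cast always fails at run time, because A holds a non-primitive reference
instance-of vA,vB,type@CCCC ; test whether the object reference in register vB can be cast to the specified type; vA is set to 1 if it can, and to 0 otherwise
new-instance vAA,type@BBBB ; construct a new instance of the specified type and assign the object reference to register vAA; the type must not be an array class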
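2.7 Array Operation Instructions
Mnemonics: array-length, new-array, filled-new-array, fill-array-data, arrayop
Description: getting an array's length, creating arrays, filling arrays, and reading and writing array elements.
array-length vA,vB ; get the length of the array in register vB and assign it to register vA; the length is the number of entries in the array
new-array vA,vB,type@CCCC ; construct an array of the specified type (type@CCCC) and size (vB) and assign the reference to register vA
filled-new-array {vC,vD,vE,vF,vG},type@BBBB ; construct an array of the specified type (type@BBBB) and size (A) and fill in its contents; A is used implicitly and gives both the array size and the number of arguments, and vC~vG are the argument registers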
2.8 Exception Instructions
Mnemonic: throw
Description: throws an exception.
throw vAA ; throw the exception object held in register vAA
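2.9 Jump Instructions
Mnemonics: goto, packed-switch, sparse-switch, if-test, if-testz
Description: jump from the current address to the specified offset.
goto +AA ; unconditional jump to the specified offset; the offset AA must not be 0
packed-switch vAA,+BBBBBBBB ; switch branch; register vAA holds the value to test, and BBBBBBBB points to an offset table in packed-switch-payload format, whose case values increase consecutively
sparse-switch vAA,+BBBBBBBB ; switch branch; register vAA holds the value to test, and BBBBBBBB points to an offset table in sparse-switch-payload format, whose case values follow no regular pattern
if-test vA,vB,+CCCC ; conditional jump; compares registers vA and vB and, if the condition holds, jumps to the offset CCCC; the offset CCCC must not be 0
if-eq vA,vB,+CCCC ; jump if vA == vB
if-ne vA,vB,+CCCC ; jump if vA != vB
if-lt vA,vB,+CCCC ; jump if vA < vB
if-ge vA,vB,+CCCC ; jump if vA >= vB
if-gt vA,vB,+CCCC ; jump if vA > vB
if-le vA,vB,+CCCC ; jump if vA <= vB
if-testz vAA,+BBBB ; conditional jump; compares register vAA with 0 and, if the condition holds, jumps to the offset BBBB; the offset BBBB must not be 0
if-eqz vAA,+BBBB ; jump if vAA == 0
if-nez vAA,+BBBB ; jump if vAA != 0
if-ltz vAA,+BBBB ; jump if vAA < 0
if-gez vAA,+BBBB ; jump if vAA >= 0
if-gtz vAA,+BBBB ; jump if vAA > 0
if-lez vAA,+BBBB ; jump if vAA <= 0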
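The two switch instructions differ only in how the offset table is looked up. The Python sketch below illustrates the idea under simplifying assumptions: the packed table is modelled as a first key plus a list of targets, the sparse table as a sorted list of keys with a parallel list of targets, and a result of None stands for "no case matched, continue with the next instruction". The function names and sample tables are hypothetical.

```python
import bisect


def packed_switch_target(value, first_key, targets):
    """Consecutive keys: the target is found by direct indexing."""
    index = value - first_key
    if 0 <= index < len(targets):
        return targets[index]
    return None  # no matching case


def sparse_switch_target(value, keys, targets):
    """Irregular keys (assumed sorted ascending): binary search for the key."""
    index = bisect.bisect_left(keys, value)
    if index < len(keys) and keys[index] == value:
        return targets[index]
    return None  # no matching case


packed_hit = packed_switch_target(3, 2, [10, 20, 30])
sparse_hit = sparse_switch_target(100, [1, 100, 1000], [8, 16, 24])
```

With these sample tables, `packed_hit` is 20 (cases 2, 3, 4 map to 10, 20, 30) and `sparse_hit` is 16. The packed form needs no key storage beyond the first key, which is why it suits densely numbered cases.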
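2.10 Comparison Instructions
Mnemonics: cmpl-float, cmpg-float, cmpl-double, cmpg-double, and cmp-long
Description: compare the values of two registers.
cmpl-float vAA,vBB,vCC ; compare two single-precision floats: the result is 1 if vBB > vCC, 0 if vBB == vCC, and -1 if vBB < vCC; if either operand is NaN, the result is -1; the result is stored in vAA
cmpg-float vAA,vBB,vCC ; compare two single-precision floats: the result is 1 if vBB > vCC, 0 if vBB == vCC, and -1 if vBB < vCC; if either operand is NaN, the result is 1; the result is stored in vAA
cmpl-double vAA,vBB,vCC ; compare two double-precision floats: the result is 1 if vBB > vCC, 0 if vBB == vCC, and -1 if vBB < vCC; if either operand is NaN, the result is -1; the result is stored in vAA
cmpg-double vAA,vBB,vCC ; compare two double-precision floats: the result is 1 if vBB > vCC, 0 if vBB == vCC, and -1 if vBB < vCC; if either operand is NaN, the result is 1; the result is stored in vAA
cmp-long vAA,vBB,vCC ; compare two long integers: the result is 1 if vBB > vCC, 0 if vBB == vCC, and -1 if vBB < vCC; the result is stored in vAA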
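The cmpl and cmpg variants agree on every ordered comparison and differ only in how they treat NaN. A minimal Python sketch of that behaviour follows; the function names are illustrative, and Python floats stand in for both float and double registers.

```python
import math


def three_way(b, c):
    """1 if b > c, 0 if b == c, -1 if b < c."""
    return (b > c) - (b < c)


def cmpl(b, c):
    """cmpl-float / cmpl-double: a NaN operand yields -1."""
    if math.isnan(b) or math.isnan(c):
        return -1
    return three_way(b, c)


def cmpg(b, c):
    """cmpg-float / cmpg-double: a NaN operand yields 1."""
    if math.isnan(b) or math.isnan(c):
        return 1
    return three_way(b, c)


def cmp_long(b, c):
    """cmp-long: integers have no NaN, so only the three-way result applies."""
    return three_way(b, c)


nan = float("nan")
results = (cmpl(2.0, 1.0), cmpg(2.0, 1.0), cmpl(nan, 1.0), cmpg(nan, 1.0))
```

Here `results` is `(1, 1, -1, 1)`. Having both variants lets a compiler pick the one whose NaN result makes the following conditional jump fall on the "false" side of the original comparison.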
2.11 Field Operation Instructions
Mnemonics: iinstanceop, sstaticop
Description: field operation instructions read and write the fields of an object instance or class; a field may be of any valid Java data type.
1 Instance field instructions
iget, iget-wide, iget-object, etc.
iinstanceop vA,vB,field@CCCC
2 Static field instructions
sget, sget-wide, sget-object, etc.
sstaticop vAA,field@BBBB
2.11 Method Invocation Instructions
Mnemonic: invoke
Description: invokes a method.
invoke-virtual ; invoke a virtual method of an instance
invoke-super ; invoke a superclass method of an instance
invoke-direct ; invoke a direct method of an instance
invoke-static ; invoke a static method
invoke-interface ; invoke an interface method of an instance
The following instructions behave the same as those above, except that they specify the parameter registers as a range.
invoke-virtual/range ; invoke a virtual method of an instance
invoke-super/range ; invoke a superclass method of an instance
invoke-direct/range ; invoke a direct method of an instance
invoke-static/range ; invoke a static method
invoke-interface/range ; invoke an interface method of an instance
The return value of a method invocation must be fetched with a move-result instruction:
invoke-static {}, Landroid/os/Parcel;->obtain()Landroid/os/Parcel;
move-result-object v0
2.12 Data Conversion Instructions
Mnemonic: unop
Description: used to convert data of one type to another type.
neg-int ; two's-complement negation of an integer
not-int ; bitwise NOT of an integer
long-to-int ; convert a long integer to an integer
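As an illustration of what these three operations compute, here is a minimal Python sketch that models a 32-bit register with a wrapping helper. The names `to_int32`, `neg_int`, `not_int`, and `long_to_int` are made up for this example.

```python
def to_int32(x):
    """Keep the low 32 bits and interpret them as a signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def neg_int(a):
    """neg-int: two's-complement negation."""
    return to_int32(-a)


def not_int(a):
    """not-int: bitwise NOT."""
    return to_int32(~a)


def long_to_int(a):
    """long-to-int: keep only the low 32 bits of a 64-bit value."""
    return to_int32(a)


converted = (neg_int(5), not_int(0), long_to_int(0x100000001))
```

Here `converted` is `(-5, -1, 1)`. Note that negating the smallest 32-bit value, -2147483648, wraps around to itself.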
2.13 Data Operation Instructions
Mnemonics: add, sub, mul, div, rem, and, or, xor, shl, shr, ushr
Description: arithmetic operations (add, subtract, multiply, divide, remainder, shift) and logical operations (AND, OR, NOT, XOR).
Suffixes such as 2addr, lit16, and lit8 can also be appended to these operations, as in add.
add-type vAA,vBB,vCC ; store vBB + vCC in vAA
sub-type vAA,vBB,vCC ; store vBB - vCC in vAA
mul-type vAA,vBB,vCC ; store vBB * vCC in vAA
div-type vAA,vBB,vCC ; store vBB / vCC in vAA
rem-type vAA,vBB,vCC ; store vBB % vCC in vAA
and-type vAA,vBB,vCC ; store vBB AND vCC in vAA
or-type vAA,vBB,vCC ; store vBB OR vCC in vAA
xor-type vAA,vBB,vCC ; store vBB XOR vCC in vAA
shl-type vAA,vBB,vCC ; store vBB << vCC in vAA
shr-type vAA,vBB,vCC ; store vBB >> vCC (signed shift) in vAA
ushr-type vAA,vBB,vCC ; store vBB >>> vCC (unsigned shift) in vAA
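Two pairs in this list are easy to confuse: shr versus ushr, and div/rem on negative operands. The Python sketch below models the int variants on 32-bit values. The function names are illustrative, the shift distance is assumed to lie in 0..31, and division is assumed to round toward zero as in Java, which is not what Python's own `//` operator does.

```python
def to_int32(x):
    """Keep the low 32 bits and interpret them as a signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def shr_int(b, c):
    """Signed shift right: the sign bit is copied into the vacated bits."""
    return to_int32(b) >> c


def ushr_int(b, c):
    """Unsigned shift right: the vacated bits are filled with zeros."""
    return to_int32((b & 0xFFFFFFFF) >> c)


def div_int(b, c):
    """Division rounded toward zero (assumed Java-like behaviour)."""
    quotient = abs(b) // abs(c)
    return to_int32(quotient if (b < 0) == (c < 0) else -quotient)


def rem_int(b, c):
    """Remainder after truncated division; takes the sign of b."""
    return to_int32(b - div_int(b, c) * c)


shifted = (shr_int(-8, 1), ushr_int(-8, 1))
divided = (div_int(-7, 2), rem_int(-7, 2))
```

Here `shifted` is `(-4, 2147483644)` and `divided` is `(-3, -1)`. Dividing by zero raises Python's ZeroDivisionError in this sketch.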
"Android Dalvik virtual machine studious easy-to-use series" of the second: Dalvik assembly language