In the previous article we covered packing (shell) techniques for protecting an APK against decompilation; if anything there is unclear, see my earlier post at http://my.oschina.net/u/2323218/blog/393372. This article introduces another technique for preventing an APK from being decompiled: modifying bytecode at runtime. I worked out this method while implementing app wrapping at work, after reading a foreign article on Android security, and added some original work of my own. Let's take a look at the approach.
We know that the class files compiled from an APK's Java sources are merged into a single classes.dex file by the dx tool. When the APK runs, the Dalvik VM loads classes.dex and uses dexopt to optimize it into an ODEX file. Our approach is to modify the Dalvik instructions during this process to achieve our goal.
I. DEX file format
A DEX file consists of seven main sections followed by a data area: header, string_ids, type_ids, proto_ids, field_ids, method_ids, and class_defs.
The header records the file's key information. The other sections are only indexes; the content they point to is stored in the data area.
The header section is structured as follows:
Field name | Offset | Length (bytes) | Description
magic | 0x00 | 8 | Magic number, formatted as "dex\n035\0", where 035 is the format version.
checksum | 0x08 | 4 | Checksum (Adler-32).
signature | 0x0C | 20 | SHA-1 signature.
file_size | 0x20 | 4 | Total length of the DEX file.
header_size | 0x24 | 4 | Header length: 0x5C for version 009, 0x70 for version 035.
endian_tag | 0x28 | 4 | Constant identifying the byte order, which tells whether the file has been byte-swapped. The default value is 0x12345678, stored in the file as the bytes 78 56 34 12.
link_size | 0x2C | 4 | Size of the link section; 0 means the file is statically linked.
link_off | 0x30 | 4 | Start of the link section, as an offset from the beginning of the file; 0 if link_size is 0.
map_off | 0x34 | 4 | Base address of the map data.
string_ids_size | 0x38 | 4 | Number of strings in the string list.
string_ids_off | 0x3C | 4 | Base address of the string list.
type_ids_size | 0x40 | 4 | Number of types in the type list.
type_ids_off | 0x44 | 4 | Base address of the type list.
proto_ids_size | 0x48 | 4 | Number of prototypes in the prototype list.
proto_ids_off | 0x4C | 4 | Base address of the prototype list.
field_ids_size | 0x50 | 4 | Number of fields in the field list.
field_ids_off | 0x54 | 4 | Base address of the field list.
method_ids_size | 0x58 | 4 | Number of methods in the method list.
method_ids_off | 0x5C | 4 | Base address of the method list.
class_defs_size | 0x60 | 4 | Number of classes in the class definition list.
class_defs_off | 0x64 | 4 | Base address of the class definition list.
data_size | 0x68 | 4 | Size of the data section; must be 4-byte aligned.
data_off | 0x6C | 4 | Base address of the data section.
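To make the header layout more concrete, here is a minimal sketch of reading a few of these fields from a byte buffer. It is not Dalvik's own code: the names DexHeaderSketch, read_u4_le and parse_dex_header are made up for this example, only some fields are extracted, and the file is assumed to be little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down view of the header: only a few of the fields. */
typedef struct {
    uint32_t checksum;
    uint32_t file_size;
    uint32_t header_size;
    uint32_t endian_tag;
    uint32_t method_ids_size;
    uint32_t method_ids_off;
    uint32_t class_defs_size;
    uint32_t class_defs_off;
} DexHeaderSketch;

/* Read a little-endian 32-bit value at byte offset `off`. */
static uint32_t read_u4_le(const uint8_t *buf, size_t off)
{
    return (uint32_t)buf[off] |
           ((uint32_t)buf[off + 1] << 8) |
           ((uint32_t)buf[off + 2] << 16) |
           ((uint32_t)buf[off + 3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or does not look
 * like a little-endian DEX header. */
int parse_dex_header(const uint8_t *buf, size_t len, DexHeaderSketch *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 0x70) {
        return -1;                       /* shorter than a 035 header */
    }
    /* magic is "dex\n", three version digits, then a NUL byte */
    if (memcmp(buf, "dex\n", 4) != 0 || buf[7] != '\0') {
        return -1;
    }
    out->checksum        = read_u4_le(buf, 0x08);
    out->file_size       = read_u4_le(buf, 0x20);
    out->header_size     = read_u4_le(buf, 0x24);
    out->endian_tag      = read_u4_le(buf, 0x28);
    out->method_ids_size = read_u4_le(buf, 0x58);
    out->method_ids_off  = read_u4_le(buf, 0x5C);
    out->class_defs_size = read_u4_le(buf, 0x60);
    out->class_defs_off  = read_u4_le(buf, 0x64);
    /* the bytes 78 56 34 12 read as little-endian give 0x12345678 */
    if (out->endian_tag != 0x12345678u) {
        return -1;
    }
    return 0;
}
```

Reading the fields byte by byte, rather than casting the buffer to a struct, avoids alignment and host byte-order problems.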
One advantage of DEX over class files is that all constant strings are pooled and managed in one place, which reduces redundancy and makes the final DEX file smaller. I won't go into the DEX format in more detail here; if you are interested, it is described in detail in dex-format.html under the dalvik/docs directory of the Android source tree. As far as I remember, though, that file is no longer available after Android 4.0.
Given the DEX structure above, when the Dalvik virtual machine runs a DEX file it executes the bytecode of the methods listed in the method_ids area; the bytecode itself is stored in the data area. In the Dalvik virtual machine source code we find the following structure:
struct DexCode {
    u2  registersSize;
    u2  insSize;
    u2  outsSize;
    u2  triesSize;
    u4  debugInfoOff;       /* file offset to debug info stream */
    u4  insnsSize;          /* size of the insns array, in u2 units */
    u2  insns[1];
    /* followed by optional u2 padding */
    /* followed by try_item[triesSize] */
    /* followed by uleb128 handlersSize */
    /* followed by catch_handler_item[handlersSize] */
};
The insns array in this structure holds the Dalvik bytecode. Once we locate the DexCode data of the relevant class method, we can modify its insns array to achieve our goal.
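As a rough sketch of what "modifying the insns array" means on ordinary writable memory, the example below mirrors the structure's field order under a different name and copies one method's code units over another's. DexCodeSketch and overwrite_insns are hypothetical names for this illustration, and the size check is an extra precaution added here rather than something taken from the Dalvik source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t u2;
typedef uint32_t u4;

/* Same field order as the structure above; renamed so that it cannot be
 * confused with the real Dalvik header. */
typedef struct {
    u2 registersSize;
    u2 insSize;
    u2 outsSize;
    u2 triesSize;
    u4 debugInfoOff;
    u4 insnsSize;   /* number of 16-bit code units in insns */
    u2 insns[1];    /* really insnsSize entries */
} DexCodeSketch;

/* Overwrite the target method's bytecode with the source method's.
 * Returns -1 when the source does not fit: code items are packed one
 * after another, so writing past the target's insnsSize would corrupt
 * whatever follows it. */
int overwrite_insns(DexCodeSketch *target, const DexCodeSketch *source)
{
    assert(target != NULL && source != NULL);
    if (source->insnsSize > target->insnsSize) {
        return -1;
    }
    target->registersSize = source->registersSize;
    for (u4 k = 0; k < source->insnsSize; k++) {
        target->insns[k] = source->insns[k];
    }
    return 0;
}
```

When the source method is shorter than the target, the target's trailing code units are left as they were, so the replacement code must end in a return or a branch of its own.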
II. ODEX file format
When the APK is installed or first started, dexopt processes the DEX file and generates an optimized ODEX file. It extracts classes.dex from the APK, optimizes it, and saves the result as /data/dalvik-cache/data@app@<package-name>-<n>.apk@classes.dex, where <n> is the numeric suffix of the installed APK.
An ODEX file consists of an ODEX header followed by the embedded DEX file, the dependency list, and the optimized auxiliary data.
Since the original DEX file is kept as one part of the optimized ODEX, we only need to locate the DEX part inside the ODEX.
III. Implementation
To modify bytecode, we first need to locate the code we want to change, which means parsing the DEX file. The Dalvik source file dexDump.cpp already provides a concrete implementation of DEX parsing, and we can follow it to find the classes and methods we need. The steps are as follows:
(1) Locate the ODEX file generated for our APK and obtain its mapped address and size in memory. The code is as follows:
void *base = NULL;
int module_size = 0;
char filename[512];

// simple test code here!
for (int i = 0; i < 2; i++) {
    snprintf(filename, sizeof(filename),
             "/data/dalvik-cache/data@app@%s-%d.apk@classes.dex",
             "com.android.dex", i + 1);
    base = get_module_base(-1, filename);  // mapped address of the odex file in memory
    if (base != NULL) {
        break;
    }
}
module_size = get_module_size(-1, filename);  // size of the odex file
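The article does not show get_module_base or get_module_size. One common way to obtain this information on Linux-based systems is to scan /proc/<pid>/maps, so the following is only a guess at how such helpers might work. find_mapping is a hypothetical name, and it takes the maps text as a string so that it can be tried out without a device.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Scan text in /proc/<pid>/maps format and report the lowest start and
 * highest end address of the lines whose path contains `name`.
 * Returns the number of matching lines (0 means not mapped). */
int find_mapping(const char *maps_text, const char *name,
                 uintptr_t *start_out, uintptr_t *end_out)
{
    assert(maps_text != NULL && name != NULL);
    assert(start_out != NULL && end_out != NULL);
    int matches = 0;
    const char *line = maps_text;

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);
        char buf[1024];

        if (len < sizeof(buf)) {
            unsigned long long start, end;
            memcpy(buf, line, len);
            buf[len] = '\0';
            /* each line starts with "start-end" in hexadecimal */
            if (strstr(buf, name) != NULL &&
                sscanf(buf, "%llx-%llx", &start, &end) == 2) {
                if (matches == 0 || (uintptr_t)start < *start_out) {
                    *start_out = (uintptr_t)start;
                }
                if (matches == 0 || (uintptr_t)end > *end_out) {
                    *end_out = (uintptr_t)end;
                }
                matches++;
            }
        }
        line = eol ? eol + 1 : line + len;
    }
    return matches;
}
```

On a device, the contents of /proc/self/maps would be read into a buffer and passed in; the base address is then the reported start, and the mapped size is end minus start.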
(2) Find the offset of the DEX file inside the ODEX so that the DEX file can be parsed. The code is as follows:
// search dex from odex
void *dexBase = searchDexStart(base);
if (checkDexMagic(dexBase) == false) {
    ALOGE("Error! invalid dex format at: %p", dexBase);
    return;
}
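searchDexStart and checkDexMagic are the author's helpers and are not shown. A possible sketch, assuming the standard Dalvik ODEX header (an 8-byte magic beginning with "dey\n", followed by the little-endian fields dexOffset and dexLength), is given below; the _sketch function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value. */
static uint32_t load_u4_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Stand-in for checkDexMagic: "dex\n", three version digits, NUL. */
int check_dex_magic_sketch(const void *dex)
{
    const uint8_t *p = (const uint8_t *)dex;
    return p != NULL && memcmp(p, "dex\n", 4) == 0 && p[7] == '\0';
}

/* Stand-in for searchDexStart: returns a pointer to the embedded DEX,
 * or NULL if the buffer does not look like an ODEX file. */
const void *search_dex_start_sketch(const void *odex, size_t size)
{
    const uint8_t *p = (const uint8_t *)odex;
    assert(p != NULL);
    if (size < 16 || memcmp(p, "dey\n", 4) != 0) {
        return NULL;
    }
    uint32_t dex_off = load_u4_le(p + 8);   /* dexOffset */
    uint32_t dex_len = load_u4_le(p + 12);  /* dexLength */
    if (dex_off > size || dex_len > size - dex_off || dex_len < 8) {
        return NULL;                        /* would run past the buffer */
    }
    return p + dex_off;
}
```

Reading the offset from the header is more reliable than scanning memory for the "dex\n" bytes, which could also occur by chance inside other data.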
(3) Once the DEX offset is known, parse the DEX file to find the class containing the method we want to replace, then find the method in that class and return its DexCode structure. The function is implemented as follows:
static const DexCode *dexFindClassMethod(DexFile *dexFile, const char *clazz,
                                         const char *method)
{
    // look up the class data for the given class descriptor
    DexClassData* classData = dexFindClassData(dexFile, clazz);
    if (classData == NULL)
        return NULL;
    // look up the code item of the named method within that class
    const DexCode* code = dexFindMethodInsns(dexFile, classData, method);
    if (code != NULL) {
        dumpDexCode(code);
    }
    return code;
}
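dexFindClassData and dexFindMethodInsns are not shown in the article. Whatever their exact form, any such lookup has to decode the class data, which the DEX format stores as ULEB128 values: each method entry holds a method index delta, access flags, and the file offset of its code item. The sketch below decodes one such entry; the names are hypothetical and no end-of-buffer checking is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one ULEB128 value and advance *pp past it.
 * Sketch only: reads at most 5 bytes and does no bounds checking. */
uint32_t read_uleb128(const uint8_t **pp)
{
    assert(pp != NULL && *pp != NULL);
    const uint8_t *p = *pp;
    uint32_t result = 0;
    unsigned shift = 0;
    uint8_t byte;

    do {
        byte = *p++;
        result |= (uint32_t)(byte & 0x7f) << shift;  /* low 7 bits are data */
        shift += 7;
    } while ((byte & 0x80) != 0 && shift < 35);      /* high bit: more follows */

    *pp = p;
    return result;
}

/* One method entry of the class data, after decoding. */
typedef struct {
    uint32_t method_idx_diff;  /* delta from the previous method index */
    uint32_t access_flags;
    uint32_t code_off;         /* file offset of the code item, 0 if none */
} EncodedMethodSketch;

void read_encoded_method(const uint8_t **pp, EncodedMethodSketch *out)
{
    assert(out != NULL);
    out->method_idx_diff = read_uleb128(pp);
    out->access_flags    = read_uleb128(pp);
    out->code_off        = read_uleb128(pp);
}

/* code_off is relative to the start of the DEX file; 0 means the method
 * has no code (abstract or native). */
const void *code_item_at(const void *dex_base, uint32_t code_off)
{
    return code_off == 0 ? NULL : (const uint8_t *)dex_base + code_off;
}
```

The pointer returned by code_item_at is what would then be treated as the method's DexCode structure.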
(4) With the DexCode located, the instructions can be replaced. The implementation is as follows:
const DexCode *code =
    dexFindClassMethod(&gDexFile, "Lcom/android/dex/myclass;", "setflagHidden");
const DexCode *code2 =
    dexFindClassMethod(&gDexFile, "Lcom/android/dex/myclass;", "setflag");

// only patch if both methods were found and the new code fits in the old slot
if (code != NULL && code2 != NULL && code->insnsSize <= code2->insnsSize) {
    // remap!!!!
    if (mprotect(base, module_size, PROT_READ | PROT_WRITE | PROT_EXEC) == 0) {
        DexCode *pCode = (DexCode *)code2;
        // Modify!
        pCode->registersSize = code->registersSize;
        for (u4 k = 0; k < code->insnsSize; k++) {
            pCode->insns[k] = code->insns[k];
        }
        // restore the original protection
        mprotect(base, module_size, PROT_READ | PROT_EXEC);
    }
}
Note: the Dalvik instructions are modified at runtime, and the process's memory mapping of the ODEX file is read-only, so mprotect must be called first to make the mapping writable before the instructions can be modified.
That should give you a working understanding of how bytecode can be modified at runtime. The next article will cover another technique for protecting an Android APK against decompilation; thank you for your support. If you have any questions about this technique, or would like the project source code for this article, you are welcome to follow my personal public account, Programmer Interaction Alliance (Coder_online), by searching for Coder_online, and we can talk online.
Android APK anti-decompilation techniques, part 2: modifying Dalvik instructions at runtime