ADBI Learning: .so Hook Implementation Mechanism

Source: Internet
Author: User

In this article we look at how ADBI is implemented. Nearly all of the background was covered in earlier articles, so there is little new material here. ADBI uses the hijack program to inject libexample.so into a specified process and make that process load it. While the library is being loaded, the code in its .init_array section runs, and that code implements the function hook (it replaces the original function with a custom one). Running hijack is therefore all it takes to install the hook.

Hijack flow:

1. Get the address of the mprotect() function in the PID process

2. Get the address of the dlopen() function in the PID process

3. Attach to the PID process with ptrace, then read and save its registers (regs)

4. Push the sc structure onto the stack of the PID process; sc contains the specific instructions, the saved regs, the mprotect and dlopen addresses, and the absolute path of libexample.so

5. Modify the regs and write them back to the PID process, then release the process with PTRACE_DETACH so that it continues to run; at this point hijack's work is done

6. The PID process now executes the specific instructions that were pushed onto its stack

Two points need explaining:

1. How to get the address of a given function fun in the PID process:

Assume fun lives in lib.so. First get the load address of lib.so in the current process, addrlib (the load addresses of shared libraries are listed in /proc/pid/maps), and the address of fun in the current process, addrfun. Then get the load address of lib.so in the PID process, addrpidlib. The address of fun in the PID process is then addrfun - addrlib + addrpidlib. The other way is to locate the .dynsym and .dynstr sections through the library's dynamic section, use each dynsym entry's name field as an offset into .dynstr to get the symbol string, and compare that string with fun to find the address of fun (see the earlier article: ELF Format and linker source analysis).

2. What exactly are the specific instructions, and what do they do?

The specific instructions call mprotect to change the page permissions, call dlopen to load libexample.so, and then restore the PID process to its pre-attach state using the saved regs. For the details of these instructions see reference 1; they are not expanded here. One thing worth mentioning is the ARM_pc register: hijack modifies the PC register of the PID process so that, right after the detach, the process executes the specific instructions that were pushed onto the stack. But the PC holds the address of an instruction, and while an ARM core executes one instruction it is already fetching and decoding the following ones, so how can changing the PC make execution continue at the new address? The explanation found online is that once the PC is modified, the contents of the pipeline (fetch, decode, execute) are discarded, and fetching, decoding, and execution restart from the new PC. I did not find official ARM documentation on this; readers who know the details are welcome to share them.
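The first method above (offset arithmetic based on /proc/pid/maps) can be sketched as follows. This is a minimal illustration, not ADBI's own code: the function names maps_line_base, find_lib_base, and remote_fun_addr are hypothetical, and the sketch assumes the first line in maps that mentions the library is its lowest mapping in both processes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse one line of /proc/<pid>/maps. If the line mentions lib_name,
 * store the start address of that mapping in *base and return 1;
 * otherwise return 0 and leave *base untouched. */
int maps_line_base(const char *line, const char *lib_name, uintptr_t *base)
{
    unsigned long start, end;

    if (strstr(line, lib_name) == NULL)
        return 0;
    if (sscanf(line, "%lx-%lx", &start, &end) != 2)
        return 0;
    *base = (uintptr_t)start;
    return 1;
}

/* Return the start of the first mapping of lib_name in process pid,
 * or 0 if the maps file cannot be read or the library is not mapped. */
uintptr_t find_lib_base(int pid, const char *lib_name)
{
    char path[64], line[512];
    uintptr_t base = 0;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%d/maps", pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (maps_line_base(line, lib_name, &base))
            break;   /* maps is sorted by address, so stop at the first hit */
    }
    fclose(fp);
    return base;
}

/* fun address in the PID process = addrfun - addrlib + addrpidlib */
uintptr_t remote_fun_addr(uintptr_t addrfun, uintptr_t addrlib,
                          uintptr_t addrpidlib)
{
    assert(addrfun >= addrlib);   /* fun must lie inside the local library */
    return addrfun - addrlib + addrpidlib;
}
```

A caller would obtain addrlib with find_lib_base(getpid(), "lib.so"), addrpidlib with find_lib_base(pid, "lib.so"), and pass both to remote_fun_addr together with the local address of fun.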

libexample execution flow:

1. Get the address of the function to be hooked, hookedfun (for how to get the address, see method 1 above)

2. Overwrite the first few instructions of hookedfun with specific instructions, so that hookedfun is replaced by a custom function

3. When the process calls hookedfun, it executes the custom function instead; restoring the original instructions of hookedfun removes the hook
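The save, overwrite, and restore cycle in steps 2 and 3 can be sketched in portable C by treating a plain array as the first instructions of hookedfun. This is a simulation only: the names demo_hook, demo_hook_install, and demo_hook_remove are hypothetical, and a real hook additionally needs writable code pages and an instruction cache flush.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PATCH_WORDS 3   /* number of 32-bit words that get overwritten */

struct demo_hook {
    uint32_t *orig;               /* location of the patched instructions */
    uint32_t store[PATCH_WORDS];  /* saved original instructions */
    uint32_t jump[PATCH_WORDS];   /* replacement instructions */
    int installed;                /* 1 while the patch is in place */
};

/* Save the first PATCH_WORDS words at target, then overwrite them.
 * The hook structure must be zero-initialized before the first call. */
void demo_hook_install(struct demo_hook *h, uint32_t *target,
                       const uint32_t jump[PATCH_WORDS])
{
    assert(!h->installed);
    h->orig = target;
    memcpy(h->jump, jump, sizeof(h->jump));
    memcpy(h->store, target, sizeof(h->store));   /* save the originals */
    memcpy(target, h->jump, sizeof(h->jump));     /* write the patch */
    h->installed = 1;
}

/* Put the saved words back, which removes the hook. */
void demo_hook_remove(struct demo_hook *h)
{
    assert(h->installed);
    memcpy(h->orig, h->store, sizeof(h->store));
    h->installed = 0;
}
```

Keeping the saved words next to the patch in one structure means the hook can be removed, and later reinstalled, without looking at the target again.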

The key question is how the instructions of hookedfun get replaced. ARM code consists of either ARM instructions or Thumb instructions, and ADBI tells them apart like this:

if (addr % 4 == 0) {
    /* ARM instructions */
} else {
    /* Thumb instructions */
}

For ARM instructions, the code that replaces the function's instructions is:

h->patch = (unsigned int)hook_arm;  // address of the custom function
h->orig = addr;                     // address of the hookedfun function
h->jump[0] = 0xe59ff000;            // LDR pc, [pc, #0]
h->jump[1] = h->patch;              // address of the custom function
h->jump[2] = h->patch;              // address of the custom function
for (i = 0; i < 3; i++)
    h->store[i] = ((int *)h->orig)[i];  // save the original instructions of hookedfun
for (i = 0; i < 3; i++)
    ((int *)h->orig)[i] = h->jump[i];   // overwrite hookedfun; calling it now executes h->jump[0]

With the code above in place, calling hookedfun actually executes LDR pc, [pc, #0]. In ARM state, reading the PC yields the address of the currently executing instruction + 8, so the load reads h->jump[2], which holds the address of the custom function. The effect is a jump to the custom function. Now look at the Thumb instructions:

if ((unsigned long int)hook_thumb % 4 == 0)
    log("warning hook is not thumb 0x%lx\n", (unsigned long)hook_thumb);
h->thumb = 1;
log("THUMB using 0x%lx\n", (unsigned long)hook_thumb);
h->patch = (unsigned int)hook_thumb;
h->orig = addr;
h->jumpt[1] = 0xb4;
h->jumpt[0] = 0x60;  // push {r5,r6}
h->jumpt[3] = 0xa5;
h->jumpt[2] = 0x03;  // add r5, pc, #12; at first glance r5 points to jumpt[18], so why does it load the custom function address at jumpt[16]? (explained below)
h->jumpt[5] = 0x68;
h->jumpt[4] = 0x2d;  // ldr r5, [r5]
h->jumpt[7] = 0xb0;
h->jumpt[6] = 0x02;  // add sp, sp, #8
h->jumpt[9] = 0xb4;
h->jumpt[8] = 0x20;  // push {r5}
h->jumpt[11] = 0xb0;
h->jumpt[10] = 0x81; // sub sp, sp, #4
h->jumpt[13] = 0xbd;
h->jumpt[12] = 0x20; // pop {r5, pc}
h->jumpt[15] = 0x46;
h->jumpt[14] = 0xaf; // mov pc, r5; just to pad to 4 byte boundary
memcpy(&h->jumpt[16], (unsigned char *)&h->patch, sizeof(unsigned int));  // store the custom function address in jumpt[16]..jumpt[19]
unsigned int orig = addr - 1;  // sub 1 to get real address

Note the subtraction of 1 here. For a function compiled as Thumb, the symbol address is the real address + 1; the set low bit marks the function as Thumb rather than ARM. ARM functions are 4-byte aligned, so their lowest bit is always 0.

for (i = 0; i < 20; i++) {
    h->storet[i] = ((unsigned char *)orig)[i];
    //log("%0.2x ", h->storet[i])
}
//log("\n")
for (i = 0; i < 20; i++) {
    ((unsigned char *)orig)[i] = h->jumpt[i];
    //log("%0.2x ", ((unsigned char *)orig)[i])
}

The push and pop on the stack are used to move the custom function address (stored at jumpt[16]), once loaded into r5, into the PC; for the details see reference 2. There is one thing to watch out for, though.

For the Thumb "add Rd, Rp, #expr" instruction, if Rp is the PC register, the PC value that is read is (current instruction address + 4) & 0xFFFFFFFC, that is, with the lowest two bits cleared, which can mean subtracting 2 in the calculation. This in turn assumes that the start address of the hooked function is 4-byte aligned, even if the function is written in the Thumb instruction set.
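Both PC-relative reads used by the trampolines (the ARM LDR pc, [pc, #0] and the Thumb add r5, pc, #12) can be checked with plain address arithmetic. The sketch below only models the two rules stated in the text; the function names are hypothetical and nothing here executes ARM code.

```c
#include <assert.h>
#include <stdint.h>

/* Value read from the PC by an ARM-state instruction located at insn_addr. */
uint32_t arm_pc_read(uint32_t insn_addr)
{
    assert((insn_addr & 3u) == 0);   /* ARM instructions are 4-byte aligned */
    return insn_addr + 8u;
}

/* Value used for the PC by a Thumb "add Rd, pc, #imm" located at insn_addr. */
uint32_t thumb_add_pc_read(uint32_t insn_addr)
{
    assert((insn_addr & 1u) == 0);   /* Thumb instructions are 2-byte aligned */
    return (insn_addr + 4u) & 0xFFFFFFFCu;
}

/* Byte offset from base that "ldr pc, [pc, #0]", placed at base, loads from. */
uint32_t arm_trampoline_load_offset(uint32_t base)
{
    return arm_pc_read(base) + 0u - base;
}

/* Byte offset from base that r5 points to after "add r5, pc, #12",
 * which sits at base + 2 (that is, at jumpt[2]). */
uint32_t thumb_trampoline_load_offset(uint32_t base)
{
    return thumb_add_pc_read(base + 2u) + 12u - base;
}
```

For a 4-byte-aligned base such as 0x10000, the ARM offset is 8 (the third word, jump[2]) and the Thumb offset is 16 (jumpt[16]). For a base that is only 2-byte aligned, such as 0x10002, the Thumb offset becomes 18, which would miss the stored address.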

In other words, when the hooked function is 4-byte aligned, the add r5, pc, #12 instruction sits at an address that is only 2-byte aligned. By the rule above, 2 is subtracted, so r5 points to jumpt[16] rather than jumpt[18]; this is why the alignment requirement exists.

Once replacing the function's instructions is understood, removing the hook is straightforward: restore the original instructions. One more issue remains. We changed the function's instructions in memory, but the processor has an instruction cache, and if the old instructions are still executed from the cache the change has no effect, so the cache must be flushed. This is done with a direct system call:

void inline hook_cacheflush(unsigned int begin, unsigned int end)
{
    const int syscall = 0xf0002;
    __asm __volatile (
        "mov r0, %0\n"
        "mov r1, %1\n"
        "mov r7, %2\n"
        "mov r2, #0x0\n"
        "svc 0x00000000\n"
        :
        : "r" (begin), "r" (end), "r" (syscall)
        : "r0", "r1", "r7"
    );
}

Here r0 = begin, r1 = end, and r7 = 0xf0002 (the cacheflush system call number); the svc instruction then executes the system call directly.

That covers the hijack and libexample flows, but how do we get from hijack to libexample?

// this file is going to be compiled into a thumb mode binary
// This is the key point: the linker executes this function when the library is first loaded into the process
void __attribute__ ((constructor)) my_init(void);

Recall the instructions that the PID process executes after hijack has finished: they call dlopen on libexample.so, and dlopen in turn runs the functions listed in the library's .init_array section. Declaring my_init with __attribute__((constructor)) places it in .init_array. So my_init runs inside dlopen, and my_init installs the hook that replaces the function. The body of my_init can vary, but some function in .init_array must perform the replacement (in ADBI, by calling hook()).
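The constructor mechanism can be tried without any injection. The sketch below assumes GCC or Clang on a toolchain that uses .init_array; demo_init and demo_init_count are hypothetical names, and the hook installation is left as a comment.

```c
#include <assert.h>

static int init_calls = 0;   /* counts how often the constructor has run */

/* The constructor attribute asks the toolchain to register demo_init so
 * that the loader calls it: before main() in an executable, or inside
 * dlopen() when the code is built into a shared library. */
void __attribute__((constructor)) demo_init(void);

void demo_init(void)
{
    assert(init_calls == 0);   /* the loader runs each constructor once */
    init_calls++;
    /* a hooking library would install its hooks here */
}

/* Report how many times the constructor has run so far. */
int demo_init_count(void)
{
    return init_calls;
}
```

In a normal executable, demo_init_count() already returns 1 on the first line of main(), which mirrors how my_init has already run by the time dlopen returns.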

Finally, let's look at libinject:

1. It uses ptrace to inject into the PID process

2. It gets function addresses in the PID process using the first method above: fun address = addrfun - addrlib + addrpidlib

3. libinject does not implement function hooks itself; it implements the functionality of hijack, but in addition to calling dlopen it also executes a custom function (in the same way as ADBI)

In the next article, we'll look at how ADBI implements Dalvik hooks.

References:

1. Research on the ADBI hook framework on the Android platform (part 1)

2. Research on the ADBI hook framework on the Android platform (part 2)
