ELF PLT/GOT symbol redirection on Android and an ELF hook implementation

By Low-end code farm, 2014.10.27

Introduction

There are two main reasons for writing this technical article:

    • First, most articles on the web that describe the PLT/GOT symbol redirection process target x86; "Redirecting functions in shared ELF libraries" is a very well written example. The process on ARM is very similar, but because the CPU architecture is different, the instructions that implement it differ considerably.
    • Second, most online introductions to the ELF file format are based on the linking view, which parses the ELF by sections. When loading a shared library, however, the linker only looks at the segment information in the ELF. The section information can therefore be tampered with or even deleted entirely without affecting loading, and doing so prevents static analysis tools (such as IDA and readelf) from analyzing the file. Packed ELF files generally do this. To hook functions in such a file, symbols must be resolved based on the execution view.

Preparation

Before reading on, make sure you have a general understanding of the ELF file format and ARM assembly. For reference:

    • ELF file format analysis;
    • ARM documentation;

Tools:

    • readelf (included in the NDK)
    • objdump (included in the NDK)
    • IDA Pro 6.4 or later
    • A physical Android device or an emulator

Symbol redirection

On ARM there are three main relocation types that matter here: R_ARM_JUMP_SLOT, R_ARM_ABS32 and R_ARM_GLOB_DAT. To hook an ELF function, all three must be handled.

Example

Look at the sample code first

    typedef int (*strlen_fun)(const char *);

    strlen_fun global_strlen1 = (strlen_fun)strlen;
    strlen_fun global_strlen2 = (strlen_fun)strlen;

    #define SHOW(x) LOGI("%s is %d", #x, x)

    extern "C" jint Java_com_example_allhookinone_HookUtils_elfhook(JNIEnv *env, jobject thiz)
    {
        const char *str = "helloworld";

        strlen_fun local_strlen1 = (strlen_fun)strlen;
        strlen_fun local_strlen2 = (strlen_fun)strlen;

        int len0 = global_strlen1(str);
        int len1 = global_strlen2(str);
        int len2 = local_strlen1(str);
        int len3 = local_strlen2(str);
        int len4 = strlen(str);
        int len5 = strlen(str);

        SHOW(len0);
        SHOW(len1);
        SHOW(len2);
        SHOW(len3);
        SHOW(len4);
        SHOW(len5);

        return 0;
    }

This code calls strlen in three different ways: through global function pointers, through local function pointers, and directly. We analyze each of the three call paths in turn.

First, let's look at the relocation tables with readelf:

    Relocation section '.rel.dyn' at offset 0x2a48 contains entries:
     Offset     Info    Type            Sym.Value  Sym. Name
    0000ade0  00000017 R_ARM_RELATIVE
    0000af00  00000017 R_ARM_RELATIVE
    0000af0c  00000017 R_ARM_RELATIVE
    0000af10  00000017 R_ARM_RELATIVE
    0000af18  00000017 R_ARM_RELATIVE
    0000af1c  00000017 R_ARM_RELATIVE
    0000af20  00000017 R_ARM_RELATIVE
    0000af24  00000017 R_ARM_RELATIVE
    0000af28  00000017 R_ARM_RELATIVE
    0000af30  00000017 R_ARM_RELATIVE
    0000aefc  00003215 R_ARM_GLOB_DAT    00000000   __stack_chk_guard
    0000af04  00003715 R_ARM_GLOB_DAT    00000000   __page_size
    0000af08  00004e15 R_ARM_GLOB_DAT    00000000   strlen
    0000b004  00004e02 R_ARM_ABS32       00000000   strlen
    0000b008  00004e02 R_ARM_ABS32       00000000   strlen
    0000af14  00006615 R_ARM_GLOB_DAT    00000000   __gnu_Unwind_Find_exid
    0000af2c  00007415 R_ARM_GLOB_DAT    00000000   __cxa_call_unexpected
    ...

    Relocation section '.rel.plt' at offset 0x2ad0 contains entries:
     Offset     Info    Type            Sym.Value  Sym. Name
    0000af40  00000216 R_ARM_JUMP_SLOT   00000000   __cxa_atexit
    0000af44  00000116 R_ARM_JUMP_SLOT   00000000   __cxa_finalize
    0000af48  00001716 R_ARM_JUMP_SLOT   00000000   memcpy
    ...
    0000afd4  00004c16 R_ARM_JUMP_SLOT   00000000   fgets
    0000afd8  00004d16 R_ARM_JUMP_SLOT   00000000   fclose
    0000afdc  00004e16 R_ARM_JUMP_SLOT   00000000   strlen
    0000afe0  00004f16 R_ARM_JUMP_SLOT   00000000   strncmp
    ...

Across the .rel.plt and .rel.dyn sections, strlen appears four times in total. We record the key information for each entry now, because it is used throughout the analysis below:

    .rel.dyn  0000af08  R_ARM_GLOB_DAT
    .rel.dyn  0000b004  R_ARM_ABS32
    .rel.dyn  0000b008  R_ARM_ABS32
    .rel.plt  0000afdc  R_ARM_JUMP_SLOT
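The Info column packs the symbol index and the relocation type into one word. The following is a minimal sketch of that decoding; the helper names `r_sym` and `r_type` are illustrative and simply mirror the arithmetic of the standard ELF32_R_SYM and ELF32_R_TYPE macros. It shows that all four strlen rows share symbol index 0x4e and differ only in type.

```cpp
#include <cassert>
#include <cstdint>

// Relocation type numbers used in this article (ARM ELF ABI).
constexpr uint32_t kAbs32 = 0x02;
constexpr uint32_t kGlobDat = 0x15;
constexpr uint32_t kJumpSlot = 0x16;

// Same arithmetic as ELF32_R_SYM / ELF32_R_TYPE: high 24 bits, low 8 bits.
constexpr uint32_t r_sym(uint32_t info) { return info >> 8; }
constexpr uint32_t r_type(uint32_t info) { return info & 0xffu; }

// Checks the strlen rows copied from the readelf output above.
inline void check_strlen_rows() {
    assert(r_sym(0x00004e15) == 0x4e && r_type(0x00004e15) == kGlobDat);   // offset 0000af08
    assert(r_sym(0x00004e02) == 0x4e && r_type(0x00004e02) == kAbs32);     // offsets 0000b004, 0000b008
    assert(r_sym(0x00004e16) == 0x4e && r_type(0x00004e16) == kJumpSlot);  // offset 0000afdc
}
```

This is the same test the hook code later applies to every relocation entry: match the symbol index first, then the type.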

The code calls strlen six times, so why does it appear only four times in the relocation tables? And how do the entries correspond to the calls? With these questions in mind, let's analyze the assembly. Loading the compiled .so into IDA gives the following instructions for the sample code:

    .text:000050BC                 EXPORT Java_com_example_allhookinone_HookUtils_elfhook
    .text:000050BC Java_com_example_allhookinone_HookUtils_elfhook
    .text:000050BC
    .text:000050BC var_40          = -0x40
    .text:000050BC var_38          = -0x38
    .text:000050BC var_34          = -0x34
    .text:000050BC s               = -0x2C
    .text:000050BC var_28          = -0x28
    .text:000050BC var_24          = -0x24
    .text:000050BC var_20          = -0x20
    .text:000050BC var_1C          = -0x1C
    .text:000050BC var_18          = -0x18
    .text:000050BC var_14          = -0x14
    .text:000050BC var_10          = -0x10
    .text:000050BC var_C           = -0xC
    .text:000050BC
    .text:000050BC                 PUSH            {R4,LR}
    .text:000050BE                 SUB             SP, SP, #0x38
    .text:000050C0                 STR             R0, [SP,#0x40+var_34]
    .text:000050C2                 STR             R1, [SP,#0x40+var_38]
    .text:000050C4                 LDR             R4, =(_GLOBAL_OFFSET_TABLE_ - 0x50CA)
    .text:000050C6                 ADD             R4, PC          ; _GLOBAL_OFFSET_TABLE_
    .text:000050C8                 LDR             R3, =(aHelloworld - 0x50CE)
    .text:000050CA                 ADD             R3, PC          ; "helloworld"
    .text:000050CC                 STR             R3, [SP,#0x40+s]
    .text:000050CE                 LDR             R3, =(strlen_ptr - 0xAF34)
    .text:000050D0                 LDR             R3, [R4,R3]     ; __imp_strlen
    .text:000050D2                 STR             R3, [SP,#0x40+var_28]
    .text:000050D4                 LDR             R3, =(strlen_ptr - 0xAF34)
    .text:000050D6                 LDR             R3, [R4,R3]     ; __imp_strlen
    .text:000050D8                 STR             R3, [SP,#0x40+var_24]
    .text:000050DA                 LDR             R3, =(global_strlen1_ptr - 0xAF34)
    .text:000050DC                 LDR             R3, [R4,R3]     ; global_strlen1
    .text:000050DE                 LDR             R3, [R3]
    .text:000050E0                 LDR             R2, [SP,#0x40+s]
    .text:000050E2                 MOVS            R0, R2
    .text:000050E4                 BLX             R3
    .text:000050E6                 MOVS            R3, R0
    .text:000050E8                 STR             R3, [SP,#0x40+var_20]
    .text:000050EA                 LDR             R3, =(global_strlen2_ptr - 0xAF34)
    .text:000050EC                 LDR             R3, [R4,R3]     ; global_strlen2
    .text:000050EE                 LDR             R3, [R3]
    .text:000050F0                 LDR             R2, [SP,#0x40+s]
    .text:000050F2                 MOVS            R0, R2
    .text:000050F4                 BLX             R3
    .text:000050F6                 MOVS            R3, R0
    .text:000050F8                 STR             R3, [SP,#0x40+var_1C]
    .text:000050FA                 LDR             R2, [SP,#0x40+s]
    .text:000050FC                 LDR             R3, [SP,#0x40+var_28]
    .text:000050FE                 MOVS            R0, R2
    .text:00005100                 BLX             R3
    .text:00005102                 MOVS            R3, R0
    .text:00005104                 STR             R3, [SP,#0x40+var_18]
    .text:00005106                 LDR             R2, [SP,#0x40+s]
    .text:00005108                 LDR             R3, [SP,#0x40+var_24]
    .text:0000510A                 MOVS            R0, R2
    .text:0000510C                 BLX             R3
    .text:0000510E                 MOVS            R3, R0
    .text:00005110                 STR             R3, [SP,#0x40+var_14]
    .text:00005112                 LDR             R3, [SP,#0x40+s]
    .text:00005114                 MOVS            R0, R3          ; s
    .text:00005116                 BLX             strlen
    .text:0000511A                 MOVS            R3, R0
    .text:0000511C                 STR             R3, [SP,#0x40+var_10]
    .text:0000511E                 LDR             R3, [SP,#0x40+s]
    .text:00005120                 MOVS            R0, R3          ; s
    .text:00005122                 BLX             strlen
    .text:00005126                 MOVS            R3, R0
    ...
    .text:000051CA                 ADD             SP, SP, #0x38
    .text:000051CC                 POP             {R4,PC}
    .text:000051CC ; End of function Java_com_example_allhookinone_HookUtils_elfhook

First, note some important addresses:

    • _GLOBAL_OFFSET_TABLE_: 0x0000AF34
    • strlen_ptr: 0x0000AF08
    • __imp_strlen: 0x0000B0C8
    • global_strlen1_ptr: 0x0000AF0C
    • global_strlen1: 0x0000B004
    • global_strlen2_ptr: 0x0000AF10
    • global_strlen2: 0x0000B008

Calling external functions through global function pointers

The calls through global_strlen1 and global_strlen2 correspond to the BLX instructions at 0x000050E4 and 0x000050F4. Working through the instructions, the final R3 values are *global_strlen1 and *global_strlen2 respectively. The addresses of global_strlen1 and global_strlen2 (0x0000B004 and 0x0000B008) correspond exactly to the two R_ARM_ABS32 relocation entries in .rel.dyn. We conclude that when an external function is called through a global function pointer, its relocation type is R_ARM_ABS32 and the entry is located in the .rel.dyn section.

We only analyze the call through global_strlen1. Execution first locates global_strlen1_ptr (0x0000AF0C), which lies in the .got section, before _GLOBAL_OFFSET_TABLE_ (at a lower address). Through global_strlen1_ptr it reaches 0x0000B004 (located in the .data section), and through 0x0000B004 it reaches the final function address. The offset of an R_ARM_ABS32 relocation entry therefore points to the location that holds the final function address (that is, it is a pointer to a function pointer). The whole path goes to .got first, and then from .got to .data. Here is a hexadecimal dump of the relevant part of .got and .data:

    ...
    0000AF0C  04 B0 00 00 08 B0 00 00  DC B0 00 00 B4 87 00 00
    0000AF1C  F4 84 00 00 60 5B 00 00  58 5B 00 00 50 5B 00 00
    0000AF2C  EC B0 00 00 FC 8C 00 00  00 00 00 00 00 00 00 00
    ...
    0000B004  C8 B0 00 00 C8 B0 00 00  ?? ?? ?? ?? ?? ?? ?? ??
    0000B014  ?? ?? ?? ?? ?? ?? ?? ??  ?? ?? ?? ?? ?? ?? ?? ??
    0000B024  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
    ...
    0000B0C8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
    0000B0D8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
    ...

The bytes at 0x0000B0C8 turn out to be all zeros. At dynamic link time, the linker overwrites the value stored at 0x0000B004 so that it holds the real address of strlen instead of the current 0x0000B0C8 (a slightly roundabout arrangement).
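The pointer chain above can be replayed on a tiny model of memory. This is only a sketch under stated assumptions: the map holds just the two words from the hex dump that matter for global_strlen1, the function names are made up for illustration, the "resolved" address in the usage note is arbitrary, and `apply_abs32` models the relocation as a plain store, which is how the text describes it.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using Memory = std::map<uint32_t, uint32_t>;  // address -> 32-bit word

// Words taken from the hex dump above (little-endian).
inline Memory sample_image() {
    return {
        {0x0000AF0C, 0x0000B004},  // .got:  global_strlen1_ptr holds &global_strlen1
        {0x0000B004, 0x0000B0C8},  // .data: global_strlen1 holds the strlen placeholder
    };
}

// Mirrors 0x50DA..0x50DE: LDR R3,=(global_strlen1_ptr-0xAF34); LDR R3,[R4,R3]; LDR R3,[R3]
inline uint32_t target_via_global(const Memory& mem) {
    const uint32_t got = 0x0000AF34;               // R4 = _GLOBAL_OFFSET_TABLE_
    const uint32_t off = 0x0000AF0Cu - 0x0000AF34u; // negative offset, wraps like a 32-bit register
    const uint32_t var_addr = mem.at(got + off);   // first load: address of global_strlen1
    assert(var_addr == 0x0000B004);
    return mem.at(var_addr);                       // second load: value that BLX R3 jumps to
}

// Simplified model of the linker (or a hook) handling R_ARM_ABS32: store at r_offset.
inline void apply_abs32(Memory& mem, uint32_t r_offset, uint32_t resolved) {
    mem[r_offset] = resolved;
}
```

Calling `apply_abs32(mem, 0x0000B004, addr)` changes what `target_via_global(mem)` returns, which is exactly why patching the word at the relocation's r_offset redirects calls made through the global pointer.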

Calling external functions through local function pointers

The calls through local_strlen1 and local_strlen2 correspond to the BLX instructions at 0x00005100 and 0x0000510C. Working through the instructions, the final R3 value is *strlen_ptr, where strlen_ptr is at 0x0000AF08. That address corresponds exactly to the R_ARM_GLOB_DAT relocation entry in .rel.dyn. We conclude that when an external function is called through a local function pointer, its relocation type is R_ARM_GLOB_DAT and the entry is located in the .rel.dyn section.

We only analyze the call through local_strlen1. Execution first locates strlen_ptr (0x0000AF08), which lies in the .got section, before _GLOBAL_OFFSET_TABLE_, and through strlen_ptr it reaches 0x0000B0C8, the same result as in the analysis above. The offset of an R_ARM_GLOB_DAT relocation entry therefore also points to the location that holds the final function address (a pointer to a function pointer). Here is a hexadecimal dump of the relevant part of .got:

    0000AF08  C8 B0 00 00 04 B0 00 00  08 B0 00 00 DC B0 00 00
    0000AF18  B4 87 00 00 F4 84 00 00  60 5B 00 00 58 5B 00 00
    0000AF28  50 5B 00 00 EC B0 00 00  FC 8C 00 00 00 00 00 00
    ...
    0000B0C8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
    0000B0D8  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
    ...

Note the instruction at 0x000050D8, STR R3, [SP,#0x40+var_24] (and the matching store to var_28 at 0x000050D2). At this point the real function address has already been copied onto the stack, so modifying the GOT afterwards does not change the value on the stack. A call made through such an already-copied pointer therefore cannot be hooked by patching the relocated address.
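The effect of that early copy can be reproduced in ordinary C++ without any ELF machinery. In this sketch the static variable stands in for the GOT entry at 0x0000AF08, and `real_len` and `doubled_len` are hypothetical stand-ins for strlen and a hook; none of these names come from the original project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using len_fun = std::size_t (*)(const char*);

inline std::size_t real_len(const char* s) { return std::strlen(s); }         // plays the role of strlen
inline std::size_t doubled_len(const char* s) { return std::strlen(s) * 2; }  // plays the role of the hook

// Stand-in for the GOT slot after the linker has resolved it.
inline len_fun& fake_got_slot() {
    static len_fun slot = real_len;
    return slot;
}

// Copies the slot to a local first (like STR R3,[SP,#0x40+var_24]), then patches the slot.
inline std::size_t call_through_stale_copy(const char* s) {
    len_fun local = fake_got_slot();   // snapshot taken before the hook is installed
    assert(local == real_len);
    fake_got_slot() = doubled_len;     // hook patches the slot afterwards
    return local(s);                   // still calls the original function
}
```

After `call_through_stale_copy` returns, a fresh read of the slot yields the hook, so only code that re-reads the slot sees the replacement.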

Calling external functions directly

Finally, look at the direct calls to strlen, which correspond to the two BLX instructions at 0x00005116 and 0x00005122. Both end up at the following instructions in the .plt section:

    .plt:00002E38                 ADR             R12, 0x2E40
    .plt:00002E3C                 ADD             R12, R12, #0x8000
    .plt:00002E40                 LDR             PC, [R12,#(strlen_ptr_0 - 0xAE40)]! ; __imp_strlen
    ...
    0000AFDC  C8 B0 00 00 CC B0 00 00  D0 B0 00 00 D4 B0 00 00
    0000AFEC  D8 B0 00 00 DC B0 00 00  E0 B0 00 00 E4 B0 00 00
    0000AFFC  E8 B0 00 00 00 00 00 00  C8 B0 00 00 C8 B0 00 00
    ...

In the end the PC is loaded with *strlen_ptr_0. The address of strlen_ptr_0 is 0x0000AFDC, which lies in the .got section, and the value stored there is exactly 0x0000B0C8, a familiar number by now. We conclude that when an external function is called directly, its relocation type is R_ARM_JUMP_SLOT, the entry is located in the .rel.plt section, and its offset points to the location that holds the final function address (a pointer to a function pointer). The whole path goes to .plt first, then to .got, and finally to the real function address.

For this part of the analysis, IDA and objdump show somewhat different results. Here are the same instructions as disassembled by objdump:

    00002e38 <strlen@plt>:
        2e38:   e28fc600    add ip, pc, #0, 12
        2e3c:   e28cca08    add ip, ip, #8, 20  ; 0x8000
        2e40:   e5bcf19c    ldr pc, [ip, #412]! ; 0x19c
    ...
    ...
        afd8:   00002c50    andeq   r2, r0, r0, asr ip
        afdc:   00002c50    andeq   r2, r0, r0, asr ip
        afe0:   00002c50    andeq   r2, r0, r0, asr ip
        afe4:   00002c50    andeq   r2, r0, r0, asr ip

In objdump's view, the word at 0xAFDC is 0x00002C50, and 0x00002C50 is exactly PLT[0], whose instructions are:

    00002c50 <...>:
        2c50:   e52de004    push    {lr}        ; (str lr, [sp, #-4]!)
        2c54:   e59fe004    ldr lr, [pc, #4]    ; 2c60 <...>
        2c58:   e08fe00e    add lr, pc, lr
        2c5c:   e5bef008    ldr pc, [lr, #8]!
        2c60:   000082d4    ldrdeq  r8, [r0], -r4

After the instruction at 0x2C5C executes, the PC is loaded from 0x0000AF3C, which is exactly _GLOBAL_OFFSET_TABLE_ + 8, that is, GOT[2]. The contents at 0x0000AF3C are:

    0000AF3C  00 00 00 00 28 B0 00 00  24 B0 00 00 2C B0 00 00
    0000AF4C  30 B0 00 00 34 B0 00 00  38 B0 00 00 3C B0 00 00

It turns out that the function address stored in GOT[2] is actually 0. Symbol binding on Android does not support lazy binding, so when a .so is loaded the linker resolves the functions for all GOT[n] (n >= 3, the slots covered by .rel.plt) ahead of time. The code that would be reached through GOT[2] is therefore never executed, and the complete lazy PLT/GOT resolution path does not exist on Android today. My guess is that this is mainly for stability reasons.
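The two addresses quoted in this subsection (0xAFDC for the strlen slot and 0xAF3C for GOT[2]) follow directly from the objdump listings. As a small worked check, assuming only the usual ARM-state rule that reading PC yields the current instruction address plus 8 (the function names are illustrative):

```cpp
#include <cassert>
#include <cstdint>

// In ARM state, reading PC yields the address of the current instruction plus 8.
constexpr uint32_t arm_pc(uint32_t insn_addr) { return insn_addr + 8; }

// PLT stub: add ip, pc, #0 ; add ip, ip, #add_imm ; ldr pc, [ip, #ldr_imm]!
constexpr uint32_t stub_slot(uint32_t stub_addr, uint32_t add_imm, uint32_t ldr_imm) {
    return arm_pc(stub_addr) + add_imm + ldr_imm;
}

// PLT[0]: add lr, pc, lr (lr holds the literal word) ; ldr pc, [lr, #8]!
constexpr uint32_t plt0_slot(uint32_t add_insn_addr, uint32_t literal) {
    return arm_pc(add_insn_addr) + literal + 8;
}

inline void check_listing_arithmetic() {
    assert(arm_pc(0x2c54) + 4 == 0x2c60);                 // ldr lr, [pc, #4] reads the word 0x000082d4
    assert(arm_pc(0x2c58) + 0x000082d4 == 0xaf34);        // lr = _GLOBAL_OFFSET_TABLE_
    assert(plt0_slot(0x2c58, 0x000082d4) == 0xaf3c);      // GOT[2]
    assert(stub_slot(0x2e38, 0x8000, 0x19c) == 0xafdc);   // strlen's R_ARM_JUMP_SLOT offset
}
```

The last line ties the stub back to the relocation table: the slot the strlen stub jumps through is the r_offset of the R_ARM_JUMP_SLOT entry recorded earlier.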

Summary

IDA and objdump disassemble the PLT/GOT path somewhat differently, but on Android this difference has no effect, because lazy binding is not supported there. We also reach a very important conclusion: R_ARM_ABS32, R_ARM_GLOB_DAT and R_ARM_JUMP_SLOT arise from different code, but in every case the relocation offset is a pointer to a function pointer. This is exactly what the ELF hook below relies on.

Parsing ELF based on the execution view

The article "Redirecting functions in shared ELF libraries" gives an example of parsing ELF based on the linking view, and parsing based on the execution view is largely the same. The key is to locate .dynsym, .dynstr, .rel.plt and .rel.dyn, together with their entry counts, through segments.

First, find the segment of type PT_DYNAMIC through the program header table. It corresponds to the .dynamic section, which is an array of Elf32_Dyn with the following structure:

    /* Dynamic structure */
    typedef struct {
        Elf32_Sword d_tag;      /* controls meaning of d_val */
        union {
            Elf32_Word  d_val;  /* Multiple meanings - see d_tag */
            Elf32_Addr  d_ptr;  /* program virtual address */
        } d_un;
    } Elf32_Dyn;
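Locating that segment is a linear scan over the program headers. The sketch below is not the project's getSegmentInfo; `find_segment` and `dyn_count` are hypothetical helpers written against the standard `<elf.h>` types to show the idea.

```cpp
#include <elf.h>
#include <cassert>
#include <cstddef>

// Returns the first program header of the given type, or nullptr if there is none.
inline const Elf32_Phdr* find_segment(const Elf32_Phdr* phdr, std::size_t count, Elf32_Word type) {
    assert(phdr != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        if (phdr[i].p_type == type) return &phdr[i];
    }
    return nullptr;
}

// Number of Elf32_Dyn entries described by a PT_DYNAMIC program header.
inline std::size_t dyn_count(const Elf32_Phdr& dynamic) {
    return dynamic.p_memsz / sizeof(Elf32_Dyn);
}
```

For a library already mapped in memory, the table would typically start at base + e_phoff with e_phnum entries, and the Elf32_Dyn array at base + p_vaddr of the PT_DYNAMIC header. That assumes the first loadable segment has virtual address 0, which is common for shared objects but should be checked rather than taken for granted.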

By iterating through this array we can find all the information we need. The tags map to sections as follows:

    • DT_HASH -> .hash
    • DT_SYMTAB & DT_SYMENT -> .dynsym
    • DT_STRTAB & DT_STRSZ -> .dynstr
    • DT_PLTREL (decides REL or RELA) & (DT_REL | DT_RELA) & (DT_RELSZ | DT_RELASZ) & (DT_RELENT | DT_RELAENT) -> .rel.dyn
    • DT_JMPREL & DT_PLTRELSZ & (DT_RELENT | DT_RELAENT) -> .rel.plt
    • DT_FINI_ARRAY & DT_FINI_ARRAYSZ -> .fini_array
    • DT_INIT_ARRAY & DT_INIT_ARRAYSZ -> .init_array

Here is the relevant code for the lookup:

    void getElfInfoBySegmentView(ElfInfo &info, const ElfHandle *handle)
    {
        info.handle = handle;
        info.elf_base = (uint8_t *) handle->base;
        info.ehdr = reinterpret_cast<Elf32_Ehdr *>(info.elf_base);

        // may be wrong (section headers are not needed in this view)
        info.shdr = reinterpret_cast<Elf32_Shdr *>(info.elf_base + info.ehdr->e_shoff);
        info.phdr = reinterpret_cast<Elf32_Phdr *>(info.elf_base + info.ehdr->e_phoff);
        info.shstr = NULL;

        // Locate the PT_DYNAMIC segment and the Elf32_Dyn array it describes.
        Elf32_Phdr *dynamic = NULL;
        Elf32_Word size = 0;
        getSegmentInfo(info, PT_DYNAMIC, &dynamic, &size, &info.dyn);
        if (!dynamic) {
            LOGE("[-] couldn't find PT_DYNAMIC segment");
            exit(-1);
        }
        info.dynsz = size / sizeof(Elf32_Dyn);

        Elf32_Dyn *dyn = info.dyn;
        for (int i = 0; i < info.dynsz; i++, dyn++) {
            switch (dyn->d_tag) {
            case DT_SYMTAB:
                info.sym = reinterpret_cast<Elf32_Sym *>(info.elf_base + dyn->d_un.d_ptr);
                break;
            case DT_STRTAB:
                info.symstr = reinterpret_cast<const char *>(info.elf_base + dyn->d_un.d_ptr);
                break;
            case DT_REL:
                info.reldyn = reinterpret_cast<Elf32_Rel *>(info.elf_base + dyn->d_un.d_ptr);
                break;
            case DT_RELSZ:
                info.reldynsz = dyn->d_un.d_val / sizeof(Elf32_Rel);
                break;
            case DT_JMPREL:
                info.relplt = reinterpret_cast<Elf32_Rel *>(info.elf_base + dyn->d_un.d_ptr);
                break;
            case DT_PLTRELSZ:
                info.relpltsz = dyn->d_un.d_val / sizeof(Elf32_Rel);
                break;
            case DT_HASH: {
                // Layout: nbucket, nchain, bucket[nbucket], chain[nchain]
                uint32_t *rawdata = reinterpret_cast<uint32_t *>(info.elf_base + dyn->d_un.d_ptr);
                info.nbucket = rawdata[0];
                info.nchain = rawdata[1];
                info.bucket = rawdata + 2;
                info.chain = info.bucket + info.nbucket;
                break;
            }
            }
        }

        // because .dynsym is next to .dynstr, we can calculate symsz simply
        info.symsz = ((uint32_t) info.symstr - (uint32_t) info.sym) / sizeof(Elf32_Sym);
    }

There is one value I could not obtain from the PT_DYNAMIC segment: the number of entries in .dynsym. In the end I got it through a workaround. Because the .dynsym and .dynstr sections are adjacent, subtracting their two addresses gives the total length of .dynsym, and dividing that by sizeof(Elf32_Sym) gives the number of .dynsym entries. If you know a better way, please tell me.
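One possible alternative, offered as a suggestion and not as what the project does: the System V ELF specification states that the number of symbol table entries equals nchain in the DT_HASH table, so the count can be read from data the code above already parses. The same table also supports lookup by name. The sketch below uses the standard ELF hash function; `HashView`, `make_hash_view` and `find_symbol_index` are hypothetical names, and a library that ships only a GNU-style hash table has no DT_HASH entry, so this would not apply there.

```cpp
#include <elf.h>
#include <cassert>
#include <cstdint>
#include <cstring>

// Standard System V ELF hash function.
inline uint32_t elf_hash(const char* name) {
    uint32_t h = 0;
    for (const unsigned char* p = reinterpret_cast<const unsigned char*>(name); *p; ++p) {
        h = (h << 4) + *p;
        const uint32_t g = h & 0xf0000000u;
        if (g) h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

// View over a DT_HASH table: word 0 = nbucket, word 1 = nchain, then both arrays.
struct HashView {
    uint32_t nbucket;
    uint32_t nchain;          // per the ELF spec, equals the number of .dynsym entries
    const uint32_t* bucket;
    const uint32_t* chain;
};

inline HashView make_hash_view(const uint32_t* raw) {
    assert(raw != nullptr);
    return HashView{raw[0], raw[1], raw + 2, raw + 2 + raw[0]};
}

// Returns the symbol index for `name`, or 0 (STN_UNDEF) if it is not present.
inline uint32_t find_symbol_index(const HashView& hv, const Elf32_Sym* sym,
                                  const char* strtab, const char* name) {
    if (hv.nbucket == 0) return 0;
    for (uint32_t i = hv.bucket[elf_hash(name) % hv.nbucket]; i != 0; i = hv.chain[i]) {
        if (std::strcmp(strtab + sym[i].st_name, name) == 0) return i;
    }
    return 0;
}
```

With this, `info.symsz` could be taken from `nchain` directly, which avoids depending on .dynsym and .dynstr being adjacent in memory.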

ELF Hook

With the background above, writing an ELF hook is straightforward. Here is the key code:

    #define R_ARM_ABS32     0x02
    #define R_ARM_GLOB_DAT  0x15
    #define R_ARM_JUMP_SLOT 0x16

    int elfhook(const char *soname, const char *symbol, void *replace_func, void **old_func)
    {
        assert(old_func);
        assert(replace_func);
        assert(symbol);

        ElfHandle *handle = openElfBySoname(soname);
        ElfInfo info;

        getElfInfoBySegmentView(info, handle);

        Elf32_Sym *sym = NULL;
        int symidx = 0;

        findSymByName(info, symbol, &sym, &symidx);

        if (!sym) {
            LOGE("[-] Could not find symbol %s", symbol);
            goto fails;
        } else {
            LOGI("[+] sym %p, symidx %d.", sym, symidx);
        }

        // Direct calls: patch the R_ARM_JUMP_SLOT entry in .rel.plt.
        for (int i = 0; i < info.relpltsz; i++) {
            Elf32_Rel &rel = info.relplt[i];
            if (ELF32_R_SYM(rel.r_info) == symidx && ELF32_R_TYPE(rel.r_info) == R_ARM_JUMP_SLOT) {
                void *addr = (void *) (info.elf_base + rel.r_offset);
                if (replaceFunc(addr, replace_func, old_func))
                    goto fails;

                // only once
                break;
            }
        }

        // Function pointers: patch every R_ARM_ABS32 / R_ARM_GLOB_DAT entry in .rel.dyn.
        for (int i = 0; i < info.reldynsz; i++) {
            Elf32_Rel &rel = info.reldyn[i];
            if (ELF32_R_SYM(rel.r_info) == symidx &&
                    (ELF32_R_TYPE(rel.r_info) == R_ARM_ABS32 ||
                     ELF32_R_TYPE(rel.r_info) == R_ARM_GLOB_DAT)) {
                void *addr = (void *) (info.elf_base + rel.r_offset);
                if (replaceFunc(addr, replace_func, old_func))
                    goto fails;
            }
        }

    fails:
        closeElfBySoname(handle);
        return 0;
    }
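The listing calls replaceFunc without showing it. The following is only a guess at what such a helper needs to do, written as a hypothetical `replace_slot` using POSIX `mprotect`; the project's real implementation may differ. It follows the convention visible in the listing, where a non-zero return means failure.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for replaceFunc: overwrite one pointer-sized slot.
// Returns 0 on success and non-zero on failure.
inline int replace_slot(void** slot, void* replacement, void** old_value) {
    assert(slot != nullptr && replacement != nullptr);

    const long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;

    // The slot may sit in a page that is currently read-only, so make the page
    // writable first. A naturally aligned pointer never straddles two pages.
    const std::uintptr_t start =
        reinterpret_cast<std::uintptr_t>(slot) & ~static_cast<std::uintptr_t>(page - 1);
    if (mprotect(reinterpret_cast<void*>(start), static_cast<std::size_t>(page),
                 PROT_READ | PROT_WRITE) != 0) {
        return -1;
    }

    if (*slot == replacement) return 0;        // already patched; keep the saved original
    if (old_value != nullptr) *old_value = *slot;
    *slot = replacement;
    return 0;
}
```

The sketch leaves the page writable afterwards and does not restore the previous protection; a production version would probably want to record and restore it.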


Finally, here is the test code:

    typedef int (*strlen_fun)(const char *);
    strlen_fun old_strlen = NULL;

    // Replacement for strlen: logs the call and doubles the real length.
    size_t my_strlen(const char *str)
    {
        LOGI("strlen was called.");
        int len = old_strlen(str);
        return len * 2;
    }

    strlen_fun global_strlen1 = (strlen_fun)strlen;
    strlen_fun global_strlen2 = (strlen_fun)strlen;

    #define SHOW(x) LOGI("%s is %d", #x, x)

    extern "C" jint Java_com_example_allhookinone_HookUtils_elfhook(JNIEnv *env, jobject thiz)
    {
        const char *str = "helloworld";

        strlen_fun local_strlen1 = (strlen_fun)strlen;
        strlen_fun local_strlen2 = (strlen_fun)strlen;

        int len0 = global_strlen1(str);
        int len1 = global_strlen2(str);
        int len2 = local_strlen1(str);
        int len3 = local_strlen2(str);
        int len4 = strlen(str);
        int len5 = strlen(str);

        LOGI("hook before:");
        SHOW(len0);
        SHOW(len1);
        SHOW(len2);
        SHOW(len3);
        SHOW(len4);
        SHOW(len5);

        elfhook("libonehook.so", "strlen", (void *)my_strlen, (void **)&old_strlen);

        len0 = global_strlen1(str);
        len1 = global_strlen2(str);
        len2 = local_strlen1(str);
        len3 = local_strlen2(str);
        len4 = strlen(str);
        len5 = strlen(str);

        LOGI("hook after:");
        SHOW(len0);
        SHOW(len1);
        SHOW(len2);
        SHOW(len3);
        SHOW(len4);
        SHOW(len5);

        return 0;
    }

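The printed results show that, as described above, local_strlen1 and local_strlen2 are not affected by the hook within this call. If the function is called again, however, the hook does take effect for them, and I have not yet pinned down why. (One explanation consistent with the listing above: the local pointers are reloaded from the GOT entry at 0x0000AF08 on every entry to the function, so a later call picks up the patched value.) I won't post the test output here; try it yourself.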

GitHub address

For the complete code, see https://github.com/boyliang/AllHookInOne.git
