Transparent encryption and decryption on Android using hook technology to protect data

I. Preface

Users who keep important private files on Android devices usually rely on some kind of encrypted-storage software. The private-space applications available for phones all work in much the same way and share the same problem: to open a file you go through the private space, which decrypts the ciphertext to a temporary file and then lets you choose an application to open it. As a result, the user's important files exist in plaintext on the device, and there is a risk of leakage.

In addition, according to my research, in 360 Privacy Space, once an application has modified the temporary file, the changes cannot be encrypted back into the ciphertext, so the file is effectively encrypted only once. LBE is a relatively good private space, but its temporary files have a long lifetime.

Therefore, drawing on existing knowledge, this article discusses a transparent encryption and decryption method based on hook technology. It does not need to generate temporary files on the device, and so protects the user's important private data.

II. Technical Points

Android is an open-source system based on the Linux kernel. By language environment it can be divided into the Java layer, the native C layer, and the Linux kernel layer. Security work at the Java layer is written in Java on top of the SDK, and what it can implement is relatively limited. Security work at the Linux kernel layer has to start from the source code and requires compiling your own system, so it is not very portable. The native C layer is therefore the practical choice: through JNI development, more can be implemented using the functions that Linux provides.

Hooking an API here is similar to hooking on Linux: it uses the ptrace function and the PLT table. It can also be done with an inline hook, but that approach is less stable and harder to get right. In essence, both hijack function calls.

However, in Linux user mode each process has its own independent address space. You must therefore first attach to the address space of the process to be hooked, modify the code in its memory, and replace the symbol addresses in its procedure linkage table (PLT). The modified table lives only in the injected process, so a hook affects only that one process.

The ptrace function is intended for debugging programs and is very powerful. It can not only attach to a process (PTRACE_ATTACH) but also access the memory of the target process (PTRACE_PEEKDATA reads memory, PTRACE_POKEDATA writes memory) and even its registers (PTRACE_GETREGS, PTRACE_SETREGS).

The basic flow, which saves the registers and relies on a breakpoint interrupt, is:

① PTRACE_ATTACH: attach to the target process.

② PTRACE_GETREGS: read the register state of the target process and save it.

③ PTRACE_PEEKDATA together with PTRACE_POKEDATA: save the original code and write the code to be injected at the current execution position.

④ PTRACE_SETREGS: restore the register state and continue execution. The injected code now starts running in the target process and installs the hook. The process is similar to that on Windows.

⑤ After the hook is installed, the injected code executes int3, which is caught by ptrace, and the target process is paused again.

⑥ PTRACE_GETREGS: save the registers again.

⑦ PTRACE_PEEKDATA and PTRACE_POKEDATA: restore the original code.

⑧ PTRACE_SETREGS: restore the registers and continue execution of the target process.

⑨ PTRACE_DETACH: detach from the target process.
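The attach, peek, poke, and detach requests from the steps above can be tried out safely on a child process of your own. The sketch below is only an illustration, not the injector discussed in this article: it leaves out the register steps (which are architecture specific) and injects no code. The names g_word and poke_child_word are made up for the example. It returns -1 if the system's ptrace policy refuses the attach.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One word of memory that the child watches (hypothetical name). */
static volatile long g_word = 0;

/* Forks a child, attaches to it, overwrites g_word inside the child's
 * address space, and detaches. Returns the child's exit status, which
 * equals new_value on success, or -1 on any failure. */
int poke_child_word(long new_value) {
    assert(new_value > 0 && new_value < 256);   /* must fit an exit status */

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: spin until the tracer changes our copy of g_word. */
        while (g_word == 0) sched_yield();
        _exit((int)g_word);
    }

    int status = 0;
    int ok = 0;
    /* Attach; the child is stopped once waitpid reports it. */
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == 0 &&
        waitpid(pid, &status, 0) == pid && WIFSTOPPED(status)) {
        /* Read the current word; errno tells errors from data. */
        errno = 0;
        long old = ptrace(PTRACE_PEEKDATA, pid, (void *)&g_word, NULL);
        /* Write the new word into the child's memory. */
        if (errno == 0 && old == 0 &&
            ptrace(PTRACE_POKEDATA, pid, (void *)&g_word,
                   (void *)new_value) == 0) {
            ok = 1;
        }
        /* Detach and let the child continue. */
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
    }

    if (!ok) kill(pid, SIGKILL);                /* do not leave it spinning */
    if (waitpid(pid, &status, 0) != pid) return -1;
    return (ok && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

Because the child is a copy of the parent, &g_word is the same address in both processes, which is why the parent can pass its own pointer to ptrace. A real injector has to compute remote addresses instead.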

LBE's implementation and the articles on hooking ioctl on the Kanxue forum basically follow this process.

With the hook mechanism understood, transparent encryption and decryption on Android comes down to finding the dynamic library that holds the open and close symbols and hooking that library inside the application's process. On open, the ciphertext file is decrypted block by block into memory and a file descriptor for the in-memory plaintext is returned. On close, the plaintext in memory is encrypted back into the ciphertext file in local storage.

III. Key Processes

1. Read the Android source code to find the functions that open and close files. This is the key to achieving transparent encryption and decryption.

See http://blog.chinaunix.net/uid-26926660-id-3326678.html

It turns out that reading a file stream ultimately goes through the JNI read function, and every operation that opens a file ends up in the open function.

The dynamic library that backs the JNI side of the Java code is libnativehelper.so, so that is the library we need to hook.

Note: In earlier Android versions (Android 2.3 and Android 4.0), the open and close symbols are in libnativehelper.so, and the file size is measured in KB. In versions later than Android 4.0, Google rewrote the native library implementation and libnativehelper.so was split up. I developed on platforms up to Android 4.1 and did not read the source of later versions to locate the symbols there.

2. Process injection and ELF replacement

Process injection is a technique that copies a piece of code into a target process and then makes the target process execute it. Because such code is complicated to construct, in practice only a small stub is injected into the target process, and the code that does the real work is placed in a shared library, that is, a .so file. The injected stub is only responsible for loading this .so and executing functions in it. Because the functions in the .so run inside the target process, they can modify any memory in the target's address space, and they can also install hooks to change how the target process behaves.

Of course, not every process has permission to perform injection. Process injection on Android is based on ptrace(), and calling ptrace() on another application's process requires root permission. Mainstream security software on the market also manages and controls other application processes through process injection, which is why such software requires root permission.

For more information about how to implement .so injection, see LibInject on the Kanxue forum and shujia's open-source project Android Injector Library.

I have tried both. With LibInject, libhook.so is injected into the target process. First, ptrace() is called to suspend the process. Then the libraries loaded by the process are traversed to find the addresses of dlopen and dlsym. Next, the ARM registers are modified: the arguments and the .so path are pushed, the dlopen address found earlier is placed in a register, and a blx is executed directly, so that the target process calls dlopen to load our .so. In the same way, dlsym is used to call a function in the .so.

After the injection is completed, the hook_entry() function in libhook.so is called. This function loads the dynamic library that implements the hook functions, then traverses the GOT and PLT tables of libnativehelper.so and rewrites the entries so that they point to the symbols in the dynamic library that implements the replacement open and close functions. Two libraries therefore need to be injected: libhook.so is released once it has run and the target process has been detached, whereas the library that does the actual work must stay resident in memory. To keep it resident, it must be loaded explicitly inside hook_entry().

The injection process above was implemented by hand, which deepens one's understanding of the ELF format.

Using Android Injector Library is relatively simple: the injection is driven from the main function of an executable program.

That file is compiled into the injection entry point and the symbol table replacement logic, which works as follows:

* 1. In this function, load libhook.so and use the do_hook function to return the addresses of the original open and close functions and the addresses of the new open and close functions that will replace them.

* 2. Open the libnativehelper dynamic library as a file, read its structure, traverse the section table, and find the global offset table (GOT), which stores the addresses of externally referenced symbols.

* 3. Traverse the GOT to find the addresses of the original open and close functions, and replace them with the new open function and the new close function respectively.
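Step 3 can be pictured with an ordinary array of function pointers standing in for the GOT. This is a simplified sketch, not code from Android Injector Library: a real GOT lives in the target process, is located through the ELF headers, and may need its page protection changed before it can be written. The names open_fn, patch_slots, real_open_stub, hooked_open_stub, g_original_open and g_hook_calls are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified signature standing in for open(). */
typedef int (*open_fn)(const char *path, int flags);

static open_fn g_original_open = NULL;  /* saved so the hook can forward */
static int g_hook_calls = 0;            /* how often the hook has run */

/* Stand-in for the library's real open: always "returns fd 3". */
static int real_open_stub(const char *path, int flags) {
    (void)path; (void)flags;
    return 3;
}

/* Replacement: a real hook would decrypt here, then forward the call. */
static int hooked_open_stub(const char *path, int flags) {
    g_hook_calls++;
    return g_original_open(path, flags);
}

/* Scans `count` table slots and replaces every slot that equals old_fn
 * with new_fn. Returns the number of slots patched. */
size_t patch_slots(open_fn *slots, size_t count,
                   open_fn old_fn, open_fn new_fn) {
    assert(slots != NULL);
    size_t patched = 0;
    for (size_t i = 0; i < count; i++) {
        if (slots[i] == old_fn) {
            slots[i] = new_fn;
            patched++;
        }
    }
    return patched;
}
```

Callers that go through the table never notice the swap: they still call "slot 1", but control now passes through the hook before reaching the original function.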

3. Learning this requires an understanding of the Linux ELF format. Below are my notes on ELF, based on the book "Programmer's Self-Cultivation"; skip this part if you are already familiar with it.

ehdr->e_shstrndx is the index of the .shstrtab section, which is used to look up the name strings of the section headers. The .shstrtab section (Section Header String Table) stores the strings used by the section table, most commonly the section names.
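As a rough illustration of how e_shstrndx is used, the sketch below looks up a section by name in an ELF file on disk. It assumes a Linux system whose <link.h> provides the ElfW() macro, and it only handles files of the native word size; the function name find_section_index is made up for this example, and very large section counts (SHN_XINDEX) are not handled.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>    /* ElfW(): picks Elf32_* or Elf64_* for this platform */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns the index of the section called `name` in the ELF file at
 * `path`, or -1 if the file cannot be parsed or has no such section. */
int find_section_index(const char *path, const char *name) {
    assert(path != NULL && name != NULL);
    ElfW(Ehdr) ehdr;
    ElfW(Shdr) *shdrs = NULL;
    char *names = NULL;
    size_t names_len = 0;
    int found = -1;

    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;

    /* ELF header: check magic and class, then sanity-check the indices. */
    if (fread(&ehdr, sizeof ehdr, 1, f) != 1 ||
        memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] !=
            (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32) ||
        ehdr.e_shentsize != sizeof(ElfW(Shdr)) ||
        ehdr.e_shstrndx >= ehdr.e_shnum)
        goto done;

    /* Section header table: an array of e_shnum entries at e_shoff. */
    shdrs = malloc(ehdr.e_shnum * sizeof *shdrs);
    if (shdrs == NULL ||
        fseek(f, (long)ehdr.e_shoff, SEEK_SET) != 0 ||
        fread(shdrs, sizeof *shdrs, ehdr.e_shnum, f) != ehdr.e_shnum)
        goto done;

    /* .shstrtab: the section that e_shstrndx points at. */
    names_len = (size_t)shdrs[ehdr.e_shstrndx].sh_size;
    names = malloc(names_len + 1);
    if (names == NULL ||
        fseek(f, (long)shdrs[ehdr.e_shstrndx].sh_offset, SEEK_SET) != 0 ||
        fread(names, 1, names_len, f) != names_len)
        goto done;
    names[names_len] = '\0';

    /* sh_name is an offset into .shstrtab, not a pointer to a string. */
    for (int i = 0; i < ehdr.e_shnum; i++) {
        if (shdrs[i].sh_name < names_len &&
            strcmp(names + shdrs[i].sh_name, name) == 0) {
            found = i;
            break;
        }
    }

done:
    free(names);
    free(shdrs);
    fclose(f);
    return found;
}
```

On Linux, calling find_section_index("/proc/self/exe", ".shstrtab") inspects the running program itself.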

Common section names and their descriptions:

.rodata1       Read-only data, such as string constants and global const variables; same as .rodata.
.comment       Compiler version information.
.debug         Debugging information.
.dynamic       Dynamic linking information.
.hash          Symbol hash table.
.line          Line number table used for debugging.
.note          Additional compiler information, such as the company name and release version of the program.
.strtab        String table.
.symtab        Symbol table.
.shstrtab      Section header string table (section names).
.plt, .got     Dynamic linking jump table and global offset table.
.init, .fini   Program initialization and termination code.

Symbol sections. Check the type of each section: a section whose type is SHT_SYMTAB or SHT_DYNSYM is a symbol section. A symbol section stores a symbol table, which is also an array of structures stored contiguously.

The variables and functions used in a program can all be called symbols. An ELF file usually contains not one symbol section but two. One is named .dynsym and has type SHT_DYNSYM; all imported external symbols appear here. The other is named .symtab and has type SHT_SYMTAB; it holds all useful symbol information.

The symbol table stores the definition and reference information that a program needs for location and relocation. A symbol table index is a subscript into this array. The symbol table exists because linking combines multiple object files that reference each other's addresses, that is, the addresses of functions and variables. Functions and variables are collectively called symbols, and function names and variable names are symbol names. Symbols can be regarded as the glue of linking: the whole link process is carried out correctly on the basis of symbols. Like the section table, .symtab is an array. Each element is a fixed-size structure holding information about one symbol: the symbol name (not a string, but the index of the name in the string table), the value of the symbol (which may be an offset within a section or the symbol's virtual address), the size of the symbol (for data, the size of the data type), and so on. The symbol table records global symbols, such as global variables and global functions.

The symbol table of an object file contains the information needed to locate and relocate the program's symbol definitions and references. A symbol table entry is defined as follows:

typedef struct {
    Elf32_Word    st_name;
    Elf32_Addr    st_value;
    Elf32_Word    st_size;
    unsigned char st_info;
    unsigned char st_other;
    Elf32_Half    st_shndx;
} Elf32_Sym;

st_name holds an index into the string table (.strtab) from which the symbol name is obtained. st_value is the value of the symbol, which may be an absolute value or an address. st_size is the memory size associated with the symbol, such as the number of bytes in a data structure. st_info specifies the type and binding attributes of the symbol: whether the symbol is a data object, function, section, or source file name, and whether its binding is local, global, or weak.
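The st_info byte packs the binding into its upper four bits and the type into its lower four bits. A short sketch of unpacking it with the macros from <elf.h> follows; the helper names bind_name, type_name and is_global_function are invented for this example.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Human-readable binding taken from the upper 4 bits of st_info. */
const char *bind_name(unsigned char st_info) {
    switch (ELF32_ST_BIND(st_info)) {
    case STB_LOCAL:  return "local";
    case STB_GLOBAL: return "global";
    case STB_WEAK:   return "weak";
    default:         return "other";
    }
}

/* Human-readable type taken from the lower 4 bits of st_info. */
const char *type_name(unsigned char st_info) {
    switch (ELF32_ST_TYPE(st_info)) {
    case STT_NOTYPE:  return "notype";
    case STT_OBJECT:  return "object";
    case STT_FUNC:    return "function";
    case STT_SECTION: return "section";
    case STT_FILE:    return "file";
    default:          return "other";
    }
}

/* True for symbols such as open and close: global functions. */
int is_global_function(const Elf32_Sym *sym) {
    assert(sym != NULL);
    return ELF32_ST_BIND(sym->st_info) == STB_GLOBAL &&
           ELF32_ST_TYPE(sym->st_info) == STT_FUNC;
}
```

A hook that searches the symbol table for open or close would typically skip entries for which a check like is_global_function is false.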

GOT table and PLT table

Each entry in the GOT (Global Offset Table) is the address of a global variable or function referenced by the running module. The GOT can be used to reference global variables and functions indirectly, or the base address of the GOT can serve as a reference point from which static variables and functions are addressed by offset. Because the loader does not load a module at a fixed address, the absolute addresses and relative positions of modules differ between the address spaces of different processes. This difference is reflected in the GOT: each module of each process has its own GOT, so a GOT cannot be shared between processes.

Dynamic Link Mechanism

First, let us review the dynamic linking mechanism on Linux when a module A needs to call a function in another module B:

1. At compile time, module A writes the name of module B and the function name into its own symbol table.

2. At run time, a call from module A goes from the calling code to the PLT, then to the GOT, and then into module B.

Making sure that module A's code can jump through its PLT/GOT to the correct entry in module B is the job of the linker.

The standard Linux dynamic linker is ld.so, which supports lazy binding. That is, the code that module A generates at compile time for calling module B goes from the calling code to the PLT and then to the linker. The first time module B is called at run time, control first enters the linker, which loads module B according to the call information, looks up the symbol, and fills the function address it finds into the GOT. Subsequent calls go straight through the PLT/GOT. This mechanism reduces load-time overhead and is used by Linux distributions.

Although Android is based on Linux, its dynamic linker is not ld.so but its own linker, which does not support lazy binding. That is, on Android, when module A is loaded, the linker loads module B according to module A's .rel.plt table and string table, looks up the required function addresses, and fills in the GOT in advance. After that, every call goes directly through the PLT/GOT and never enters the linker again, although the PLT still keeps the code that jumps to the linker. This "eager" binding makes interception somewhat easier. However, if module B has not been loaded and its entry bound by the time the interception is set up, the interception fails.

To intercept calls from module A to module B, the general idea is to use ptrace to remotely inject and load a new interception module into the process of module A, search module A's GOT for the address of the call into module B, and change it to the address of a function in the new module. After the function in the new module has done its processing, it jumps on to module B.

The differences between the Android linker and the Linux linker lead to differences in memory layout, and they also cause the Linux injection and hook methods popular on the Internet to fail. Those methods search the PLTGOT area referenced by the dynamic section after ptrace injection and take the link_map from it to walk the modules loaded by the process and find the address of the function to hook. On Android, however, the first few items in the PLTGOT area are empty. Without the link_map data structure, the only option is to walk the modules by parsing /proc/<pid>/maps.
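Walking /proc/<pid>/maps to find where a module is loaded can be sketched as follows. This is one plausible way to do it, not the code of either injector mentioned above; parse_maps_line and find_module_base are hypothetical names, and paths containing spaces are cut at the first space.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>   /* getpid(), for inspecting the current process */

/* Parses one line of /proc/<pid>/maps, for example
 *   b6f00000-b6f05000 r-xp 00000000 b3:01 1234 /system/lib/libfoo.so
 * Returns 1 if the line names a file (path is filled in), otherwise 0. */
int parse_maps_line(const char *line, unsigned long *start,
                    unsigned long *end, char path[256]) {
    assert(line != NULL && start != NULL && end != NULL && path != NULL);
    char perms[5];
    /* Fields: range, permissions, offset, device, inode, pathname. */
    int n = sscanf(line, "%lx-%lx %4s %*s %*s %*s %255s",
                   start, end, perms, path);
    return n == 4;
}

/* Returns the start address of the first mapping whose path contains
 * `module`, or 0 if the module is not mapped or maps is unreadable. */
unsigned long find_module_base(pid_t pid, const char *module) {
    assert(module != NULL);
    char maps_path[64];
    char line[512];
    char path[256];
    unsigned long start = 0, end = 0, base = 0;

    snprintf(maps_path, sizeof maps_path, "/proc/%d/maps", (int)pid);
    FILE *f = fopen(maps_path, "r");
    if (f == NULL) return 0;

    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_maps_line(line, &start, &end, path) &&
            strstr(path, module) != NULL) {
            base = start;   /* first mapping of the module */
            break;
        }
    }
    fclose(f);
    return base;
}
```

For example, find_module_base(getpid(), "libnativehelper.so") would return the load address of that library in the current process, or 0 if it is not loaded. Reading the maps file of another application's process generally needs the same privileges as ptrace.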

4. Notes for reading the code

There are several points to note when reading the Android Injector Library code.

1) The SIGSEGV signal (invalid memory reference, that is, a segmentation fault) raised in the target is captured so that ptrace can regain control.

2) ptrace(PTRACE_PEEKTEXT, pid, addr, data)

Description: reads one word from a memory address. pid identifies the traced process, and addr gives the memory address. At the system call level, data is the address of a user variable that receives the word read; the C library wrapper instead returns the word as the result of the call.

On Linux (i386), the user code segment and the user data segment overlap, so reading from the code segment and reading from the data segment are handled identically.

3) The linker is mainly responsible for loading and linking shared libraries, and it supports both implicit and explicit calls from applications to library functions. libdl.so is looked up inside the already loaded /system/bin/linker. Its load location is fixed, and it defines dlopen, dlclose, dlsym, and dlerror.

4) Make sure you understand how the code relates dynsym to symtab.

5) The code reports an error while reading the dynsym symbols, but this does not affect usage. You can inspect the symbols with the readelf tool shipped with the Android SDK.

5. Write your own open and close functions for encryption and decryption.

This step uses OpenSSL EVP programming on the Android platform, which is not very difficult.

Key point 1 is how the key buffer is constructed. An array is recommended for the key buffer. With a char * string, even if there is a '\0' at the end of the string, other content in memory can affect key initialization, and unexpected problems may occur.
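One way to follow this advice is to keep the key in a fixed-size byte array that is zeroed before the key material is copied in, so that no leftover memory content ends up in the key. The sketch below assumes a 32-byte key (as a 256-bit cipher would need; the article does not say which cipher is used), and KEY_LEN and init_key are names chosen for the example. It only makes the buffer contents deterministic; it is not a key derivation function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KEY_LEN 32   /* assumption: 256-bit key */

/* Fills key[KEY_LEN] deterministically: every byte is zeroed first,
 * then at most KEY_LEN bytes of key material are copied in. The key is
 * treated as raw bytes, never as a NUL-terminated string. */
void init_key(unsigned char key[KEY_LEN],
              const void *material, size_t material_len) {
    assert(key != NULL && (material != NULL || material_len == 0));
    memset(key, 0, KEY_LEN);
    if (material_len > KEY_LEN) material_len = KEY_LEN;
    if (material_len > 0) memcpy(key, material, material_len);
}
```

The cipher then always receives exactly KEY_LEN well-defined bytes, whatever happened to be in memory before.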

Key point 2: close receives only a file descriptor as its parameter, so the file name has to be obtained from the descriptor.
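A common way to map a descriptor back to a path on Linux and Android is to read the symbolic link /proc/self/fd/<fd>. The sketch below shows that approach; it is not necessarily the code the author used, and fd_to_path is a name made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes the path behind `fd` into buf as a NUL-terminated string.
 * Returns the length of the path, or -1 if the descriptor is not open
 * or /proc is unavailable. Paths longer than buf_len - 1 bytes are
 * truncated, because readlink does not report the full length. */
int fd_to_path(int fd, char *buf, size_t buf_len) {
    assert(buf != NULL && buf_len > 1);
    char link[64];
    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);

    ssize_t n = readlink(link, buf, buf_len - 1);
    if (n < 0) return -1;
    buf[n] = '\0';          /* readlink does not terminate the string */
    return (int)n;
}
```

Inside a hooked close, such a helper has to run before the original close is called, since the link disappears once the descriptor is closed.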

Key point 3: when OpenSSL performs symmetric encryption and decryption, the data is padded to the block size, and the padding has to be stripped manually. You can use a standard padding scheme, or record the padding size in a header of the ciphertext file.

1) http://en.wikipedia.org/wiki/Padding_(cryptography)
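PKCS#7 is one widely used standard padding scheme: every padding byte holds the number of padding bytes, so the padding can be removed without storing its size separately. A minimal sketch follows, assuming a 16-byte block (as AES uses); BLOCK_SIZE, pkcs7_pad and pkcs7_unpad are example names. OpenSSL's EVP interface applies this kind of padding itself unless it is switched off with EVP_CIPHER_CTX_set_padding(ctx, 0), so doing it by hand mainly matters when padding is disabled or the blocks are handled separately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 16   /* assumption: AES-sized blocks */

/* Appends PKCS#7 padding in place. buf must have room for
 * len + BLOCK_SIZE bytes. Returns the padded length. */
size_t pkcs7_pad(unsigned char *buf, size_t len) {
    assert(buf != NULL);
    size_t pad = BLOCK_SIZE - (len % BLOCK_SIZE);   /* 1..BLOCK_SIZE */
    memset(buf + len, (int)pad, pad);
    return len + pad;
}

/* Validates the padding and returns the length of the data without it,
 * or (size_t)-1 if the padding is malformed. */
size_t pkcs7_unpad(const unsigned char *buf, size_t len) {
    assert(buf != NULL);
    if (len == 0 || len % BLOCK_SIZE != 0) return (size_t)-1;
    size_t pad = buf[len - 1];
    if (pad == 0 || pad > BLOCK_SIZE) return (size_t)-1;
    for (size_t i = len - pad; i < len; i++) {
        if (buf[i] != pad) return (size_t)-1;
    }
    return len - pad;
}
```

Note that data whose length is already a multiple of the block size still gains one full block of padding, which is what makes the stripping unambiguous.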

6. Remember to add the following to the Makefile.

LOCAL_LDLIBS += -L$(SYSROOT)/usr/lib -llog

LOCAL_LDLIBS += -L$(SYSROOT)/usr/lib -lcrypto

LOCAL_LDLIBS += -L$(SYSROOT)/usr/lib -lssl

7. A key management module also needs to be developed. That work is not described here.

IV. Summary

This solution achieves transparent encryption and decryption on the Android platform. Its disadvantage is that it needs root permission to attach to and hook user programs in advance. Performance is a bottleneck on mobile devices, so files should not be too large. doc, xls, pdf, ppt, txt and other document formats, as well as jpg images, are well supported. I did not test files in other formats.
