Introduction to Linux Kernel Engineering, Processes: ELF File Execution Principles (2)

Last Update:2018-07-26 Source: Internet

Author: User

ELF strong symbols and weak symbols (this section is quoted from another source)

In programming we often run into a situation called symbol redefinition. If several object files contain definitions of a global symbol with the same name, linking those object files fails with a duplicate-definition error. For example, if object files A and B both define a global integer variable named global and both initialize it, the linker reports an error when linking A and B:

b.o:(.data+0x0): multiple definition of `global'
a.o:(.data+0x0): first defined here

A symbol definition of this kind is called a strong symbol. Some symbol definitions are instead weak symbols.

In C/C++, the compiler treats functions and initialized global variables as strong symbols by default, and uninitialized global variables as weak symbols. (Strictly, an uninitialized global is a tentative definition that behaves this way only when compiled with -fcommon; GCC 10 and later default to -fno-common, under which duplicate uninitialized globals are also reported as multiple definitions.) We can also turn any strong symbol into a weak one with GCC's "__attribute__((weak))". Note that strong and weak apply to symbol definitions, not to symbol references. For example, consider the following program:

extern int ext;
int weak;
int strong = 1;
__attribute__((weak)) int weak2 = 2;

int main(void)
{
    return 0;
}

In this program, "weak" and "weak2" are weak symbols, "strong" and "main" are strong symbols, and "ext" is neither strong nor weak because it is only a reference to an external variable.

Based on the concept of strong and weak symbols, the linker handles global symbols that are defined more than once according to the following rules:

· Rule 1: A strong symbol may not be defined more than once (that is, different object files may not contain strong symbols with the same name). If a strong symbol has several definitions, the linker reports a duplicate-definition error.

· Rule 2: If a symbol is strong in one object file and weak in all the others, the strong symbol is chosen.

· Rule 3: If a symbol is weak in every object file, the definition that occupies the most space is chosen. For example, if object file A defines the global variable global as an int, occupying 4 bytes, and object file B defines global as a double, occupying 8 bytes, then after A and B are linked the symbol global occupies 8 bytes. (Try not to use weak symbols of several different types, otherwise program errors that are hard to find can easily result.)

Weak references and strong references

When object files are finally linked into an executable, the symbol references to external object files that we have seen so far must be resolved correctly. If no definition of the symbol is found, the linker reports an undefined-symbol error. Such a reference is called a strong reference. There is also the weak reference: if the symbol is defined, the linker resolves the reference to it; if the symbol is not defined, the linker does not report an error. The linker handles strong and weak references almost identically, except that it does not treat an undefined weak reference as an error. Typically the linker gives an undefined weak reference the value 0, or some other special value, so that program code can recognize it.

In GCC, we can declare a reference to an external function as a weak reference by applying the extension attribute "__attribute__((weak))" to its declaration. (GCC also has "__attribute__((weakref))", but it is accepted only on static declarations that name an alias target, so the plain weak attribute is used here.) For example:

__attribute__((weak)) void foo(void);

int main(void)
{
    foo();
    return 0;
}

We can compile this into an executable and GCC reports no link error. But running the executable produces a run-time error: when main tries to call foo, the address of foo is 0, so an illegal address access occurs. An improved version is:

__attribute__((weak)) void foo(void);

int main(void)
{
    if (foo)
        foo();
    return 0;
}

Weak symbols and weak references are very useful for libraries. For example, a weak symbol defined in a library can be overridden by a user-defined strong symbol, so that a program can use its own version of a library function. A program can also declare its references to certain extension modules as weak references: when the extension module is linked with the program, its functions work normally; when some modules are removed, the program still links correctly and merely lacks the corresponding features. This makes a program's features easier to trim and combine.

In Linux program design, if a program is meant to support both single-threaded and multi-threaded modes, a weak reference can be used to determine whether the program was linked against the single-threaded glibc or the multi-threaded one (the -lpthread option at build time), and to run the single-threaded or the multi-threaded version accordingly. We can declare a weak reference to pthread_create in the program; at run time the program then checks whether the pthread library was linked in and decides which version to execute.

Audit Interface

When linking, you can pass the --audit auditlib option to ld, which creates a DT_AUDIT entry in the dynamic section. When ld.so sees this entry, it calls the glibc audit interface functions implemented by the library named there: when a particular event occurs, the corresponding function is looked up in that library and executed. For example, when a program calls dlopen to open a dynamic library, the event invokes the la_objopen() function in the specified library.

This interface is meant for gathering link-level statistics and is mainly used by developers of the linker. Other developers can also use it, through a separately distributed library, to manage and replace function implementations. For example, when a library is loaded, the audit interface is called first, and inside the audit interface we can replace the library that is about to be opened.
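
As a rough illustration, the following is a minimal sketch of an audit library, assuming glibc's rtld-audit interface as declared in <link.h>. It implements only la_version(), which ld.so calls first to agree on the interface version, and la_objopen(), the hook mentioned above. The message text is made up for illustration.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/* Called once by ld.so when the audit library is loaded. Returning
 * LAV_CURRENT reports the audit interface version this code was built for. */
unsigned int la_version(unsigned int version)
{
    (void)version;
    return LAV_CURRENT;
}

/* Called each time a new object is loaded, including via dlopen(). */
unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    (void)lmid;
    (void)cookie;
    fprintf(stderr, "audit: loaded %s\n",
            (map->l_name && map->l_name[0]) ? map->l_name : "(main program)");
    /* Ask ld.so to audit symbol bindings to and from this object. */
    return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}
```

Built as a shared object (for instance with gcc -shared -fPIC), such a library can also be activated without relinking by naming it in the LD_AUDIT environment variable.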

Dynamic Library

At the core of a dynamic library are two levels of code sharing. First, compiled binaries do not each need to contain their own copy of the library. Second, at execution time the library's code segment needs to be loaded only once: when another process later uses the same library, the kernel can map the already loaded pages of the code segment directly into that process, so the code segment is not loaded again (the code segment is read-only).

Once the library is loaded into memory, the CPU must execute it using relative or absolute addresses. Although each function in the code segment has the same physical address for everyone, the address it is mapped to differs from one process's address space to another. So a dynamic library needs a symbol table recording each symbol's offset within the code segment, plus one global offset, namely the offset of the whole library (and thus of its entire symbol table) within the process address space.
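
The per-process "global offset" of each library can be observed at run time. The sketch below assumes glibc's dl_iterate_phdr(), which invokes a callback for every loaded ELF object; the helper names print_object and list_loaded_objects are hypothetical.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Callback invoked once per loaded ELF object. dlpi_addr is the base
 * address that was added to the object's link-time addresses. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void)size;
    printf("base=%#lx  %s\n", (unsigned long)info->dlpi_addr,
           (info->dlpi_name && info->dlpi_name[0]) ? info->dlpi_name
                                                   : "(main program)");
    (*count)++;
    return 0;   /* 0 means: continue with the next object */
}

/* Prints every loaded object and returns how many were visited. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Running the same program twice with ASLR enabled would normally show different base addresses for the shared libraries, while the offsets of symbols inside each library stay the same.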

An executable that uses a dynamic library contains code that calls symbols inside it. The linker cannot resolve the actual addresses of these symbols at link time (because it does not really link the library in), so they are only placeholders in the binary, and their addresses have to be filled in after the dynamic library has been loaded. These calls are scattered throughout the program, and searching the whole program after loading to find every unresolved symbol would clearly be impractical, so a table is needed that records all the unresolved symbols.

Once execution has reached any instruction inside a dynamic library in memory, other symbols of the same library can be found by offset, because the offsets between symbols within one library are known to the library. Unfortunately, i386 does not support addressing relative to the currently executing instruction (the PC). If it did, things would be simple: no relocation would be needed, only an offset from the code being executed. x64 does support it, so ELF may gradually change somewhat in the x64 era.

The requirements of executables and of dynamic libraries described above meet in the .got and .plt sections: .got resolves external data, and .plt resolves external functions. When an ELF file is linked, the symbols it uses from dynamic libraries are placed in these two tables. When the dynamic library is loaded, the loader locates the tables and fills in the in-memory addresses of the corresponding symbols of the loaded library, which completes the symbol binding. This also solves the problem that the load position of a dynamic library is not fixed.

Function Call Stack

x86 and x64 provide stack pointer registers but do not specify how the stack is to be used: the order in which parameters are pushed, where the return value is placed, or how much space is left between two calls.

In the x86 era, the common calling conventions were stdcall, thiscall, fastcall, and cdecl, which differ in how they use the stack. In the x64 era only one, a fastcall-style convention, remains. Take stdcall as an example. It means: (1) parameters are pushed onto the stack from right to left; (2) the called function itself cleans up the stack; (3) the function name is automatically decorated with a leading underscore and a trailing @ followed by the size of the parameters. stdcall became well known through its early use with Pascal. The default for C is cdecl. Its parameter order is the same as stdcall's, pushed from right to left. The difference is that the function does not clean the stack itself; the caller is responsible for cleaning it. Thanks to this change, the C calling convention allows functions with a variable number of parameters, which is a major feature of the C language. thiscall exists to pass the this pointer implicitly in object-oriented calls, so it is the default convention for C++ member functions; its parameters are also pushed from right to left.

fastcall passes parameters in registers. Because the x64 environment has many registers, the first 4 integer and floating-point parameters go into registers and the rest go on the stack (4 is the count in the Microsoft x64 convention; the System V AMD64 ABI used on Linux passes up to 6 integer and 8 floating-point arguments in registers). Using registers can noticeably speed up calls, so when writing code, try to keep functions to 4 parameters or fewer. The x64 convention also keeps the flexibility of cdecl: the caller cleans up the stack, so variable argument lists are still possible. When you inspect the stack you may notice an extra area: x64 by default reserves a backup space on the stack, which is convenient for core dump analysis. This space holds the register contents at each function call. With compiler optimizations turned on, this space is generally not preserved.
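
Caller-side cleanup is what makes variable argument lists workable: only the caller knows how many arguments it actually passed. A minimal sketch using the standard <stdarg.h> macros follows; the function name sum_ints is hypothetical.

```c
#include <stdarg.h>

/* Sums `count` int arguments. The callee learns the number of arguments
 * only through `count`; it cannot discover it from the stack itself,
 * which is why the caller must be the one to clean the arguments up. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) passes four arguments, while sum_ints(0) passes one, and the same compiled function serves both.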

Dynamic Library Load

When an ELF object takes part in dynamic linking, it contains a PT_DYNAMIC segment, which holds the .dynamic section. The .rel.plt section holds the relocations for functions, and the .rel.dyn section holds the relocations for variables. The .got section stores the global offset table for variables, and the .got.plt section stores the global offset table for functions. The .dynsym section contains the dynamic linking symbol table. The .plt section is the procedure linkage table, which redirects position-independent function calls to absolute locations.

Some of the C library functions a program imports may never be called before the program ends. ELF therefore uses lazy binding: the real address is looked up and bound the first time the C library function is called. Binding can also be configured to happen at load time (as is done by the RELRO defensive technique).

An application consists of one main ELF binary (the executable) and several dynamic libraries, all in ELF format. Each ELF object consists of multiple segments, and each segment contains one or more sections. There appear to be many sections, but most of them are very simple. Each section basically stores only one kind of data; .dynstr, for example, contains nothing but strings. Each section should be understood as a table rather than as an array of structs: a table holds data of the same structure, and basically only one or two kinds of data, whereas a struct is a more horizontal concept and may span more than one table.

For example, .plt contains stub functions for the external symbols that need to be resolved. When an ELF object accesses an external function, it first enters the corresponding stub in .plt, which goes through the corresponding entry in .got.plt. On the first call the symbol is resolved and its address is written into .got.plt, so that later calls through the stub can take the address directly from .got.plt. This is the principle of lazy binding.

Each entry in .rel.plt points to a .dynsym entry, and each .dynsym entry points to a .dynstr entry. .dynstr contains only strings. A .dynsym entry stores the symbol's virtual address (which is naturally 0 before execution), as well as the symbol's type and binding.

An ELF file has two kinds of sections: allocatable and non-allocatable. Allocatable sections are loaded into memory at run time; non-allocatable sections are for debuggers and serve no purpose during execution. The strip program can remove the parts not needed at run time, making the file smaller. There are two symbol tables, .symtab and .dynsym. .dynsym is a subset of .symtab: .dynsym is required at run time, while .symtab is needed only for debugging, so strip can remove .symtab.
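
The split between allocatable and non-allocatable sections can be checked by reading the section header table and testing the SHF_ALLOC flag. The following is a sketch under two assumptions: the file has the same ELF class and byte order as the running program, and the helper name count_sections is hypothetical.

```c
#define _GNU_SOURCE
#include <elf.h>
#include <link.h>   /* ElfW(): selects Elf32_* or Elf64_* for the native word size */
#include <stdio.h>
#include <string.h>

/* Counts sections with and without SHF_ALLOC in the ELF file at `path`.
 * Returns 0 on success, -1 on any error. */
int count_sections(const char *path, int *alloc, int *non_alloc)
{
    ElfW(Ehdr) eh;
    ElfW(Shdr) sh;
    FILE *f = fopen(path, "rb");
    int ok = -1;

    *alloc = *non_alloc = 0;
    if (!f)
        return -1;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        goto out;
    /* Assumption: the file uses the same ELF class as this program. */
    if (eh.e_ident[EI_CLASS] != (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32))
        goto out;

    for (unsigned i = 1; i < eh.e_shnum; i++) {   /* entry 0 is the null section */
        long off = (long)eh.e_shoff + (long)i * eh.e_shentsize;
        if (fseek(f, off, SEEK_SET) != 0 || fread(&sh, sizeof sh, 1, f) != 1)
            goto out;
        if (sh.sh_flags & SHF_ALLOC)
            (*alloc)++;
        else
            (*non_alloc)++;
    }
    ok = 0;
out:
    fclose(f);
    return ok;
}
```

Calling it on a binary before and after running strip should show the non-allocatable count shrinking while the allocatable count stays the same.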

http://www.inforsec.org/wp/?p=389

ELF Security

With root privileges, the kernel holds no secrets. Most people who have root access still see very little, not because Linux withholds the information, but because they have not found the way to look at it.

For example, we can view the contents of any physical memory by opening the /dev/mem device, mapping it into our program with mmap, and reading it directly. We can also view any kernel data (not just the information exposed through the proc and sys file systems) by opening the /dev/kmem device and reading it directly.

Each process can view all of its own readable memory through /proc/<pid>/mem. A plain cat of this file always fails, because not all of the address space is mapped by the process, in particular the region at the start of the file. We therefore need to look up the actual mappings in the /proc/<pid>/maps file and then seek to the corresponding offset before reading. There is basically no demand for this, because a process can of course read all of its own memory from within the program anyway.
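
The seek-then-read pattern can be sketched for the simplest case, a process reading its own memory through /proc/self/mem (the self link always refers to the calling process). Instead of parsing the maps file, this sketch takes an address that is already known to be mapped; the helper name read_own_memory is hypothetical.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Copies `len` bytes located at virtual address `addr` of the calling
 * process into `out`, by reading /proc/self/mem at file offset `addr`.
 * Returns 0 on success, -1 on failure. */
int read_own_memory(const void *addr, void *out, size_t len)
{
    int fd = open("/proc/self/mem", O_RDONLY);
    if (fd < 0)
        return -1;
    /* The file offset is the virtual address itself. */
    ssize_t n = pread(fd, out, len, (off_t)(uintptr_t)addr);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

Reading at an offset that is not mapped fails, which is the reason a plain cat from offset 0 does not work.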

Because /proc/<pid>/mem is readable only by the process itself, another process must ptrace the target before it can read the file. Root may read it, but the target program still has to be stopped first. Reading memory directly is also not ideal because of races; the gcore command can reliably dump the entire memory to a file.

Later kernels have gradually restricted memory access through /dev/mem with features such as CONFIG_STRICT_DEVMEM and CONFIG_IO_STRICT_DEVMEM, so on newer kernels memory is no longer so easy to access.

Executable Stack

The earliest ELF attack placed shellcode on the stack, and precisely for that reason it was the first to be defended against. Today the stack generally has no execute permission. But this permission is controlled by a header in the ELF file itself, so anyone who can modify the ELF file can get around it. When compiling with GCC, you can enable execute permission on the stack with -z execstack, or change it afterwards with the execstack shell command.

GCC supports a C extension that allows nested function definitions. It is implemented by placing code for the nested function on the stack, which requires an executable stack. If the code uses this feature, the stack's execute permission is turned on, so code with high security requirements should not use nested functions. By default the compiler turns off execute permission in the corresponding stack header, GNU_STACK. Anyone with permission to modify the file can break this protection.
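
Whether a program was built with an executable stack can be checked from inside the program by looking at the PF_X flag of its PT_GNU_STACK program header. The sketch below assumes glibc's dl_iterate_phdr(), which reports the main program first; the helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <elf.h>
#include <link.h>

/* Callback: inspects only the first object reported, the main program. */
static int check_stack_flags(struct dl_phdr_info *info, size_t size, void *data)
{
    int *result = data;
    (void)size;
    for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type == PT_GNU_STACK) {
            *result = (info->dlpi_phdr[i].p_flags & PF_X) ? 1 : 0;
            break;
        }
    }
    return 1;   /* non-zero stops the iteration after the main program */
}

/* Returns 1 if the main program requests an executable stack,
 * 0 if it does not, and -1 if it has no PT_GNU_STACK header. */
int stack_marked_executable(void)
{
    int result = -1;
    dl_iterate_phdr(check_stack_flags, &result);
    return result;
}
```

A program linked with -z execstack would be expected to report 1 here, and a default build 0.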

The compiler adds further protection. One form is a gap next to the stack: the gap is unmapped memory, and any access to it causes a segmentation fault, so much injected shellcode crashes the program and the injection fails. The other form inserts a value chosen by GCC onto the stack; if an overflow modifies it, the check fails and the function __stack_chk_fail is called, which makes the program exit. The latter is called a stack canary.

Return to libc

A return-to-libc attack is a kind of computer security attack. It is typically applied after a buffer overflow: the return address on the stack is replaced with the address of another function, and part of the stack is overwritten to supply that function's parameters. This lets an attacker call an existing function without injecting malicious code into the program. The shared library called libc provides the C run-time support on UNIX-like operating systems. Although the attacker can make the code return anywhere, in the vast majority of cases the target is libc, because libc is always linked into the program and offers functions that are quite useful to an attacker (such as system(), which needs only one argument to execute an external program). This is why the attack is still called return-to-libc even when the return address points to a completely different region.

The attack requires the exact address of system(). Today libc is typically loaded as a dynamic library at a random address, so the attacker generally has to probe for this address first. The probing relies on the dynamic linking state kept by ld.so: because ld.so has to load and resolve the dynamic libraries for the program, it provides a ready-made entry point.

This attack is usually used when code on the stack cannot be executed. Machines today generally set the NX bit on the stack, which prevents stack data from being executed. This can be reversed by changing the default RW flags of the ELF file's GNU_STACK header to RWE so that the stack can execute code. If the stack can execute code, return-to-libc becomes unnecessary.

Non-fixed location compilation

ASLR can load dynamic libraries at random memory addresses, which makes an attacker's debugging harder. In most cases, however, the executable itself has a fixed start address, which makes things easier for the attacker. There is a way to make this address random as well: PIE (Position Independent Executable). The binary is compiled as a position-independent file, and the kernel performs the randomization when loading it, so this feature requires kernel support. Dynamic libraries and the .o intermediate files produced by compilation also need to be position independent, and the techniques and ideas involved are similar.

With the -fpic option we can generate a position-independent dynamic library, and with the -fpie option a position-independent executable. The two are used very differently, one being loaded by others and the other executed directly, but technically they are very similar, with two main differences. First, a file generated with -fpic, plus a PT_INTERP segment and some startup code, closely resembles a position-independent executable generated with -fpie; the two can even use the same startup code. Second, because -fpic is used to build dynamic libraries, a symbol need not be resolved immediately and may even be missing: a dynamic library may reference external libraries, so the external libraries do not have to be linked when it is built. The external functions it uses are simply entered into the PLT and resolved later, at link time or load time. An executable, in contrast, requires every symbol to be resolved immediately and does not allow unresolved symbols.
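
One visible consequence of this similarity is the e_type field of the ELF header: a PIE is recorded as ET_DYN, the same type as a shared library, while a fixed-position executable is ET_EXEC. A minimal sketch that reads the field follows; it assumes the file uses the host byte order, and the function name is hypothetical.

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the ELF file at `path` has type ET_DYN (PIE or shared
 * library), 0 if it is ET_EXEC (fixed load address), -1 otherwise. */
int elf_is_position_independent(const char *path)
{
    /* e_type is the 16-bit field right after the 16-byte e_ident array,
     * in both the 32-bit and the 64-bit header layouts. */
    unsigned char buf[EI_NIDENT + sizeof(uint16_t)];
    uint16_t type;
    FILE *f = fopen(path, "rb");

    if (!f)
        return -1;
    size_t got = fread(buf, 1, sizeof buf, f);
    fclose(f);
    if (got != sizeof buf || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return -1;

    memcpy(&type, buf + EI_NIDENT, sizeof type);   /* assumes host byte order */
    if (type == ET_DYN)
        return 1;
    if (type == ET_EXEC)
        return 0;
    return -1;
}
```

Because the type alone does not separate a PIE from a shared library, tools usually also look for a PT_INTERP segment to tell the two apart.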

The program above is not a PIE, and all of its addresses are absolute. The first LOAD entry says that the 0x16f88 bytes at the start of the binary file are to be loaded at address 0x08048000. The second says that 0x01543 bytes at file offset 0x016f88 are to be loaded at memory address 0x0805ff88. This is therefore an executable with a fixed load position.

The main purpose of ASLR is to defeat shellcode: shellcode that hard-codes a specific location can no longer find it. Note that ASLR is a kernel-side technology; the randomization of the memory layout, including the stack, is done by the kernel. It is enabled by default in the Linux kernel and can be turned off with echo 0 > /proc/sys/kernel/randomize_va_space. There is also a very easy way to bypass ASLR that needs no elevated privileges: the command setarch `uname -m` -R /bin/bash starts a bash that does not use ASLR.
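
The current system-wide setting can be read back from the same proc file. In the sketch below the function name read_aslr_mode is hypothetical; the kernel documents the value 0 as randomization off, 1 as randomizing the stack, mmap base, and vDSO, and 2 as additionally randomizing the heap.

```c
#include <stdio.h>

/* Returns the value of kernel.randomize_va_space (0, 1, or 2),
 * or -1 if the file cannot be read or parsed. */
int read_aslr_mode(void)
{
    int mode = -1;
    FILE *f = fopen("/proc/sys/kernel/randomize_va_space", "r");

    if (!f)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

This reports only the global setting; a process started through setarch -R has randomization disabled individually even when the value here is 2.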

Another problem with this technique is that the stack address is random when a program is started normally, but follows a pattern when the program is started by another program, for example through the execl interface.

RELRO

As we can see, the main idea of ELF intrusion is to inject code where it does not belong. RELRO was introduced to make part of the address space read-only; .ctors, .dtors, .jcr, and similar sections are usually placed in this segment. Unlike the non-executable stack attribute, which is enforced on the kernel side, the read-only setting of this technique is established on the user side, by the compiler and the loader together.

Binary Analysis Tools

BAT (Binary Analysis Tool), BitBlaze, angr, CodeSonar (commercial), BAP, execstack, setarch

x64

In the x64 era the System V ABI for x86_64 came into wide use. This ABI made most earlier attack techniques harder, and the old ways of exploiting registers or the stack became less easy to apply. But this advantage will not last: compared with the mature toolkits developed over many years for 32-bit x86, attacks on x64 will gradually mature over time. There is no absolute defense, only an increase in difficulty.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
