I. How an executable program is produced
1. Preprocessing, compiling, linking
gcc hello.c -o hello.exe
gcc compiles the source code into the final executable binary. Behind the scenes it implicitly runs four stages:
Preprocessing => Compilation => Assembly => Linking
Preprocessing: the preprocessor expands the header files included by the C source code and performs macro substitution.
gcc -E hello.c -o hello.i
Compile: gcc first checks that the code is well formed and free of syntax errors and works out what the code actually has to do; once the checks pass, it translates the code into assembly language.
gcc -S hello.i -o hello.s
-S: this option compiles only, without assembling, and produces assembly code.
Assembly: turns the .s file produced by the compilation stage into binary object code.
gcc -c hello.s -o hello.o
Link: links the .o file produced by the assembly stage into the final executable file.
gcc hello.o -o hello
Run: if -o is not given at link time, the resulting executable is named a.out by default.
./hello
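To see what the preprocessing stage actually does, a small source file with one macro is enough. The sketch below is illustrative only: the SQUARE macro and the two functions are made-up names, not part of the original hello.c.

```c
#include <stdio.h>

/* Hypothetical macro for illustration: the preprocessor replaces every use
 * of SQUARE(x) with ((x) * (x)) before the compiler proper sees the code. */
#define SQUARE(x) ((x) * (x))

/* After `gcc -E`, the body below reads: return ((a + b) * (a + b)); */
int square_of_sum(int a, int b)
{
    return SQUARE(a + b);
}

/* Prints the result; call this from main() in a real program. */
void print_square_of_sum(int a, int b)
{
    printf("(%d + %d)^2 = %d\n", a, b, square_of_sum(a, b));
}
```

Running gcc -E on a file containing this code and looking at the end of the .i output should show the expanded expression, with the #define line itself gone and the contents of stdio.h pasted in above it.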
2. Object file formats
a.out -> COFF -> ELF (Linux) or PE (Windows)
ELF defines three main kinds of object file:
- Relocatable file: holds code and data suitable for linking with other object files to create an executable or a shared object file.
- Executable file: holds a program suitable for execution; the file specifies how exec (BA_OS) creates the program's process image.
- Shared object file: holds code and data suitable for linking in two contexts:
  - First, the link editor [see ld (SD_CMD)] can process it together with other relocatable and shared object files to create another object file.
  - Second, the dynamic linker combines it with an executable file and other shared objects to create a process image.
- Object files take part in both program linking (building a program) and program execution.
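The three kinds of file listed above are distinguished by the e_type field of the ELF header. The following is a minimal sketch, assuming the standard header layout (16 bytes of e_ident followed by a 2-byte e_type); the function names are invented for this example, and the constants are written out by hand so the code does not depend on a system elf.h.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* e_type values from the ELF specification. */
enum { ELF_ET_REL = 1, ELF_ET_EXEC = 2, ELF_ET_DYN = 3 };

/* Returns the e_type field of an ELF header held in buf, or -1 if the
 * buffer is too short or does not start with the ELF magic number. */
int elf_type_from_header(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (len < 18 || memcmp(buf, magic, 4) != 0)
        return -1;
    /* e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian. */
    if (buf[5] == 2)
        return (buf[16] << 8) | buf[17];
    return buf[16] | (buf[17] << 8);
}

/* Maps an e_type value to the three kinds of file described above. */
const char *elf_type_name(int e_type)
{
    switch (e_type) {
    case ELF_ET_REL:  return "relocatable file";
    case ELF_ET_EXEC: return "executable file";
    case ELF_ET_DYN:  return "shared object file";
    default:          return "other or not ELF";
    }
}

/* Reads the first bytes of path and returns its e_type, or -1 on error. */
int elf_type_of_file(const char *path)
{
    unsigned char buf[18];
    FILE *fp = fopen(path, "rb");
    size_t n;

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    return elf_type_from_header(buf, n);
}
```

Calling elf_type_name(elf_type_of_file("hello.o")) on the object file built earlier should report a relocatable file. Note that many current toolchains build position-independent executables by default, and those report the shared object type rather than the executable type.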
3. Statically linked ELF executables and the process address space
- Entry point: on 32-bit x86 Linux the program image is by default loaded starting at 0x8048000.
- The entry point is the address of the first instruction executed once the executable has been loaded into memory.
- A statically linked program usually places all of its code in a single code segment.
- A dynamically linked process has multiple code segments.
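The entry point discussed above is stored in the e_entry field of the ELF header. As a rough sketch, assuming the standard layout in which e_entry sits at byte offset 24 (4 bytes wide for 32-bit files, 8 bytes for 64-bit files), it can be read like this; elf_entry_from_header is a name made up for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads e_entry from an ELF header in buf. Returns 0 on success and stores
 * the entry address in *entry; returns -1 if buf is not a usable header. */
int elf_entry_from_header(const unsigned char *buf, size_t len, uint64_t *entry)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    size_t width, i;
    uint64_t value = 0;

    if (len < 6 || memcmp(buf, magic, 4) != 0)
        return -1;
    /* e_ident[4] (EI_CLASS): 1 = 32-bit, 2 = 64-bit. */
    if (buf[4] == 1)
        width = 4;
    else if (buf[4] == 2)
        width = 8;
    else
        return -1;
    /* e_entry follows e_ident (16), e_type (2), e_machine (2), e_version (4). */
    if (len < 24 + width)
        return -1;
    for (i = 0; i < width; i++) {
        /* e_ident[5] (EI_DATA): 2 = big-endian, otherwise little-endian.
         * Either way, bytes are consumed from most to least significant. */
        size_t pos = (buf[5] == 2) ? i : width - 1 - i;
        value = (value << 8) | buf[24 + pos];
    }
    *entry = value;
    return 0;
}
```

The value this returns for a real file should match the "Entry point address" line printed by readelf -h.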
II. Execution environment of executable programs
1. Command-line arguments and the shell environment
List the contents of /usr/bin:
$ ls -l /usr/bin
The shell itself does not limit the number of command-line arguments; the limit is imposed by the command being run.
int main(int argc, char *argv[], char *envp[])
The shell calls execve to pass the command-line arguments and environment variables to the main function of the executable program.
int execve(const char *filename, char *const argv[], char *const envp[]);
The exec* library functions are all wrappers around execve.
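What the shell does when it launches a command can be approximated in a few lines. This is a simplified sketch, not the shell's real implementation: run_with_execve is a hypothetical helper that forks, calls execve in the child with caller-supplied argv and envp arrays, and hands back the child's exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] in a child process with the given arguments and environment,
 * and returns the child's exit status (or -1 on failure). */
int run_with_execve(char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: replace this process image with the new program. The new
         * program's main() receives argv and envp exactly as passed here. */
        execve(argv[0], argv, envp);
        _exit(127);  /* reached only if execve itself failed */
    }
    /* Parent: wait for the child and report how it exited. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, passing { "/bin/sh", "-c", "test \"$GREETING\" = hi", NULL } as argv together with an envp containing "GREETING=hi" lets the child see the variable, while an empty envp does not.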
2. Saving and passing command-line arguments and shell environment variables
shell program => execve => sys_execve
3. Dynamic linking of executable programs
(1) Dynamic linking
Focus: load_elf_binary
load_elf_binary(...)
{
    ...
    kernel_read();  // in effect, parses the file
    ...
    // map into the process address space at 0x8048000
    elf_map();
    ...
    if (elf_interpreter)  // the program depends on dynamic libraries
    {
        ...
        // load ld and get the dynamic linker's entry point
        elf_entry = load_elf_interp(...);
        ...
    }
    else  // statically linked
    {
        ...
        elf_entry = loc->elf_ex.e_entry;
        ...
    }
    ...
    // static executable: elf_entry is 0x8048000
    // executable with dynamic libraries: elf_entry is the ld.so entry address
    start_thread(regs, elf_entry, bprm->p);
}
- The loading process is in fact a breadth-first traversal, and the object being traversed is the dependency tree.
The main work is done by the dynamic linker, in user space.
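The breadth-first traversal mentioned above can be modeled on a tiny dependency graph. The sketch below is a toy model, not the dynamic linker's real data structure: struct dep_graph, MAX_LIBS and bfs_load_order are all invented for the example, and the order a real linker uses depends on its implementation.

```c
#include <stddef.h>

#define MAX_LIBS 8

/* A tiny dependency graph: deps[i][j] != 0 means object i needs object j.
 * Index 0 is the executable itself. */
struct dep_graph {
    int count;
    int deps[MAX_LIBS][MAX_LIBS];
};

/* Visits every object reachable from the executable in breadth-first order
 * and writes the visit order into order[]. Returns the number visited. */
int bfs_load_order(const struct dep_graph *g, int order[MAX_LIBS])
{
    int visited[MAX_LIBS] = { 0 };
    int head = 0, tail = 0;

    if (g->count <= 0 || g->count > MAX_LIBS)
        return 0;
    /* order[] doubles as the BFS queue: head is the next object to expand. */
    order[tail++] = 0;
    visited[0] = 1;
    while (head < tail) {
        int cur = order[head++];
        for (int next = 0; next < g->count; next++) {
            if (g->deps[cur][next] && !visited[next]) {
                visited[next] = 1;  /* each object is loaded only once */
                order[tail++] = next;
            }
        }
    }
    return tail;
}
```

With an executable (0) that needs two libraries (1 and 2), both of which need a third (3), the direct dependencies are visited before the shared one, and the shared one appears only once.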
(2) Dynamic linking at load time
/* Prepare the .so file */
shlibexample.h (1.3 KB) - Interface of Shared Lib Example
shlibexample.c (1.2 KB) - Implement of Shared Lib Example
/* Compile into libshlibexample.so */
$ gcc -shared shlibexample.c -o libshlibexample.so -m32
/* Use the library (the header is already included, so the function can be called directly) */
SharedLibApi();
(3) Dynamic linking at run time
dllibexample.h (1.3 KB) - Interface of Dynamical Loading Lib Example
dllibexample.c (1.3 KB) - Implement of Dynamical Loading Lib Example
/* Compile into libdllibexample.so */
$ gcc -shared dllibexample.c -o libdllibexample.so -m32
/* Use the library */
void *handle = dlopen("libdllibexample.so", RTLD_NOW);  // load the library first
int (*func)(void);                                      // declare a function pointer
func = dlsym(handle, "DynamicalLoadingLibApi");         // look up the function by name
func();                                                 // call it through the pointer
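The run-time loading steps can also be wrapped in a helper that checks for errors, since both dlopen and dlsym return NULL on failure. This is a sketch under the assumption that the looked-up symbol has the type int (*)(void), as DynamicalLoadingLibApi appears to; call_from_library is a hypothetical name, and older glibc versions need -ldl when linking.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Loads lib_path, looks up symbol_name as an int (*)(void) function, calls
 * it, and stores its return value in *result. Returns 0 on success, -1 if
 * the library or the symbol cannot be found. */
int call_from_library(const char *lib_path, const char *symbol_name, int *result)
{
    int (*func)(void);
    const char *err;
    void *handle = dlopen(lib_path, RTLD_NOW);

    if (handle == NULL) {
        err = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return -1;
    }
    dlerror();  /* clear any stale error before calling dlsym */
    /* POSIX idiom for converting the void * from dlsym to a function pointer. */
    *(void **)(&func) = dlsym(handle, symbol_name);
    if (func == NULL) {
        err = dlerror();
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }
    *result = func();
    dlclose(handle);
    return 0;
}
```

With the library built above, call_from_library("libdllibexample.so", "DynamicalLoadingLibApi", &rc) would perform the same load, lookup and call, but report a missing library or symbol instead of crashing on a NULL pointer.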
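(4) Building and running
$ gcc main.c -o main -L/path/to/your/dir -lshlibexample -ldl -m32
$ export LD_LIBRARY_PATH=$PWD
/* Adds the current directory to the library search path; otherwise main cannot find the libraries it depends on. Alternatively, copy the library files into a default search path. */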
III. Loading an executable program
1. How the kernel handles sys_execve
(1) Starting point of the new executable program
- Typically the address is 0x8048000 or 0x8048300.
(2) execve and fork
execve and fork are both slightly special system calls; an ordinary system call simply traps into kernel mode and then returns to user mode.
fork returns twice: the first return is in the parent process, which continues executing; the second is in the child process, which starts at ret_from_fork and then returns to user mode normally.
When execve runs, it traps into kernel mode, overwrites the currently executing program with the program that execve loads, and, when the system call returns, returns to the starting point of the new executable.
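The "returns twice" behavior of fork is easy to observe from user space. Here is a minimal sketch in which fork_and_collect is a made-up helper: the child sees a return value of 0 and exits with a chosen code, while the parent sees the child's pid and collects that code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Calls fork() once and handles both returns. The child sees 0 and exits
 * with child_code; the parent sees the child's pid, waits for it, and
 * returns the child's exit status (or -1 on error). */
int fork_and_collect(int child_code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;             /* fork failed: only one return, no child */
    if (pid == 0)
        _exit(child_code);     /* second return: running in the child */
    /* First return: running in the parent, pid is the child's process id. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Only the parent ever returns from fork_and_collect, because the child leaves through _exit before reaching the end of the function.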
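execve:
- Execution reaches the execve call in the running program -> trap into the kernel
- Build the new executable image -> overwrite the original program
- Return to the new executable program at its starting point (that is, the main function)
- The new program's execution environment has to be constructed
- The shell calls execve to pass the command-line arguments and environment variables to the executable's main function: first through function-call parameter passing, then through system-call parameter passing.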
(3) The execve system call returns differently for statically and dynamically linked executables
- Static linking: elf_entry points to the entry point given in the executable's header, typically the main function, which is the starting point of the new program.
- Dynamic linking: elf_entry points to the starting point of ld (the dynamic linker), which is loaded by load_elf_interp.
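2. Zhuang Zhou dreams of a butterfly
Zhuang Zhou (the executable that calls execve) falls asleep (calls execve and traps into the kernel) and, on waking (the execve system call returns to user mode), finds that he is a butterfly (the executable loaded by execve).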
3. Loading of dynamically linked executables
(1) Where does execution of the executable start? How does the new program get to run when the execve system call returns to user mode?
- By modifying the EIP value that int 0x80 saved on the kernel stack, so that it holds the starting point of the new program.
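The effect of rewriting the saved EIP can be imitated in user space with a plain struct. This is a deliberately simplified model, not kernel code: struct fake_regs stands in for the saved register frame, and choose_entry and start_thread_model are hypothetical names that mirror the elf_entry selection and the start_thread call in load_elf_binary.

```c
/* Simplified stand-in for the user-mode registers saved on the kernel stack;
 * only the two fields relevant here are modeled. */
struct fake_regs {
    unsigned long ip;  /* saved instruction pointer (EIP) */
    unsigned long sp;  /* saved stack pointer (ESP) */
};

/* Picks the address execution should resume at: the dynamic linker's entry
 * if the program has an interpreter, otherwise the program's own e_entry. */
unsigned long choose_entry(int has_interp, unsigned long exe_entry,
                           unsigned long interp_entry)
{
    return has_interp ? interp_entry : exe_entry;
}

/* Models the effect of start_thread(): overwrite the saved ip and sp so that
 * the return to user mode lands in the new program, not the old one. */
void start_thread_model(struct fake_regs *regs, unsigned long new_ip,
                        unsigned long new_sp)
{
    regs->ip = new_ip;
    regs->sp = new_sp;
}
```

The caller of execve never sees the call return in its old code: by the time the kernel restores the registers, the saved frame already describes the new program.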
(2) How the Linux kernel supports a variety of different executable file formats
static struct linux_binfmt elf_format = {  // declare a global variable
    .module       = THIS_MODULE,
    .load_binary  = load_elf_binary,       // invoked automatically (observer pattern)
    .load_shlib   = load_elf_library,
    .core_dump    = elf_core_dump,
    .min_coredump = ELF_EXEC_PAGESIZE,
};

static int __init init_elf_binfmt(void)
{
    register_binfmt(&elf_format);  // register it in the kernel's list, which is searched by file format
    return 0;
}
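The registration pattern above, a list of format handlers that are each offered the file in turn, can be modeled in user space. The following is a toy sketch under that reading of the kernel code; struct fmt_handler, register_fmt, find_format and the two recognizer functions are invented for illustration and are not the kernel's API.

```c
#include <stddef.h>
#include <string.h>

/* User-space model of a binary-format handler; field names are illustrative. */
struct fmt_handler {
    const char *name;
    /* Returns 0 if this handler recognizes the file header, -1 otherwise. */
    int (*load_binary)(const unsigned char *head, size_t len);
    struct fmt_handler *next;
};

static struct fmt_handler *formats;  /* head of the registered list */

/* Adds a handler to the front of the list. */
void register_fmt(struct fmt_handler *fmt)
{
    fmt->next = formats;
    formats = fmt;
}

/* Recognizes files that begin with the ELF magic number. */
int load_elf_model(const unsigned char *head, size_t len)
{
    return (len >= 4 && memcmp(head, "\177ELF", 4) == 0) ? 0 : -1;
}

/* Recognizes scripts that begin with "#!". */
int load_script_model(const unsigned char *head, size_t len)
{
    return (len >= 2 && head[0] == '#' && head[1] == '!') ? 0 : -1;
}

/* Offers the header to each registered handler in turn and returns the name
 * of the first one that accepts it, or NULL if none does. */
const char *find_format(const unsigned char *head, size_t len)
{
    for (struct fmt_handler *f = formats; f != NULL; f = f->next)
        if (f->load_binary(head, len) == 0)
            return f->name;
    return NULL;
}
```

Supporting a new executable format then only requires registering one more handler; the code that walks the list does not change.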
(3) Dynamic linking
- The executable depends on dynamic-link libraries, and those libraries may in turn depend on other libraries, so the dependencies form a graph: the dynamic-link libraries generate a dependency tree.
- The dynamic linker loads and resolves the libraries (this is a graph traversal). Once all the required dynamic-link libraries are loaded, ld hands control of the CPU to the executable program.
- Dynamic linking is carried out mainly by the dynamic linker, not by the kernel.
Linux Kernel Analysis, Week 7: Loading Executable Programs