Loading of Executable Programs
20135109 Gao Yi Tong
"Linux kernel Analysis" MOOC course http://mooc.study.163.com/course/USTC-1000029000
I. Preprocessing, compiling, linking, and the object file format
1.1 Where an executable program comes from
- C code is preprocessed, compiled into assembly code by the compiler, assembled into object code by the assembler, and linked into an executable file.
- Preprocessing handles file inclusion (#include), macro replacement, and similar work.
- The preprocessed file is compiled into assembly code.
- The assembly code (.s) is assembled into an object file (.o).
1.2 The object file format ELF
- Common object file formats: a.out (the oldest), COFF, PE (Windows), ELF (Linux)
- ABI (Application Binary Interface): an object file is already in a binary-compatible format, so it is tied to a particular CPU architecture
- There are three main types of ELF object files: relocatable files (.o), executable files, and shared object files
- An ELF header at the beginning of the file holds a road map that describes how the file is organized
- When creating or extending a process image, the system in theory copies a segment of the file into a segment of virtual memory
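The role of the ELF header as a road map can be illustrated with a minimal sketch that reads only the start of a file and reports which of the three types it is. The function names elf_file_type and elf_type_name are made up for this illustration, and the sketch assumes a Linux system with <elf.h> and a file whose byte order matches the host.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the e_type field of the ELF header of `path`
 * (ET_REL, ET_EXEC, ET_DYN, ...), or -1 if the file cannot be read or does
 * not start with the ELF magic number.
 * Assumption: the file uses the same byte order as the host. */
int elf_file_type(const char *path)
{
    unsigned char hdr[sizeof(Elf64_Ehdr)];
    Elf32_Half type;
    size_t n;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    n = fread(hdr, 1, sizeof(hdr), f);
    fclose(f);

    /* Every ELF file begins with the four magic bytes 0x7f 'E' 'L' 'F'. */
    if (n < sizeof(Elf32_Ehdr) || memcmp(hdr, ELFMAG, SELFMAG) != 0)
        return -1;

    /* e_type follows the 16-byte e_ident array in both the 32-bit and the
     * 64-bit header layout, so one read covers both. */
    memcpy(&type, hdr + EI_NIDENT, sizeof(type));
    return type;
}

/* Hypothetical helper: map e_type to the three main kinds of ELF file. */
const char *elf_type_name(int type)
{
    switch (type) {
    case ET_REL:  return "relocatable file (.o)";
    case ET_EXEC: return "executable file";
    case ET_DYN:  return "shared object file";
    default:      return "other or not ELF";
    }
}
```

Calling elf_type_name(elf_file_type("hello.o")) on an object file would be expected to report a relocatable file. Note that position-independent executables are also marked ET_DYN, so the "shared object file" label covers them too.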
1.3 Statically linked ELF executables and the process address space
- Loading an ELF executable into memory means loading its code and data into memory
- By default (on 32-bit x86) an ELF executable is loaded at 0x8048000
- The actual entry point of the program is 0x8048300 (the entry point is where execution starts right after the executable has been loaded)
- A statically linked program generally places all of its code in a single code segment
- A dynamically linked process has multiple code segments
II. Executable programs, shared libraries, and dynamic linking
2.1 Preparation before loading an executable program
- Execution environment of an executable program: the shell command line, the parameters of the main function, and the parameters of execve
- The shell itself does not limit the number of command-line arguments; the limit comes from the command itself
- The shell calls execve to pass the command-line arguments and the environment variables to the main function of the executable program
- How command-line arguments and environment variables are saved and passed: they start in the execution environment of the caller of execve; the path is shell program --> execve --> sys_execve, and the kernel copies them when it initializes the stack of the new program
- The parameters are passed first through a function call and then through a system call
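The hand-off of arguments and environment through execve can be sketched from user space as follows. The helper name run_with_execve is hypothetical. It forks first only so that the caller survives to inspect the result, which is also what a shell does.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, replace its program image with the
 * executable at `path` via execve, and return the child's exit status
 * (or -1 if fork/waitpid fails or the child did not exit normally). */
int run_with_execve(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid;
    int status;

    assert(path != NULL && argv != NULL);
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: argv and envp are handed to the kernel here; the kernel
         * copies them onto the stack of the new program. On success
         * execve does not return to this code. */
        execve(path, argv, envp);
        _exit(127);             /* reached only if execve failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing argv = { "sh", "-c", "echo $GREETING", NULL } and envp = { "GREETING=hello", NULL } with path "/bin/sh" lets the new program see exactly the environment that the caller supplied, not the caller's own.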
2.2 Examples of load-time dynamic linking and run-time dynamic linking
- Dynamic linking comes in two forms: load-time dynamic linking, done when the executable is loaded, and run-time dynamic linking
- Prepare the .so files: compile one into libshlibexample.so (a shared library linked at load time) and one into libdllibexample.so (loaded dynamically at run time)
- Compile main: only the -L (library path) and -l options for shlibexample are given; no information about dllibexample is provided, only -ldl is specified.
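Run-time dynamic linking, the second form above, can be sketched with dlopen and dlsym. Because the contents of libdllibexample.so are not shown in these notes, the sketch assumes the C math library ("libm.so.6" on glibc systems) as a stand-in and looks up its cos function. The helper name cos_via_dlopen is hypothetical. On older glibc versions the program must be linked with -ldl, matching the compile step described above.

```c
#include <assert.h>
#include <dlfcn.h>
#include <math.h>     /* NAN */
#include <stddef.h>

/* Hypothetical helper: load the shared library `libname` at run time,
 * look up the symbol "cos" in it, and call it. Returns NAN if the
 * library or the symbol cannot be found. */
double cos_via_dlopen(const char *libname, double x)
{
    void *handle;
    double (*cos_fn)(double) = NULL;
    double result;

    assert(libname != NULL);

    /* Nothing about this library was given to the linker at build time;
     * it is located and mapped only now. */
    handle = dlopen(libname, RTLD_NOW);
    if (handle == NULL)
        return NAN;

    /* Idiom from the dlsym documentation for storing the returned
     * object pointer into a function pointer. */
    *(void **)(&cos_fn) = dlsym(handle, "cos");
    if (cos_fn == NULL) {
        dlclose(handle);
        return NAN;
    }

    result = cos_fn(x);
    dlclose(handle);
    return result;
}
```

The contrast with load-time linking is that a missing library shows up here as a NULL handle the program can react to, instead of preventing the program from starting at all.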
III. Loading of executable programs
3.1 Analysis of key issues in loading executable programs
- execve and fork are both special system calls. When the current program reaches the execve system call it traps into kernel mode; execve loads the executable file and overwrites the current process with it; execve then returns to the starting point of the new executable.
- int execve(const char *filename, char *const argv[], char *const envp[]) passes in the command-line arguments and the environment variables
- sys_execve parses the executable file format internally: do_execve -> do_execve_common -> exec_binprm
- search_binary_handler finds the parsing module that matches the file format (it uses the file header information to find the corresponding format handler)
- For an ELF executable, load_elf_binary is executed.
- elf_format and init_elf_binfmt play the role of the observer in the observer pattern.
- The code that examines the file header is the observed subject: when an ELF file appears, the observer is triggered and the elf_format module runs automatically
- Where user-mode execution begins when the execve system call returns: load_elf_binary -> start_thread (which sets the starting point of the new program by modifying the saved EIP on the kernel stack)
3.2 Internal processing of sys_execve
- search_binary_handler (finds the handler for the executable) -> fmt->load_binary (the handler function that loads the executable)
- register_binfmt (registers the format struct variable)
- kernel_read reads the file information
- The ELF executable is mapped at address 0x8048000 by default.
- An executable that needs dynamic linking first loads the linker ld, which is itself a shared library: load_elf_interp (loads the dynamic linker and yields its starting point)
- For a statically linked executable, execution goes directly to elf_entry; elf_entry is the starting point of the new program
- start_thread (for a statically linked executable, the starting point is the executable's own entry point, in the region mapped at 0x8048000; for an executable that needs dynamic linking, it is the starting point of the dynamic linker)
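The interplay of register_binfmt, search_binary_handler and fmt->load_binary can be modeled in user space. The following is a simplified sketch, not kernel code: every name ending in _model is invented for the illustration, the handlers only inspect the magic bytes instead of loading anything, and the real kernel structures carry far more state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's format descriptor. */
struct binfmt_model {
    const char *name;
    /* Returns 0 if the handler accepts the image, -ENOEXEC otherwise. */
    int (*load_binary)(const unsigned char *head, size_t len);
    struct binfmt_model *next;
};

static struct binfmt_model *formats;   /* list of registered handlers */

/* Model of register_binfmt: add a handler to the front of the list. */
void register_binfmt_model(struct binfmt_model *fmt)
{
    assert(fmt != NULL && fmt->load_binary != NULL);
    fmt->next = formats;
    formats = fmt;
}

static int load_elf_model(const unsigned char *head, size_t len)
{
    return (len >= 4 && memcmp(head, "\177ELF", 4) == 0) ? 0 : -ENOEXEC;
}

static int load_script_model(const unsigned char *head, size_t len)
{
    return (len >= 2 && head[0] == '#' && head[1] == '!') ? 0 : -ENOEXEC;
}

struct binfmt_model elf_format_model = { "elf", load_elf_model, NULL };
struct binfmt_model script_format_model = { "script", load_script_model, NULL };

/* Model of search_binary_handler: offer the file header to each registered
 * handler in turn; the first one that does not answer -ENOEXEC wins.
 * Returns the name of that handler, or NULL if no handler matches. */
const char *search_binary_handler_model(const unsigned char *head, size_t len)
{
    struct binfmt_model *fmt;

    for (fmt = formats; fmt != NULL; fmt = fmt->next) {
        if (fmt->load_binary(head, len) != -ENOEXEC)
            return fmt->name;
    }
    return NULL;
}
```

After registering both models, a header that starts with the ELF magic bytes is claimed by "elf" and one that starts with "#!" by "script". This is the sense in which the registered format waits like an observer until a matching file shows up.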
3.3 Using GDB to trace the internal process of sys_execve
Hands-on experiment:
(1) Clone the code and overwrite test.c
[Listing: test.c file code]
(2) When generating the root file system, put init and hello into rootfs, so that the hello file is loaded automatically when the exec command is executed
(3) Use GDB to trace how the kernel handles sys_execve
1. Load the symbol table and connect to port 1234
2. Set breakpoints: b sys_execve (you can stop at sys_execve first and then set the other breakpoints), b load_elf_binary, b start_thread.
3. Run
4. Enter c to continue, type the exec command, use list to view the source, and press s to step into do_execve.
3.4 Loading an executable program and the story of Zhuang Zhou dreaming of a butterfly
- Zhuang Zhou (the executable program that calls execve)
- Falls asleep (calls execve and enters the kernel)
- Wakes up (the execve system call returns to user mode)
- Finds himself a butterfly (the executable program loaded by execve)
3.5 Analysis of loading dynamically linked executable programs
- What does the kernel do during dynamic linking? The dependencies among dynamic link libraries form a graph
- In the ELF format, .interp and .dynamic are parsed by the dynamic linker
- On return to user mode, execution does not go back to the starting point of the executable program; it goes to the entry point of the dynamic linker
- After loading and linking, ld hands control of the CPU to the executable program
- Loading the dynamic link libraries is a graph traversal
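The idea that loading libraries is a graph traversal can be made concrete with a small sketch. The graph type, the function name load_order and the choice of a breadth-first walk are assumptions made for the illustration. The point is only that each object is visited once even when several others depend on it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LIBS 8

/* Hypothetical dependency graph: needs[i][j] != 0 means that object i
 * depends on object j. Index 0 is typically the executable itself. */
struct dep_graph {
    size_t count;
    const char *names[MAX_LIBS];
    unsigned char needs[MAX_LIBS][MAX_LIBS];
};

/* Breadth-first traversal starting at `root`. Writes the visit order into
 * `order` and returns how many objects were reached. The `order` array
 * doubles as the work queue. */
size_t load_order(const struct dep_graph *g, size_t root, size_t order[MAX_LIBS])
{
    unsigned char seen[MAX_LIBS] = {0};
    size_t head = 0, tail = 0;

    assert(g != NULL && g->count <= MAX_LIBS && root < g->count);

    order[tail++] = root;
    seen[root] = 1;
    while (head < tail) {
        size_t cur = order[head++];
        for (size_t j = 0; j < g->count; j++) {
            /* Queue each dependency the first time it is seen only. */
            if (g->needs[cur][j] && !seen[j]) {
                seen[j] = 1;
                order[tail++] = j;
            }
        }
    }
    return tail;
}
```

With a graph in which main needs two libraries and both of those need a common third one, the traversal reaches four objects and lists the shared dependency once, after both of its dependents.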
IV. Summary
1. How an executable program is produced:
C source code --> preprocessing --> compiled into assembly code --> assembled into object code --> linked into an executable file, which the operating system then loads into memory.
2. There are three main types of ELF object files: relocatable files (.o), executable files, and shared object files.
3. The ELF executable is mapped at address 0x8048000 by default.
4. How do command-line arguments and environment variables enter the stack of the new program?
The shell program --> execve --> sys_execve, and the kernel then copies them when the stack of the new program is initialized.
The parameters are passed first through a function call and then through a system call.
5. When the current program reaches the execve system call it traps into kernel mode; in the kernel, execve loads the executable file and overwrites the executable of the current process with it; when the execve system call returns, it returns to the starting point of the new executable.
6. Loading the dynamic link libraries is a graph traversal.
The .interp and .dynamic sections of the ELF format are parsed by the dynamic linker; on return to user mode, execution goes to the entry point of the dynamic linker instead of the starting point specified by the executable program.