Week 7: Loading of executable programs
I. Preprocessing, compiling, linking, and the object file format
1. How an executable program is produced
C source code -> preprocessing -> assembly code -> object code -> executable file
- .asm: assembly code
- .o: object code
- a.out: executable file
Preprocessing handles the inclusion of #include files and macro replacement.
2. The ELF object file format
(1) Common ELF format files:
(2) ABI: Application Binary Interface
An object file already conforms to the ABI. It is binary compatible, that is, it targets one particular binary instruction set.
(3) Three kinds of object files in ELF:
- A relocatable file holds code and data suitable for linking with other object files to create an executable file or a shared object file. (Mostly .o files.)
- An executable file holds a program suitable for execution. It specifies how exec (BA_OS) creates the program's process image.
- A shared object file holds code and data suitable for linking in two contexts. First, the link editor [see ld (SD_CMD)] can process it together with other relocatable and shared object files to create another object file. Second, the dynamic linker combines it with an executable file and other shared object files to create a process image. (Mostly .so files.)
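The three kinds of file are distinguished by the e_type field of the ELF header. A minimal sketch of reading it, assuming a Linux system that provides <elf.h> and a header stored in the host's byte order (the function name elf_kind is made up for illustration):

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Return a short label for the ELF object type found in a header buffer,
 * or NULL if the buffer does not begin with the ELF magic bytes.
 * Assumption: the header uses the host's byte order. */
const char *elf_kind(const unsigned char *buf, size_t len)
{
    Elf32_Ehdr hdr;   /* e_ident and e_type sit at the same offsets in ELF32 and ELF64 */

    if (len < sizeof hdr)
        return NULL;
    memcpy(&hdr, buf, sizeof hdr);
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return NULL;

    switch (hdr.e_type) {
    case ET_REL:  return "relocatable";
    case ET_EXEC: return "executable";
    case ET_DYN:  return "shared object";   /* also used by position-independent executables */
    default:      return "other";
    }
}
```

Reading the first bytes of a .o file, of a statically linked program, and of a .so file into a buffer and passing each to elf_kind should give the three labels in turn.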
(4) Object file format
In the figure, the left side is the ELF file as seen by the linker and the right side is the same file as seen at execution time. The ELF header describes the organization of the file, the program header table tells the system how to create a process memory image, and the section header table contains information describing the file's sections.
When creating or augmenting a process image, the system logically copies a segment of the file into a virtual memory segment.
The text segment is copied to the start of the process address range, the data segment is copied to another range of virtual addresses ...
There is a mapping between the executable file format and the process address space.
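That mapping is recorded in the program header table: each PT_LOAD entry names a range of the file and the virtual address it should occupy. The following is only a sketch, assuming a 32-bit ELF image already read into memory in the host's byte order; elf32_load_segments is a hypothetical helper, not a kernel or libc function.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Copy up to max PT_LOAD program headers from a 32-bit ELF image into out.
 * Returns the number copied, or -1 if the image is not a usable ELF32 file.
 * Assumption: the image uses the host's byte order. */
int elf32_load_segments(const unsigned char *image, size_t len,
                        Elf32_Phdr *out, int max)
{
    Elf32_Ehdr eh;
    int found = 0;

    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS32)
        return -1;
    if (eh.e_phnum == 0)
        return 0;                      /* e.g. a relocatable file: nothing to load */
    if (eh.e_phentsize != sizeof(Elf32_Phdr))
        return -1;

    for (unsigned i = 0; i < eh.e_phnum && found < max; i++) {
        size_t off = (size_t)eh.e_phoff + (size_t)i * sizeof(Elf32_Phdr);
        Elf32_Phdr ph;

        if (off > len || len - off < sizeof ph)
            return -1;                 /* table runs past the end of the image */
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type == PT_LOAD)      /* only PT_LOAD segments are mapped into memory */
            out[found++] = ph;
    }
    return found;
}
```

For each returned entry, p_offset and p_filesz say which bytes of the file are used, while p_vaddr and p_memsz say where they land in the process address space.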
3. Statically linked ELF executable file and process address space
An ELF executable file is loaded into memory:
By default (on 32-bit x86) the executable is loaded starting at 0x8048000. Execution begins at the program's entry point, the first instruction to run after loading. The actual entry address varies from program to program because the headers differ in size.
Static linking generally places all of the code in the same code segment.
II. Executable programs, shared libraries, and dynamic linking
1. Work before loading the executable program
Programs are usually run from a shell environment. Our experiment calls the execve system call directly.
(1) $ ls -l /usr/bin lists the directory information under /usr/bin
ls is an executable program.
- The shell itself does not limit the number of command-line arguments. The limit comes from the command itself.
The main function we write decides what it is willing to receive. To receive command-line arguments: int main(int argc, char *argv[]). To also receive the shell's environment variables: int main(int argc, char *argv[], char *envp[]) // char *envp[] is supplied automatically by the shell
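As a small illustration of what envp holds, the sketch below looks up one variable in a NULL-terminated array of "NAME=value" strings. The helper env_lookup is hypothetical. Note also that the three-argument form of main is a widely supported extension on Unix-like systems rather than part of ISO C.

```c
#include <stddef.h>
#include <string.h>

/* Look up name in a NULL-terminated array of "NAME=value" strings, such as
 * the envp passed to main. Returns a pointer to the value, or NULL if the
 * name is not present. */
const char *env_lookup(char *const envp[], const char *name)
{
    size_t n = strlen(name);

    for (size_t i = 0; envp[i] != NULL; i++) {
        /* match the whole name, then require the '=' separator */
        if (strncmp(envp[i], name, n) == 0 && envp[i][n] == '=')
            return envp[i] + n + 1;
    }
    return NULL;
}
```

Inside int main(int argc, char *argv[], char *envp[]), a call such as env_lookup(envp, "PATH") would return the search path the shell handed over.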
(2) How the shell passes environment variables
The shell calls execve to pass the command-line arguments and environment variables to the main function of the executable program.
int execve(const char *filename, char *const argv[], char *const envp[]);
The exec* library functions are all wrappers around execve.
Example:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char *argv[]) /* a simplified shell: command-line arguments are not handled */
{
    pid_t pid;

    /* fork another process so that the original shell program is not overwritten */
    pid = fork();
    if (pid < 0)
    {
        /* error occurred */
        fprintf(stderr, "Fork failed!\n");
        exit(-1);
    }
    else if (pid == 0)
    {
        /* child process */
        execlp("/bin/ls", "ls", (char *)NULL); /* the ls command as an example */
        perror("execlp"); /* reached only if execlp fails */
        _exit(127);
    }
    else
    {
        /* parent process */
        /* parent will wait for the child to complete */
        wait(NULL);
        printf("Child complete!\n");
        exit(0);
    }
}
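The example above goes through the execlp wrapper, which inherits the caller's environment. A variant that calls execve itself, with an explicit argv and envp, might look like the sketch below. The name spawn_and_wait and the exit code 127 for a failed exec are choices made for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, replace the child with the program at path via execve, and return
 * the child's exit status. Returns -1 if fork or waitpid fails, or if the
 * child did not exit normally. */
int spawn_and_wait(const char *path, char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, envp);   /* on success this call never returns */
        _exit(127);                 /* reached only if execve failed */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Passing an envp that contains only the strings you choose shows that the new program sees exactly that environment and nothing inherited from the shell.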
(3) How command-line arguments and environment variables are saved and passed
Command-line arguments and environment strings are placed on the user-mode stack.
Shell program -> execve -> sys_execve
They are then copied when the stack of the new program is initialized.
- They are passed first as function call parameters, then as system call parameters.
2. Load-time dynamic linking and run-time dynamic linking: an application example
Most executable programs depend on dynamic link libraries.
Example:
Dynamic linking is divided into load-time dynamic linking, done when the executable program is loaded, and run-time dynamic linking.
- Prepare the .so files (the dynamic link files under Linux)
- main.c (1.9 KB): the main program
The command below supplies only -L (the directory in which the library is located) and -l (the library name, that is, libshlibexample.so with the lib prefix and the .so suffix removed) for shlibexample. It supplies no information about dllibexample and only specifies -ldl.
$ gcc main.c -o main -L/path/to/your/dir -lshlibexample -ldl -m32
$ export LD_LIBRARY_PATH=$PWD   # add the current directory to the default search path, otherwise main cannot find the library it depends on; you can of course also copy the library file to a default path
$ ./main
This is a Main program!
Calling SharedLibApi() function of libshlibexample.so!   # call the shared library
This is a shared libary!
Calling DynamicalLoadingLibApi() function of libdllibexample.so!   # call the dynamically loaded library
This is a Dynamical Loading libary!
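The run-time half of this example relies on dlopen and dlsym, which is why the command line needs -ldl but no -l option for libdllibexample. Since the source of main.c is not reproduced here, the following is only a sketch of the general pattern. The function call_from_library is hypothetical, and it is meant to be tried with a library that is normally present, such as libm.so.6 and its cos symbol.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Load a library at run time, look up a symbol as a double(double) function,
 * call it with arg, and store the result in *out.
 * Returns 0 on success, -1 if the library or the symbol cannot be found. */
int call_from_library(const char *library, const char *symbol,
                      double arg, double *out)
{
    double (*fn)(double) = NULL;
    void *handle = dlopen(library, RTLD_NOW);   /* run-time dynamic linking starts here */

    if (handle == NULL)
        return -1;
    /* idiom from the POSIX dlsym documentation for storing the result
     * in a function pointer */
    *(void **)&fn = dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }
    *out = fn(arg);
    dlclose(handle);
    return 0;
}
```

On older glibc versions this must be linked with -ldl, as in the gcc command above. Newer versions provide dlopen in libc itself.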
III. Loading of executable programs
1. Analysis of key issues in loading executable programs
(1) execve and fork are special system calls
- Normal system calls: trap into kernel mode, return to user mode, and continue with the instruction that follows the system call.
- fork: traps into kernel mode and returns twice. The first return is in the parent process, which continues from where it called fork. The second is in the child process, which starts executing at ret_from_fork and then returns to user mode.
- execve: when the current program reaches execve it traps into kernel mode, where the executable being loaded overwrites the current one. When the execve system call returns, it does not return to the original call site but to the starting point of the new executable program (its ELF entry point, from which main is eventually reached).
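The "two returns" of fork can be observed from user space. In this sketch (fork_once is a made-up name), a single call to fork produces a return value of 0 in the child and the child's pid in the parent:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Call fork once and observe both returns.
 * The child sees 0 and exits with child_code. The parent sees the child's
 * pid, waits for it, and returns the exit code it collected (-1 on error). */
int fork_once(int child_code)
{
    int status;
    pid_t pid = fork();      /* one call, two returns */

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(child_code);   /* the return seen by the child */

    /* the return seen by the parent: pid is the child's process id */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

If the child called execve instead of _exit, nothing after that call would run in the child unless execve failed.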
(2) sys_execve kernel processing
sys_execve parses the executable file format internally.
- do_execve -> do_execve_common -> exec_binprm
- search_binary_handler looks for the parsing module that matches the file format, as follows:
    list_for_each_entry(fmt, &formats, lh) {   // find a module in the list that can handle the ELF format
        if (!try_module_get(fmt->module))
            continue;
        read_unlock(&binfmt_lock);
        bprm->recursion_depth++;
        retval = fmt->load_binary(bprm);
        read_lock(&binfmt_lock);
For an ELF executable, fmt->load_binary(bprm) ends up in load_elf_binary. Its internals are closely tied to parsing the ELF file format and need to be read together with the ELF file format standard.
(3) How the Linux kernel supports a variety of executable file formats
    static struct linux_binfmt elf_format = {   // the elf_format structure
        .module       = THIS_MODULE,
        .load_binary  = load_elf_binary,   // polymorphism mechanism, observer pattern
        .load_shlib   = load_elf_library,
        .core_dump    = elf_core_dump,
        .min_coredump = ELF_EXEC_PAGESIZE,
    };

    static int __init init_elf_binfmt(void)
    {
        register_binfmt(&elf_format);   // register the elf_format variable in the fmt list
        return 0;
    }
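The register-then-search pattern can be imitated in user space. The following is a simplified stand-in, not kernel code: struct binfmt, register_format, find_handler, and the two toy handlers are all invented for illustration, and the handlers only check magic bytes instead of loading anything.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for struct linux_binfmt. */
struct binfmt {
    const char *name;
    /* Returns 0 if this handler accepts the file header, -1 otherwise. */
    int (*load_binary)(const unsigned char *head, size_t len);
    struct binfmt *next;
};

static struct binfmt *formats;          /* head of the list of registered handlers */

/* Counterpart of register_binfmt: push a handler onto the list. */
void register_format(struct binfmt *fmt)
{
    fmt->next = formats;
    formats = fmt;
}

static int load_elf(const unsigned char *head, size_t len)
{
    return (len >= 4 && memcmp(head, "\177ELF", 4) == 0) ? 0 : -1;
}

static int load_script(const unsigned char *head, size_t len)
{
    return (len >= 2 && head[0] == '#' && head[1] == '!') ? 0 : -1;
}

/* Walk the list and return the name of the first handler that accepts the
 * header, in the spirit of the loop in search_binary_handler.
 * Returns NULL if no handler accepts it. */
const char *find_handler(const unsigned char *head, size_t len)
{
    for (struct binfmt *fmt = formats; fmt != NULL; fmt = fmt->next)
        if (fmt->load_binary(head, len) == 0)
            return fmt->name;
    return NULL;
}

struct binfmt elf_fmt    = { "elf",    load_elf,    NULL };
struct binfmt script_fmt = { "script", load_script, NULL };
```

After register_format(&elf_fmt) and register_format(&script_fmt), the caller of find_handler never needs to know which formats exist. Supporting a new format means registering one more structure.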
Zhuang Zhou dreams of a butterfly
Zhuang Zhou (the executable program that calls execve) falls asleep (calls execve and enters the kernel), and on waking (the execve system call returns to user mode) finds that he has become a butterfly (the executable program loaded by execve).
This works by modifying the EIP that int 0x80 pushed onto the kernel stack.
load_elf_binary -> start_thread
2. sys_execve internal processing
- An executable file that needs dynamic linking first loads the linker ld. Otherwise the entry address of the ELF file is assigned to elf_entry directly.
- start_thread(regs, elf_entry, bprm->p) hands control of the CPU to ld, which loads the dependent libraries and completes dynamic linking. For a statically linked file, elf_entry is the starting point of the new program.
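Whether an executable needs the dynamic linker is recorded in the file itself: a dynamically linked program carries a PT_INTERP program header naming the interpreter, and a statically linked one has none. A rough sketch of that check, assuming a 32-bit ELF image in host byte order already read into memory (elf32_interp is a hypothetical helper):

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Return the interpreter path named by the PT_INTERP segment of a 32-bit ELF
 * image, or NULL if there is none or the image is malformed.
 * Assumption: the image uses the host's byte order. */
const char *elf32_interp(const unsigned char *image, size_t len)
{
    Elf32_Ehdr eh;

    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_phentsize != sizeof(Elf32_Phdr))
        return NULL;

    for (unsigned i = 0; i < eh.e_phnum; i++) {
        size_t off = (size_t)eh.e_phoff + (size_t)i * sizeof(Elf32_Phdr);
        Elf32_Phdr ph;

        if (off > len || len - off < sizeof ph)
            return NULL;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type != PT_INTERP)
            continue;
        /* the segment must lie inside the image and end with a NUL byte */
        if (ph.p_filesz == 0 || ph.p_offset > len ||
            len - ph.p_offset < ph.p_filesz ||
            image[ph.p_offset + ph.p_filesz - 1] != '\0')
            return NULL;
        return (const char *)(image + ph.p_offset);
    }
    return NULL;   /* no PT_INTERP: the file's own e_entry is the starting point */
}
```

A non-NULL result corresponds to the first bullet above (load ld first), and NULL corresponds to using the ELF entry address directly.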
3. Using GDB to track the processing of sys_execve in the kernel
1. Update menu
2. Look at the test.c file. An exec system call has been added, and its source code is similar to the earlier fork.
3. Look at the Makefile. The line gcc -o hello hello.c -m32 -static has been added; add the two lines as shown in the video.
4. Run make rootfs. There is now an additional exec command, and compared with fork it also prints Hello world!
5. Freeze the kernel, start GDB, load the symbol table, and connect with target remote
6. Set three breakpoints and start tracking
7. Run exec. Execution stops here, at the start of the system call.
8. Use list and keep tracking
9. Run to load_elf_binary and look at this part of the code
10. Compare the entry point address with that of the hello executable program
11. Step in and follow along; the code is found to be pushing onto the stack
3. Executable programs and the story of Zhuang Zhou dreaming of a butterfly
4. Analysis of dynamically linked executable program loading
(1) What does the kernel do during dynamic linking?
- The executable depends on dynamic link libraries, and those libraries may in turn depend on other libraries, so the dependencies form a tree.
- elf_interpreter: the dynamic linker (ld) is needed to load and resolve these libraries. The entry address returned is the entry point of the dynamic linker. It loads all the required dynamic link libraries, in effect a breadth-first traversal of the tree, and then hands control of the CPU to the entry point of the executable program (its starting position).
- Dynamic linking is carried out mainly by the dynamic linker, not by the kernel.
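The breadth-first traversal of the dependency tree can be sketched with a small in-memory table. This is a toy model only: the real dynamic linker reads dependencies from each file and also resolves symbols and applies relocations. struct lib, load_order, and the limits MAX_LIBS and MAX_NEEDED are assumptions of the sketch.

```c
#include <stddef.h>
#include <string.h>

#define MAX_LIBS   16
#define MAX_NEEDED 4

/* Hypothetical dependency table entry: an object and the names it needs. */
struct lib {
    const char *name;
    const char *needed[MAX_NEEDED];   /* direct dependencies, NULL-terminated if fewer */
};

static const struct lib *find_lib(const struct lib *table, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/* Fill order[] with library names in breadth-first order, starting from the
 * direct dependencies of root. Each library appears once.
 * Returns the number of names written (at most MAX_LIBS). */
size_t load_order(const struct lib *table, size_t n, const char *root,
                  const char *order[MAX_LIBS])
{
    const char *queue[MAX_LIBS + 1];
    size_t head = 0, tail = 0, count = 0;

    queue[tail++] = root;
    while (head < tail) {
        const struct lib *cur = find_lib(table, n, queue[head++]);

        if (cur == NULL)
            continue;                  /* no recorded dependencies */
        for (size_t i = 0; i < MAX_NEEDED && cur->needed[i] != NULL; i++) {
            const char *dep = cur->needed[i];
            int seen = 0;

            for (size_t j = 0; j < count; j++)
                if (strcmp(order[j], dep) == 0)
                    seen = 1;
            if (!seen && count < MAX_LIBS) {
                order[count++] = dep;  /* "load" it, then visit its own needs later */
                queue[tail++] = dep;
            }
        }
    }
    return count;
}
```

With a table in which main needs libshlibexample.so and libc.so.6, and both of those need further libraries, the direct dependencies of main come out first and the deeper levels follow.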
Linux Kernel Analysis--Seventh week study note 20135308