I. Theoretical Knowledge
In Linux, an executable program is produced from C source code by preprocessing, compiling, and linking it.
The object files produced by compilation contain at least the compiled machine code and data, plus information needed for linking, such as the symbol table, debugging information, and string tables. On Linux, executable files are now mostly in the ELF format (the counterpart of the PE format on Windows).
For a detailed description of the ELF format, see http://www.muppetlabs.com/~breadbox/software/ELF.txt; a Chinese version is available at http://www.xfocus.net/articles/200105/174.html
Linking is the process of collecting and organizing the different pieces of code and data a program needs so that the program can be loaded into memory and executed. The linking process is divided into two steps: 1. space and address allocation, 2. symbol resolution and relocation.
In Linux, a new program is run through the execve system call, usually from a freshly created process. The kernel entry point for execve is sys_execve, inside which the executable file format is parsed. The relevant kernel code is in search_binary_handler, which looks for the handler module that matches the file format.
For ELF files, the call retval = fmt->load_binary(bprm) in search_binary_handler actually executes load_elf_binary, which loads the file according to the ELF format. Here we can also see how Linux supports a variety of executable file formats: the information for each format is stored in a linked list, and load_binary is a function pointer to the loading routine for that format. To support a new executable format, you only need to register a new format struct with the linked list. This resembles the observer pattern and is very extensible.
II. Experimental Process
Open the virtual machine in the lab environment and run the following commands in the shell to get the code for this experiment, then compile and run it:
cd LinuxKernel
rm menu -rf
git clone https://github.com/mengning/menu.git
cd menu
mv test_exec.c test.c
make rootfs
This rebuilds the root file system and starts the kernel in a QEMU window.
Close the QEMU window. In the shell window, cd back to the LinuxKernel directory, then start the kernel with the following command, which halts the CPU before it runs any code so that a debugger can attach:
qemu -kernel linux-3.18.6/arch/x86/boot/bzImage -initrd rootfs.img -s -S
Next, split off a new shell window horizontally and start gdb with the following commands:
gdb
(gdb) file linux-3.18.6/vmlinux
(gdb) target remote:1234
Then set a breakpoint at the entry of the sys_execve system call:
(gdb) b sys_execve
Continue execution in gdb, then enter exec in the QEMU window; the system stops at the breakpoint set above.
Next we can step through the kernel code of sys_execve, or set the following breakpoints
b load_elf_binary
b start_thread
to trace the complete creation and startup code of the new process.
III. Summary
The Linux system can start a new program via the execve API, which invokes the sys_execve system call. sys_execve is responsible for replacing the process's code and data with those of the new program: it opens the executable file, loads the dynamic linker if the program depends on shared libraries, requests new memory space, and finally executes start_thread(regs, elf_entry, bprm->p), which sets new_ip and new_sp. Once the code and data have been replaced, the system call returns and execution continues in the new program's code.
Linux Kernel Analysis 7