Original work. When reposting, please credit the source and the "Linux kernel analysis" MOOC course: http://mooc.study.163.com/course/USTC-1000029000
The main content of this week is the loading of executable programs.
First, compiling, linking, and the ELF executable file format
1. This diagram concisely describes how an executable program is produced.
The approximate process is this:
.c files are compiled into assembly code (.asm; gcc writes this as a .s file),
which is then assembled into object code (.o),
which is then linked into the executable file a.out.
The executable file can then be loaded into memory and executed.
2. An example (compiling and linking a Hello World C file):
The specific process:
3. ELF executable file format
(1) The three main types of object file in ELF
Relocatable files: .o files
Executable files
Shared object files: .so files (note: link editor = static linker)
(2) File format
(3) View the header of the executable file (readelf command)
The header gives the version number, the OS/ABI, the ABI version, the file type (executable or object file), and the entry address (the starting point of the program).
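To make the header fields listed above concrete, here is a minimal sketch that pulls the same fields out of a 32-bit ELF header using the definitions in <elf.h>. The struct elf_summary type and the summarize_elf32 function are hypothetical names for this illustration, and the sketch assumes the file's byte order matches the host's.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical summary type for this sketch; not part of any library. */
struct elf_summary {
    unsigned char os_abi;      /* e_ident[EI_OSABI] */
    unsigned char abi_version; /* e_ident[EI_ABIVERSION] */
    uint16_t type;             /* ET_REL, ET_EXEC, ET_DYN, ... */
    uint32_t entry;            /* e_entry: address where execution starts */
};

/* Fill *out from a 32-bit ELF header held in buf.
 * Returns 0 on success, -1 if buf is too short or is not a 32-bit ELF image.
 * Assumption: the file's byte order matches the host's. */
int summarize_elf32(const unsigned char *buf, size_t len, struct elf_summary *out)
{
    Elf32_Ehdr hdr;

    assert(buf != NULL && out != NULL);
    if (len < sizeof hdr)
        return -1;
    memcpy(&hdr, buf, sizeof hdr);            /* copy to avoid unaligned access */

    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                            /* missing "\177ELF" magic */
    if (hdr.e_ident[EI_CLASS] != ELFCLASS32)
        return -1;                            /* not a 32-bit object */

    out->os_abi = hdr.e_ident[EI_OSABI];
    out->abi_version = hdr.e_ident[EI_ABIVERSION];
    out->type = hdr.e_type;
    out->entry = hdr.e_entry;
    return 0;
}
```

A caller would read the first bytes of a file into a buffer and pass them in; the entry field corresponds to the entry point address that readelf prints.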
(4) A statically linked ELF executable and the process address space
The entry address is 0x (not unique);
A 32-bit x86 system has a 4 GB process address space (the top 1 GB is used by the kernel; the 3 GB below it is accessible from user mode);
When an ELF executable file is loaded into memory:
The code segment and data segment are loaded first (by default, starting from the 0x position);
The loaded image begins with the ELF header information, whose size varies; the actual entry point of the program can be determined from the size of the header;
When a process that has just loaded an executable file is started, execution begins from this location.
Second, the process of loading an executable file using the exec* library functions
Note: A statically linked program can run as long as it is passed its command-line arguments and environment variables, but the vast majority of executable programs also depend on dynamic-link libraries.
Dynamic linking is divided into load-time dynamic linking (performed when the executable is loaded) and run-time dynamic linking.
(1) The example from the course video
In the main function, dlopen is called to load the shared library dynamically;
A function pointer "*func" is declared;
The function is looked up by name and its address is assigned to the pointer.
The functions defined in the shared library can then be called through the pointer.
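The run-time loading steps above can be sketched as follows. This is not the code from the video: call_from_shared_lib is a hypothetical helper, and it assumes the library function has the signature int f(void). The lookup step uses dlsym, the standard companion to dlopen. On glibc versions before 2.34 the program must be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Load the shared library at lib_path at run time, look up the function
 * named func_name, call it, and store its return value in *result.
 * Assumption: the function has the signature int f(void).
 * Returns 0 on success, -1 if the library or the symbol cannot be found. */
int call_from_shared_lib(const char *lib_path, const char *func_name, int *result)
{
    int (*func)(void);          /* the function pointer the text calls "*func" */
    const char *err;
    void *handle;

    assert(result != NULL);
    handle = dlopen(lib_path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror();                  /* clear any stale error state */
    /* Conversion idiom from the dlsym manual page: store the void * result
     * through a void ** view of the function pointer. */
    *(void **)(&func) = dlsym(handle, func_name);
    err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym: %s\n", err);
        dlclose(handle);
        return -1;
    }

    *result = func();           /* call into the shared library */
    dlclose(handle);
    return 0;
}
```

Any library path and function name passed to this helper are up to the caller; the names used in the course example are not reproduced here.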
(2) How to compile and execute?
Third, a few key points about loading executable programs, before the trace analysis
1. How the kernel handles sys_execve:
do_execve -> do_execve_common -> exec_binprm // call order
2. The observer and the observed
3. The internal processing of sys_execve
This is the entry point of the system call:
do_execve:
Line 1550 is a pointer into user space;
Line 1553 wraps the command-line arguments in a structure.
do_execve_common:
There are several key places:
(1) do_open_exec:
Opens the executable file to be loaded and loads its file header.
A struct bprm is created:
Lines 1505 and 1509 copy the environment variables and the command-line arguments into the struct;
Processing of the executable file starts at line 1513.
(2) exec_binprm:
The key code is search_binary_handler (it looks for the handler function for this executable):
The more critical code within it:
This loop looks for a handler that can parse the current executable file.
Line 1374 calls the executable's load handler, which for an ELF file is actually the load_elf_binary function:
Searching the source for it turns up:
Line 48: the function prototype;
Line 84: the assignment statement;
Line 571: the implementation of the function.
The assignment code:
This is a struct variable.
How this struct variable is made known to the kernel's processing code:
This function registers the struct variable in a linked list,
so when an ELF file is encountered, the struct variable is found by searching that list.
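The register-then-search mechanism described above can be modelled in user space. The sketch below is not kernel code: struct binfmt_model, register_format, find_handler, and the two handlers are hypothetical names that only mimic the idea of a list of format handlers, each tried in turn until one accepts the file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified user-space model of a binary format handler (not kernel code). */
struct binfmt_model {
    const char *name;
    /* Returns 0 if this handler recognises the image, -1 otherwise. */
    int (*load_binary)(const unsigned char *image, size_t len);
    struct binfmt_model *next;
};

static struct binfmt_model *formats;   /* head of the registered-format list */

/* Append fmt to the list: the role the registration function plays in the text. */
void register_format(struct binfmt_model *fmt)
{
    struct binfmt_model **p = &formats;

    assert(fmt != NULL);
    while (*p != NULL)
        p = &(*p)->next;
    fmt->next = NULL;
    *p = fmt;
}

/* Try each registered handler in turn until one accepts the image.
 * Returns the accepting format, or NULL if none does. */
const struct binfmt_model *find_handler(const unsigned char *image, size_t len)
{
    for (struct binfmt_model *fmt = formats; fmt != NULL; fmt = fmt->next)
        if (fmt->load_binary(image, len) == 0)
            return fmt;
    return NULL;
}

/* Accepts images that start with the ELF magic bytes. */
static int load_elf_model(const unsigned char *image, size_t len)
{
    return (len >= 4 && memcmp(image, "\177ELF", 4) == 0) ? 0 : -1;
}

/* Accepts images that start with "#!". */
static int load_script_model(const unsigned char *image, size_t len)
{
    return (len >= 2 && image[0] == '#' && image[1] == '!') ? 0 : -1;
}

struct binfmt_model elf_format_model = { "elf", load_elf_model, NULL };
struct binfmt_model script_format_model = { "script", load_script_model, NULL };
```

The searching code never names a specific format; it only walks the list, which is why new formats can be added by registration alone.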
The more critical code in the function's implementation:
Note: The ELF executable is mapped to the 0x address by default.
If this branch is entered, the executable depends on other dynamic libraries (it is not statically linked);
load_elf_interp is called to load the dynamic linker, and the entry becomes the starting point of the dynamic linker.
If it is statically linked, the value can be assigned directly:
Searching for start_thread turns up two possible definitions:
Summary:
In the case of static linking, elf_entry is the entry address specified in the executable's ELF header, which is where the new program starts executing (loosely, the location of main; strictly, the startup code, typically _start, which then calls main);
In the case of dynamic linking, where the program depends on other dynamic libraries, elf_entry is the starting point of the dynamic linker.
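Whether an executable takes the static or the dynamic path is visible in the file itself: a dynamically linked ELF file carries a PT_INTERP program header naming its dynamic linker. The following is a minimal user-space sketch of that check for 32-bit images, not the kernel's code; find_interp32 is a hypothetical name, and host byte order is assumed.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan the program headers of a 32-bit ELF image for PT_INTERP.
 * Returns a pointer to the interpreter path inside image, or NULL if there
 * is none (e.g. a statically linked file) or the image is malformed.
 * Assumption: the file's byte order matches the host's. */
const char *find_interp32(const unsigned char *image, size_t len)
{
    Elf32_Ehdr eh;

    assert(image != NULL);
    if (len < sizeof eh)
        return NULL;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS32)
        return NULL;
    if (eh.e_phentsize != sizeof(Elf32_Phdr))
        return NULL;

    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        Elf32_Phdr ph;
        size_t off = (size_t)eh.e_phoff + (size_t)i * sizeof ph;

        if (off > len || len - off < sizeof ph)
            return NULL;                 /* header table runs past the image */
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type != PT_INTERP)
            continue;

        /* The path must lie inside the image and be NUL-terminated. */
        if (ph.p_filesz == 0 || ph.p_offset > len ||
            len - ph.p_offset < ph.p_filesz)
            return NULL;
        if (image[(size_t)ph.p_offset + ph.p_filesz - 1] != '\0')
            return NULL;
        return (const char *)(image + ph.p_offset);
    }
    return NULL;                         /* no PT_INTERP entry found */
}
```

A NULL result for a well-formed image corresponds to the static case in the summary; a non-NULL result corresponds to the case where the entry is the dynamic linker's starting point.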
Finally, we use GDB to trace and analyze sys_execve, the kernel function that handles the execve system call
(1) First remove the menu directory and clone a fresh copy; after entering menu, overwrite test.c and then view it
(2) test.c
An additional MenuConfig line (exec) has been added;
Compared with the fork code, the only difference is one added execlp call.
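The fork-plus-execlp pattern described above can be sketched as follows. This is not the test.c from the course repository: exec_and_wait is a hypothetical helper that takes the program name as a parameter, and the exit code 127 for a failed exec is a convention chosen for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, replace its image with `file` via execlp, and wait for it.
 * Returns the child's exit status, 127 if the exec failed in the child,
 * or -1 on a fork/wait error or abnormal termination. */
int exec_and_wait(const char *file)
{
    int status;
    pid_t pid;

    assert(file != NULL);
    pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: on success execlp does not return, because the process
         * image has been replaced by the new program. */
        execlp(file, file, (char *)NULL);
        perror("execlp");
        _exit(127);
    }

    /* Parent: wait for the child to finish. */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The execlp call in the child is the point at which the kernel's execve handling, traced in the following steps, is entered.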
(3) Makefile
You can see that hello.c is now compiled as well;
When the root file system is generated, both init and hello are put into rootfs.img.
(4) make rootfs
An exec command has been added.
(5) Start tracing with GDB
(6) Set breakpoints
(7) Start execution
After c (continue) was entered three times, the program finished running.
(8) Run exec
Execution reached this point and stopped.
Note: For reasons I could not determine, the Shiyanlou lab environment froze completely every time I reached this step, even after restarting the experiment. The following steps are therefore taken from the video (with some detail omitted).
(9) list, then continue stepping with s
The trace reaches do_execve;
(10) Continue with c
Execution reaches load_elf_binary;
list, then continue stepping with n:
At this point the trace reaches start_thread.
Question: Where exactly does new_ip point?
(11) Split the window horizontally again, then run the readelf command
new_ip turns out to point to the entry point address.
(12) After closing the new window, press n and then keep stepping with s
You can see that new_ip and new_sp are assigned, which sets up a new stack.
(13) Finish with c
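The assignments seen in step (12) can be modelled in a few lines. This is a heavily reduced user-space sketch, not the kernel's start_thread: struct regs_model, start_thread_model, and choose_entry are hypothetical names, and the real function also sets up other register state that is left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for the saved user-mode registers. */
struct regs_model {
    unsigned long ip;   /* instruction pointer used on return to user mode */
    unsigned long sp;   /* stack pointer used on return to user mode */
};

/* Model of the assignments seen in the trace: the saved registers are
 * overwritten so that the return to user mode lands in the new program. */
void start_thread_model(struct regs_model *regs,
                        unsigned long new_ip, unsigned long new_sp)
{
    assert(regs != NULL);
    regs->ip = new_ip;
    regs->sp = new_sp;
}

/* Pick the address handed over as new_ip: the dynamic linker's entry if the
 * executable needs one, otherwise the executable's own entry point. */
unsigned long choose_entry(int has_interpreter,
                           unsigned long exe_entry, unsigned long interp_entry)
{
    return has_interpreter ? interp_entry : exe_entry;
}
```

Because execve replaces the register state saved at system call entry, the process does not return to the instruction after the call in the old program; it resumes at new_ip with the stack at new_sp.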
Summary: what does the kernel do in the dynamic linking process?
1. The dependencies among dynamic link libraries form a graph.
2. Question: Is the kernel responsible for loading the dynamic link libraries that the executable program relies on?
Answer: No. This is done by the dynamic linker (part of libc), that is, in user mode.
3. Loading the dynamic link libraries is a graph traversal.
4. Dynamic linking: ld dynamically links the executable program, completes the various tasks, and then transfers control to the entry point of the executable program, which then begins executing.
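The graph-traversal view of library loading can be illustrated with a small sketch. This is not the dynamic linker's code: struct dep_graph and load_order are hypothetical, dependencies are given as an adjacency matrix, and breadth-first order is used here as one reasonable choice of traversal.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LIBS 8

/* Hypothetical dependency graph: needs[i][j] != 0 means that object i
 * depends on object j. Object 0 would be the executable itself. */
struct dep_graph {
    size_t count;
    int needs[MAX_LIBS][MAX_LIBS];
};

/* Breadth-first traversal starting from `root`. Writes the visit order into
 * order[] and returns the number of objects visited. Each object is visited
 * once, even if several others depend on it. */
size_t load_order(const struct dep_graph *g, size_t root, size_t order[MAX_LIBS])
{
    int seen[MAX_LIBS] = {0};
    size_t head = 0, tail = 0;          /* order[] doubles as the BFS queue */

    assert(g != NULL && g->count <= MAX_LIBS && root < g->count);
    order[tail++] = root;
    seen[root] = 1;
    while (head < tail) {
        size_t cur = order[head++];
        for (size_t j = 0; j < g->count; j++) {
            if (g->needs[cur][j] && !seen[j]) {
                seen[j] = 1;
                order[tail++] = j;
            }
        }
    }
    return tail;
}
```

The seen[] array is what makes this a graph traversal rather than a tree walk: a library shared by two dependents is still loaded only once.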
Linux 7th Experiment--Xie Fei Sail