Linux kernel analysis: loading and starting executable programs

Source: Internet
Author: User

I. Content analysis

1. Creation of executable files

(1) Preprocessing stage

Preprocessing reads the source code, looks for statements and macro definitions that contain preprocessing directives, and transforms the source accordingly. It also removes comments and redundant whitespace from the program. Preprocessing directives fall into four main groups:

Macro definition directives: the preprocessor replaces each macro identifier that appears in the source code with the value from its definition. Two commonly used forms:

    // Declare an identifier; by convention macro names are all uppercase
    #define MAX_NUM
    // #define directive with parameters (function-like macro)
    #define CUBE(x) ((x) * (x) * (x))
    int i, num = 1;
    i = CUBE(num);

Conditional compilation directives: by defining different macros, the programmer determines which code is compiled and which is skipped.

Header file inclusion directives: the #include preprocessing directive expands the included file at the point of the directive. Header files are included in two forms: #include <my.h> and #include "my.h".

Special symbols: the preprocessor recognizes some predefined symbols, such as __FILE__, __LINE__, and __TIME__, and replaces each occurrence in the source program with the appropriate value.

The above stage corresponds to: gcc -E -o hello.cpp hello.c -m32
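The four groups of directives can be seen together in a small sketch. The value of MAX_NUM, the DEMO_VERBOSE switch, and the helper function names are assumptions made for illustration; they do not come from the original experiment.

```c
#include <stdio.h>
#include <string.h>

/* Object-like macro: each MAX_NUM in the source is replaced by 10
 * (illustrative value, not taken from the original text). */
#define MAX_NUM 10

/* Function-like macro; the parentheses keep CUBE(1 + 2) correct. */
#define CUBE(x) ((x) * (x) * (x))

/* Conditional compilation: only one branch reaches the compiler.
 * DEMO_VERBOSE is a hypothetical switch, e.g. gcc -DDEMO_VERBOSE. */
#ifdef DEMO_VERBOSE
#define BUILD_MODE "verbose"
#else
#define BUILD_MODE "quiet"
#endif

/* Returns the cube of x by way of the macro. */
int cube_of(int x)
{
    return CUBE(x);
}

/* Returns the mode string selected at preprocessing time. */
const char *build_mode(void)
{
    return BUILD_MODE;
}

/* Returns the source line number of this statement (predefined macro). */
int current_line(void)
{
    return __LINE__;
}
```

Running gcc -E on a file like this shows the expanded text: the #include lines are replaced by the header contents, CUBE(x) is spelled out, and only one of the two BUILD_MODE definitions survives.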

(2) Compile stage

At this stage, GCC first checks that the code is well formed and free of syntax errors, and works out what the code actually has to do. Once the checks pass, GCC translates the code into assembly language.

Corresponds to: gcc -x cpp-output -S -o hello.s hello.cpp -m32

(3) Assembly stage

The ".s" file generated in the compile stage is converted into an object file, which is a binary file.

Corresponds to: gcc -x assembler -c hello.s -o hello.o -m32

(4) Link stage

Function libraries are generally divided into static libraries and dynamic libraries. With a static library, all of the library code that is used is added to the executable at link time, so the resulting file is relatively large but no longer needs the library file at run time. With a dynamic library, the library code is not added to the executable at link time; instead, the runtime linker loads the library when the program executes. Dynamic libraries usually have the suffix ".so", and GCC links against dynamic libraries by default.

gcc -o hello hello.o -m32 (dynamic linking)

gcc -o hello.static hello.o -m32 -static (static linking)

2. ELF executable files

(1) There are three categories of object files:

Relocatable file: holds code and data suitable for linking with other object files to create an executable file or a shared object file.

Executable file: holds a program suitable for execution; the file specifies how exec(BA_OS) creates the program's process image.

Shared object file: holds code and data suitable for linking by two linkers, namely the link editor ld(SD_CMD) and the dynamic linker.

(2) ELF file format

View the ELF file header with readelf -h hello.

ELF file header

In this 32-bit setup, the ELF executable is loaded starting at 0x8048000 by default. The Entry point address field in the ELF header is the actual entry of the program: execution starts there when a process that has just loaded the executable begins to run.
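To make the Entry point address field concrete, here is a minimal sketch that pulls it out of the first bytes of an ELF file. The function name elf_entry_point is hypothetical, and the sketch assumes a little-endian file; it relies on the fact that e_entry sits at byte offset 24 in both the 32-bit and the 64-bit header layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Extracts the entry point from an in-memory ELF header.
 * Returns 0 on success, -1 if buf is not a little-endian ELF header. */
int elf_entry_point(const unsigned char *buf, size_t len, uint64_t *entry)
{
    size_t width;
    uint64_t value = 0;

    /* e_ident starts with the magic bytes 0x7f 'E' 'L' 'F'. */
    if (len < 24 || memcmp(buf, "\177ELF", 4) != 0)
        return -1;
    /* e_ident[5] (EI_DATA): 1 means little-endian. */
    if (buf[5] != 1)
        return -1;
    /* e_ident[4] (EI_CLASS): 1 = 32-bit, 2 = 64-bit. */
    if (buf[4] == 1)
        width = 4;
    else if (buf[4] == 2)
        width = 8;
    else
        return -1;
    if (len < 24 + width)
        return -1;

    /* e_entry follows e_ident (16), e_type (2), e_machine (2), e_version (4). */
    for (size_t i = 0; i < width; i++)
        value |= (uint64_t)buf[24 + i] << (8 * i);
    *entry = value;
    return 0;
}
```

One way to use it is to fread the first 64 bytes of hello into a buffer, pass the buffer to this function, and compare the result with the Entry point address line printed by readelf -h hello.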

3. Execution environment of executable programs and dynamic linking

Programs are normally started from a shell, but the experiment uses the execve system call directly. For example, $ ls -l /usr/bin lists the directory information under /usr/bin; here -l and /usr/bin are command-line arguments.

The shell itself does not limit the number of command-line arguments; how many are accepted is determined by the command itself, for example int main(int argc, char *argv[]) or int main(int argc, char *argv[], char *envp[]).

The shell calls execve to pass the command-line arguments and environment variables to the main function of the executable: int execve(const char *filename, char *const argv[], char *const envp[]). The exec* library functions are wrappers around execve.
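The way a shell hands argv and envp to a new program can be sketched in a few lines. This is not the shell's actual code; run_program is a hypothetical helper that forks, calls execve in the child with the given argument and environment arrays, and returns the child's exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs path in a child process with the given argv and envp.
 * Returns the child's exit status, or -1 on failure. */
int run_program(const char *path, char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                  /* fork failed */
    if (pid == 0) {
        execve(path, argv, envp);   /* returns only if it fails */
        _exit(127);
    }
    /* Parent: wait for the child and report how it exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both arrays must end with a NULL pointer. The strings in argv become the argv[] of the new program's main, and the strings in envp (of the form NAME=value) become its environment.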

Dynamic linking is divided into load-time dynamic linking and run-time dynamic linking. Load-time dynamic linking loads the modules referenced from the dynamic library into memory when the executable itself is loaded. Run-time dynamic linking loads the corresponding module from the dynamic library only when the running program actually calls into it.

II. Experiment content

After updating menu, overwrite test.c with test_exec.c and then open test.c. You can see that an exec command has been added, which runs a program. Its function body is:

    int Exec(int argc, char *argv[])
    {
        int pid;
        /* fork another process */
        pid = fork();
        if (pid < 0)
        {
            /* error occurred */
            fprintf(stderr, "Fork failed!");
            exit(-1);
        }
        else if (pid == 0)
        {
            /* child process */
            execlp("/bin/ls", "ls", NULL);
        }
        else
        {
            /* parent process */
            /* parent will wait for the child to complete */
            wait(NULL);
            printf("Child complete!");
            exit(0);
        }
    }

The Makefile has also been changed. hello.c is compiled during the build, and init and hello are placed into rootfs.img, so running the exec command is equivalent to loading the hello program automatically.

The preparation before debugging with GDB can be summarized as:

    $ cd LinuxKernel/
    $ rm menu -rf
    $ git clone https://github.com/mengning/menu.git
    $ cd menu
    $ mv test_exec.c test.c

The Makefile shows that the experiment is statically compiled:

    rootfs:
        gcc -o init linktable.c menu.c test.c -m32 -static -lpthread
        gcc -o hello hello.c -m32 -static
After that, the trace analysis is done through GDB.

  
After setting breakpoints with b sys_execve, b load_elf_binary, and so on, enter exec in MenuOS to start the debugging analysis. GDB first stops at sys_execve.

It then stops at the start_thread function.

You can now check whether the entry address is consistent by comparing the output of po new_ip in GDB with the output of readelf -h hello in a new window.

Both the entry address of hello and the value of new_ip are 0x8048d0a, which shows that the program being started is hello.

Continuing to step, you can see new_ip copied to regs->ip. Let execution continue and the MenuOS interface prints the corresponding hello world output.
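What start_thread does to the saved registers can be modelled in user space. The struct and function below are hypothetical simplifications: the kernel's struct pt_regs has many more fields, and the real start_thread also resets segment registers and flags. Only the two assignments observed in GDB are kept.

```c
/* Simplified stand-in for the saved user-mode registers. */
struct regs_model {
    unsigned long ip;   /* instruction pointer restored on return to user mode */
    unsigned long sp;   /* user-mode stack pointer */
};

/* Models the effect of start_thread: point the saved registers at the
 * new program's entry address and at its freshly built user stack. */
void start_thread_model(struct regs_model *regs,
                        unsigned long new_ip, unsigned long new_sp)
{
    regs->ip = new_ip;
    regs->sp = new_sp;
}
```

Because the registers are restored from this saved copy when the system call returns, changing ip here is enough to make the process resume at the entry address of hello instead of at the instruction after the execve call.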

III. Summary

Creating an executable involves four stages: preprocessing, compiling, assembling, and linking.
We studied the ELF file format, including the ELF file header, the program header table, the text section, and other components.
By tracing the source code, we came to understand the approximate flow of loading an executable program. The execve system call enters kernel mode, where the executable named in the call is loaded and overwrites the program of the current process. When the execve system call returns, execution resumes at the starting point of the new executable.
With static linking, elf_entry points to the entry point specified in the executable's ELF header (an address of the form 0x8048***, where the program's startup code begins before main is called). If the program depends on dynamically linked libraries, elf_entry points to the starting point of the dynamic linker. Dynamic linking is mainly done by the dynamic linker ld.
An executable is an ordinary file that describes how to initialize a new execution context, that is, how to start a new computation. There are many executable formats, and the kernel keeps a linked list of them. During initialization, the handler for each supported format is added to the list; when a file has to be parsed, the kernel walks the list until it finds a handler that matches and can parse the file.
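The list of format handlers can be sketched as a user-space model. All names below are hypothetical; in the kernel the corresponding roles are played by struct linux_binfmt, register_binfmt, and search_binary_handler, which do considerably more than this sketch.

```c
#include <stddef.h>
#include <string.h>

/* One entry per supported executable format. */
struct fmt_handler {
    const char *name;
    /* Returns 1 if this handler recognizes the file header. */
    int (*matches)(const unsigned char *hdr, size_t len);
    struct fmt_handler *next;
};

static struct fmt_handler *formats;   /* head of the handler list */

/* Called at initialization time to add a handler to the list. */
void register_format(struct fmt_handler *h)
{
    h->next = formats;
    formats = h;
}

/* Walks the list and returns the first handler that accepts the header. */
const struct fmt_handler *find_handler(const unsigned char *hdr, size_t len)
{
    for (struct fmt_handler *h = formats; h != NULL; h = h->next)
        if (h->matches(hdr, len))
            return h;
    return NULL;
}

/* ELF files start with the magic bytes 0x7f 'E' 'L' 'F'. */
static int is_elf(const unsigned char *hdr, size_t len)
{
    return len >= 4 && memcmp(hdr, "\177ELF", 4) == 0;
}

/* Scripts start with "#!". */
static int is_script(const unsigned char *hdr, size_t len)
{
    return len >= 2 && hdr[0] == '#' && hdr[1] == '!';
}

struct fmt_handler elf_format = { "elf", is_elf, NULL };
struct fmt_handler script_format = { "script", is_script, NULL };
```

After register_format(&elf_format) and register_format(&script_format), calling find_handler on the first bytes of hello would return the ELF handler, while a file that matches no handler yields NULL, which corresponds to the kernel refusing to execute it.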
When an executable program is launched from the shell, a new process is created. It obtains the desired execution context by overwriting the process environment inherited from the parent process (that is, the shell process) and clearing the user-mode stack.
Command-line arguments and environment variables are passed by the shell to execve, and by execve, as system call parameters, to sys_execve. Finally, sys_execve copies them when it initializes the stack of the new program.
load_elf_binary -> start_thread(...) sets the starting point of the new program by modifying the saved EIP value on the kernel stack. If the new program is dynamically linked, the libraries it requires must be loaded as well. The dynamic linker ld is responsible for this, and loading the dynamic libraries resembles a breadth-first traversal of a graph. Once loading is complete, ld hands control of the CPU to the executable, which then starts to run.
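The breadth-first comparison can be illustrated with a small sketch. It is not the dynamic linker's actual algorithm: the dependency table, MAX_OBJS, and bfs_load_order are assumptions made for the example, with object 0 standing for the executable and the other indices for shared libraries.

```c
#define MAX_OBJS 16

/* deps[i][j] != 0 means object i needs object j.
 * Fills order[] with the breadth-first load order starting from start
 * and returns the number of objects visited. n must not exceed MAX_OBJS. */
int bfs_load_order(int n, int deps[][MAX_OBJS], int start, int order[])
{
    int seen[MAX_OBJS] = {0};
    int head = 0, tail = 0;

    /* order[] doubles as the BFS queue: entries before head are done. */
    order[tail++] = start;
    seen[start] = 1;
    while (head < tail) {
        int cur = order[head++];
        for (int j = 0; j < n; j++) {
            if (deps[cur][j] && !seen[j]) {
                seen[j] = 1;        /* each object is loaded only once */
                order[tail++] = j;
            }
        }
    }
    return tail;
}
```

With an executable that needs libraries 1 and 2, both of which need library 3, the traversal visits 0, 1, 2, 3: the direct dependencies are handled first, and the shared dependency is picked up once even though two libraries require it.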
Liu Shuai
Original work. Please credit the source when reproducing.
"Linux kernel Analysis" MOOC course http://mooc.study.163.com/course/USTC-1000029000
