Tracing how the Linux kernel loads and starts an executable with GDB


Kuregaku Shandong Normal University

"Linux kernel Analysis" MOOC course http://mooc.study.163.com/course/USTC-1000029000

Purpose of the experiment: use GDB to trace a simple executable program and observe how the Linux kernel loads and starts statically and dynamically linked programs, then summarize the kernel's executable loading process.

I. Experimental procedure

1. Write a simple program that uses an exec function to create a process.

2. Start GDB and set the breakpoints.

3. Start tracing and stop at the first breakpoint.

(At this point the main program has not yet created a child process.)

4. Continue stepping from this breakpoint.

5. Reach the second breakpoint and list the source around it.

6. Trace to the point where new_ip is loaded and inspect its address.

7. The IP value loaded here is the same as the entry address of the program.

8. End the trace. The other breakpoints can be examined in the same way.
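The author's test program from step 1 is not shown, so the following is only a minimal sketch of what such a program might look like. The function name run_program is hypothetical; it forks a child, replaces the child's image with execve(), and lets the parent wait for the result.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, replace its image with `path`
 * via execve(), and return the child's exit status (-1 on error). */
int run_program(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid;
    int status;

    assert(path != NULL && argv != NULL);

    pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: on success execve() never returns. */
        execve(path, argv, envp);
        perror("execve");
        _exit(127);            /* conventional "could not exec" status */
    }

    /* Parent: wait until the new program has finished. */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_program("/bin/ls", args, env) from main(), with args set to { "ls", "-l", NULL } and env to { NULL }, gives a small program whose execve() call can be followed into the kernel. Kernel functions discussed later in this article, such as sys_execve, do_execve, search_binary_handler and start_thread, are natural places for breakpoints.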

II. Loading and running executable files

1. The kernel entry point of the execve() system call is sys_execve(). The code is as follows:

    int sys_execve(struct pt_regs regs)
    {
        int error;
        char *filename;

        /* Copy the first argument from user space (the path of the
           executable file) into the kernel */
        filename = getname((char __user *) regs.ebx);
        error = PTR_ERR(filename);
        if (IS_ERR(filename))
            goto out;
        error = do_execve(filename,
                (char __user * __user *) regs.ecx,
                (char __user * __user *) regs.edx,
                &regs);
        if (error == 0) {
            task_lock(current);
            current->ptrace &= ~PT_DTRACE;
            task_unlock(current);
            /* Make sure we don't return using sysenter.. */
            set_thread_flag(TIF_IRET);
        }
        /* Free the memory */
        putname(filename);
    out:
        return error;
    }

This shows that, on a system call, the arguments are passed in the registers EBX, ECX, EDX, ESI, EDI and EBP, in that order.
Note that the first argument is the path of the executable file, the second is the argument vector (argv), and the third is the environment vector (envp).

2. do_execve() is the main part of this system call. Its code is as follows:

    int do_execve(char * filename,
        char __user *__user *argv,
        char __user *__user *envp,
        struct pt_regs * regs)
    {
        /* linux_binprm: holds the parameters of the executable being loaded */
        struct linux_binprm *bprm;
        struct file *file;
        unsigned long env_p;
        int retval;

        retval = -ENOMEM;
        bprm = kzalloc(sizeof(*bprm), GFP_KERNEL);
        if (!bprm)
            goto out_ret;

        /* Open the executable file in the kernel */
        file = open_exec(filename);
        retval = PTR_ERR(file);
        /* If the open failed */
        if (IS_ERR(file))
            goto out_kfree;

        sched_exec();

        bprm->file = file;
        bprm->filename = filename;
        bprm->interp = filename;

        /* Initialize bprm, mainly bprm->mm */
        retval = bprm_mm_init(bprm);
        if (retval)
            goto out_file;

        /* Count the arguments */
        bprm->argc = count(argv, MAX_ARG_STRINGS);
        if ((retval = bprm->argc) < 0)
            goto out_mm;

        /* Count the environment variables */
        bprm->envc = count(envp, MAX_ARG_STRINGS);
        if ((retval = bprm->envc) < 0)
            goto out_mm;

        retval = security_bprm_alloc(bprm);
        if (retval)
            goto out;

        /* Read the first 128 bytes of the file into bprm->buf */
        retval = prepare_binprm(bprm);
        if (retval < 0)
            goto out;

        /* Copy the first parameter, filename */
        retval = copy_strings_kernel(1, &bprm->filename, bprm);
        if (retval < 0)
            goto out;

        /* bprm->exec: start address of the parameters (top down) */
        bprm->exec = bprm->p;
        /* Copy the environment variables */
        retval = copy_strings(bprm->envc, envp, bprm);
        if (retval < 0)
            goto out;

        /* Start address of the stored environment variables */
        env_p = bprm->p;
        /* Copy the arguments of the executable file */
        retval = copy_strings(bprm->argc, argv, bprm);
        if (retval < 0)
            goto out;
        /* Length of the argument strings */
        bprm->argv_len = env_p - bprm->p;

        /* Find a suitable loader in the formats list */
        retval = search_binary_handler(bprm, regs);
        if (retval >= 0) {
            /* execve success */
            free_arg_pages(bprm);
            security_bprm_free(bprm);
            acct_update_integrals(current);
            kfree(bprm);
            return retval;
        }

    out:
        free_arg_pages(bprm);
        if (bprm->security)
            security_bprm_free(bprm);

    out_mm:
        if (bprm->mm)
            mmput(bprm->mm);

    out_file:
        if (bprm->file) {
            allow_write_access(bprm->file);
            fput(bprm->file);
        }
    out_kfree:
        kfree(bprm);

    out_ret:
        return retval;
    }
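do_execve() relies on count() to find argc and envc, and treats a negative result as an error code. The kernel's count() is not listed here, so the following is only a user-space sketch of the same idea: walk a NULL-terminated pointer array and refuse to go past a limit. The name count_strings is hypothetical, and unlike the kernel version it reads ordinary pointers rather than user-space memory.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space analogue of the kernel's count():
 * return the number of strings in a NULL-terminated vector,
 * or -E2BIG if there are more than `max` of them. */
int count_strings(char *const vec[], int max)
{
    int n = 0;

    assert(max >= 0);
    if (vec == NULL)
        return 0;               /* no vector means no strings */

    while (vec[n] != NULL) {
        if (n >= max)
            return -E2BIG;      /* too many entries */
        n++;
    }
    return n;
}
```

Returning a negative errno value in the same integer that normally carries the count is what makes the `(retval = bprm->argc) < 0` test in do_execve() work.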

3. Loading the executable requires traversing the formats list; search_binary_handler() implements this. The code is as follows:

    int search_binary_handler(struct linux_binprm *bprm, struct pt_regs *regs)
    {
        int try, retval;
        struct linux_binfmt *fmt;
    #ifdef __alpha__
        /* handle /sbin/loader.. */
        {
            struct exec * eh = (struct exec *) bprm->buf;

            if (!bprm->loader && eh->fh.f_magic == 0x183 &&
                (eh->fh.f_flags & 0x3000) == 0x3000)
            {
                struct file * file;
                unsigned long loader;

                allow_write_access(bprm->file);
                fput(bprm->file);
                bprm->file = NULL;

                loader = bprm->vma->vm_end - sizeof(void *);

                file = open_exec("/sbin/loader");
                retval = PTR_ERR(file);
                if (IS_ERR(file))
                    return retval;

                /* Remember if the application is TASO. */
                bprm->sh_bang = eh->ah.entry < 0x100000000UL;

                bprm->file = file;
                bprm->loader = loader;
                retval = prepare_binprm(bprm);
                if (retval < 0)
                    return retval;
                /* should call search_binary_handler recursively here,
                   but it does not matter */
            }
        }
    #endif
        retval = security_bprm_check(bprm);
        if (retval)
            return retval;

        /* kernel module loader fixup */
        /* so we don't try to load run modprobe in kernel space. */
        set_fs(USER_DS);

        retval = audit_bprm(bprm);
        if (retval)
            return retval;

        retval = -ENOENT;
        /* This loop runs twice: the second pass traverses the list
           again after a module has been loaded */
        for (try = 0; try < 2; try++) {
            read_lock(&binfmt_lock);
            list_for_each_entry(fmt, &formats, lh) {
                /* The load function */
                int (*fn)(struct linux_binprm *, struct pt_regs *) = fmt->load_binary;
                if (!fn)
                    continue;
                if (!try_module_get(fmt->module))
                    continue;
                read_unlock(&binfmt_lock);
                /* Run the load function; return on success,
                   otherwise continue traversing */
                retval = fn(bprm, regs);
                /* The load succeeded */
                if (retval >= 0) {
                    put_binfmt(fmt);
                    allow_write_access(bprm->file);
                    if (bprm->file)
                        fput(bprm->file);
                    bprm->file = NULL;
                    current->did_exec = 1;
                    proc_exec_connector(current);
                    return retval;
                }
                read_lock(&binfmt_lock);
                put_binfmt(fmt);
                if (retval != -ENOEXEC || bprm->mm == NULL)
                    break;
                if (!bprm->file) {
                    read_unlock(&binfmt_lock);
                    return retval;
                }
            }
            read_unlock(&binfmt_lock);
            /* No registered format could load this executable:
               load another module and try again */
            if (retval != -ENOEXEC || bprm->mm == NULL) {
                break;
    /* CONFIG_KMOD: dynamic module loading is enabled */
    #ifdef CONFIG_KMOD
            } else {
    #define printable(c) (((c)=='\t') || ((c)=='\n') || (0x20<=(c) && (c)<=0x7e))
                if (printable(bprm->buf[0]) &&
                    printable(bprm->buf[1]) &&
                    printable(bprm->buf[2]) &&
                    printable(bprm->buf[3]))
                    break; /* -ENOEXEC */
                request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2]));
    #endif
            }
        }
        return retval;
    }
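The core of search_binary_handler() is a dispatch loop: each registered format gets a chance to load the image, and a handler that does not recognize the file answers -ENOEXEC so the next one is tried. The following is a simplified user-space sketch of that pattern, not kernel code. The struct image type, the two handlers and search_handler are hypothetical names, and locking, module loading and the second pass are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct linux_binprm. */
struct image {
    const unsigned char *buf;   /* first bytes of the file */
    size_t len;
    const char *loaded_by;      /* set by the handler that accepts it */
};

typedef int (*load_fn)(struct image *);

static int load_script(struct image *img)
{
    if (img->len < 2 || img->buf[0] != '#' || img->buf[1] != '!')
        return -ENOEXEC;        /* not a script, let others try */
    img->loaded_by = "script";
    return 0;
}

static int load_elf(struct image *img)
{
    if (img->len < 4 || memcmp(img->buf, "\177ELF", 4) != 0)
        return -ENOEXEC;        /* not ELF, let others try */
    img->loaded_by = "elf";
    return 0;
}

/* Plays the role of the kernel's formats list. */
static const load_fn formats[] = { load_script, load_elf };

int search_handler(struct image *img)
{
    int retval = -ENOENT;
    size_t i;

    assert(img != NULL && img->buf != NULL);
    for (i = 0; i < sizeof formats / sizeof formats[0]; i++) {
        retval = formats[i](img);
        if (retval >= 0)
            return retval;      /* a handler accepted the image */
        if (retval != -ENOEXEC)
            break;              /* real error: stop searching */
    }
    return retval;
}
```

Keeping "not my format" (-ENOEXEC) distinct from other errors is what lets the loop stop early on a real failure while still falling through to the next handler on a mismatch.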

4. The code for waking the parent process and laying out the stack space is as follows.

    static int load_aout_binary(struct linux_binprm * bprm, struct pt_regs * regs)
    {
        ......
        current->mm->start_stack =
            (unsigned long) create_aout_tables((char __user *) bprm->p, bprm);
    #ifdef __alpha__
        regs->gp = ex.a_gpvalue;
    #endif
        start_thread(regs, ex.a_entry, current->mm->start_stack);
        ......
    }

The create_aout_tables() code is as follows:

    static unsigned long __user *create_aout_tables(char __user *p, struct linux_binprm * bprm)
    {
        char __user * __user *argv;
        char __user * __user *envp;
        unsigned long __user *sp;
        /* Number of arguments of the executable file */
        int argc = bprm->argc;
        /* Number of environment variables */
        int envc = bprm->envc;

        /* sp is initialized from p, which is bprm->p, aligned down */
        sp = (void __user *)((-(unsigned long)sizeof(char *)) & (unsigned long) p);
    #ifdef __sparc__
        /* This imposes the proper stack alignment for a new process. */
        sp = (void __user *) (((unsigned long) sp) & ~7);
        if ((envc+argc+3)&1) --sp;
    #endif
    #ifdef __alpha__
        /* whee.. test-programs are so much fun. */
        put_user(0, --sp);
        put_user(0, --sp);
        if (bprm->loader) {
            put_user(0, --sp);
            put_user(0x3eb, --sp);
            put_user(bprm->loader, --sp);
            put_user(0x3ea, --sp);
        }
        put_user(bprm->exec, --sp);
        put_user(0x3e9, --sp);
    #endif
        sp -= envc+1;
        envp = (char __user * __user *) sp;
        sp -= argc+1;
        argv = (char __user * __user *) sp;
    #if defined(__i386__) || defined(__mc68000__) || defined(__arm__) || defined(__arch_um__)
        put_user((unsigned long) envp, --sp);
        put_user((unsigned long) argv, --sp);
    #endif
        put_user(argc, --sp);
        current->mm->arg_start = (unsigned long) p;
        while (argc-- > 0) {
            char c;
            put_user(p, argv++);
            do {
                get_user(c, p++);
            } while (c);
        }
        put_user(NULL, argv);
        current->mm->arg_end = current->mm->env_start = (unsigned long) p;
        while (envc-- > 0) {
            char c;
            put_user(p, envp++);
            do {
                get_user(c, p++);
            } while (c);
        }
        put_user(NULL, envp);
        current->mm->env_end = (unsigned long) p;
        return sp;
    }
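The table that create_aout_tables() builds can be pictured as a run of machine words: argc, then one pointer per argument, a NULL, one pointer per environment string, and a final NULL. The strings themselves are already packed back to back in memory, arguments first. The following user-space sketch builds the same sequence in an ordinary array. The name build_tables is hypothetical, and the architecture-specific extras (the alpha loader words and the i386 argv/envp pointers) are deliberately omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: `strings` holds argc argument strings followed
 * by envc environment strings, each NUL-terminated and packed back to
 * back. Fill `table` with
 *     argc, argv[0..argc-1], NULL, envp[0..envc-1], NULL
 * and return the number of words written (0 if `cap` is too small). */
size_t build_tables(const char *strings, int argc, int envc,
                    uintptr_t *table, size_t cap)
{
    size_t need, i = 0;
    const char *p = strings;
    int k;

    assert(strings != NULL && table != NULL && argc >= 0 && envc >= 0);
    need = 1 + (size_t)argc + 1 + (size_t)envc + 1;
    if (cap < need)
        return 0;

    table[i++] = (uintptr_t)argc;
    for (k = 0; k < argc; k++) {        /* one pointer per argument */
        table[i++] = (uintptr_t)p;
        p += strlen(p) + 1;             /* skip to the next string */
    }
    table[i++] = 0;                     /* argv terminator */
    for (k = 0; k < envc; k++) {        /* one pointer per env string */
        table[i++] = (uintptr_t)p;
        p += strlen(p) + 1;
    }
    table[i++] = 0;                     /* envp terminator */
    return i;
}
```

The kernel version does the same walk with put_user() and get_user() because both the table and the strings live in the new process's user-space stack, and it fills the table from a stack pointer that has already been moved down by envc+1 and argc+1 slots.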


At this point the new IP holds the program's entry address, and the remaining work is done by the start_thread() function. That process is described in another of my blog posts:

http://www.cnblogs.com/wule/p/4404504.html

III. Summary of the Linux kernel executable loading process

A parent process (such as a shell) first creates a new process with the fork() system call, and the new process then calls the execve() system call to execute the specified ELF file. The parent process waits for the new process to finish, and then waits for the user to enter the next command. The execve() system call is declared in unistd.h, and its prototype is as follows:
int execve(const char *filename, char *const argv[], char *const envp[]);
Its three parameters are the file name of the program to execute, the arguments, and the environment variables. glibc wraps the execve() system call and provides five other exec-family APIs: execl(), execlp(), execle(), execv() and execvp(). They differ only in how the arguments are passed, and all of them eventually call the execve() system call.

Calling execve() enters the kernel through sys_execve(), which checks and copies some of the arguments and then calls do_execve(). Because ELF is not the only executable format (there are also Java programs, scripts that begin with "#!", and so on), do_execve() first examines the file to be executed: it reads the first 128 bytes, in particular the magic number in the first 4 bytes, to determine the format of the executable. For a script in an interpreted language, the first two bytes "#!" form the magic number; once the kernel recognizes these two bytes, it parses the string that follows to determine the path of the interpreter.
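The "#!" check described above can be sketched in user space as follows. This is not the kernel's load_script(); the function name parse_shebang and the HEADER_SIZE constant are hypothetical, and the sketch extracts only the interpreter path, ignoring any optional argument that follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEADER_SIZE 128   /* only the first 128 bytes are examined */

/* Hypothetical sketch: if `buf` starts with "#!", copy the interpreter
 * path that follows into `interp` and return 0; otherwise return -1. */
int parse_shebang(const char *buf, size_t len, char *interp, size_t interp_size)
{
    size_t i = 2, n = 0;

    assert(buf != NULL && interp != NULL);
    if (len > HEADER_SIZE)
        len = HEADER_SIZE;
    if (interp_size == 0 || len < 2 || memcmp(buf, "#!", 2) != 0)
        return -1;                       /* magic number does not match */

    /* Skip blanks between "#!" and the path. */
    while (i < len && (buf[i] == ' ' || buf[i] == '\t'))
        i++;

    /* Copy up to the next blank, newline, or end of the header. */
    while (i < len && buf[i] != ' ' && buf[i] != '\t' &&
           buf[i] != '\n' && buf[i] != '\0' && n + 1 < interp_size)
        interp[n++] = buf[i++];
    interp[n] = '\0';

    return n > 0 ? 0 : -1;
}
```

For the header "#! /usr/bin/env python" this yields "/usr/bin/env"; a real loader would go on to execute that interpreter with the script path as an argument.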

After do_execve() has read the 128-byte file header, it calls search_binary_handler() to find a matching loading routine for the executable. Every executable format supported by Linux has a corresponding loading routine; search_binary_handler() determines the format from the magic number at the head of the file and invokes the matching routine: load_elf_binary() for ELF, load_aout_binary() for a.out, and load_script() for scripts. The main steps of the ELF loading routine are:
① Check the validity of the ELF executable format, such as the magic number and the number of segments in the program header table.
② Look for the ".interp" segment used for dynamic linking, which stores the path of the dynamic linker required by the executable, and set the dynamic linker path.
③ Map the ELF file (code, data, read-only data, and so on) according to the description in the program header table of the ELF executable.
④ Initialize the ELF process environment. For example, the EDX register at process start should hold the address of DT_FINI (the address of the termination code).
⑤ Change the return address of the system call to the entry point of the ELF executable. The entry point depends on how the program is linked: for a statically linked ELF executable it is the address given by e_entry in the ELF file header, and for a dynamically linked ELF executable it is the dynamic linker.
When load_elf_binary() has finished, it returns to do_execve(), which returns to sys_execve(). By then step ⑤ has already changed the return address of the system call to the entry address of the loaded ELF program. So when the sys_execve() system call returns from kernel mode to user mode, the EIP register jumps directly to the entry address of the ELF program, the new program begins to execute, and the loading of the ELF executable is complete.
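The two header facts that steps ② and ⑤ depend on, the e_entry field and the presence of an interpreter entry in the program header table, can be read from user space with the structures in <elf.h>. The following is a minimal sketch under stated assumptions: elf64_info is a hypothetical name, only 64-bit ELF files in the host's byte order are handled, and no further validation is attempted.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: read e_entry from a 64-bit ELF file and report
 * whether its program header table contains a PT_INTERP entry.
 * Returns 0 on success, -1 on any error or non-ELF64 input. */
int elf64_info(const char *path, unsigned long long *entry, int *has_interp)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    FILE *f;
    unsigned i;
    int rc = -1;

    assert(path != NULL && entry != NULL && has_interp != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    if (fread(&eh, sizeof eh, 1, f) != 1)
        goto out;
    /* Step 1 of the loader: check the magic number. */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        goto out;

    *entry = eh.e_entry;
    *has_interp = 0;

    /* Step 2 of the loader: look for the interpreter segment. */
    for (i = 0; i < eh.e_phnum; i++) {
        long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1)
            goto out;
        if (ph.p_type == PT_INTERP) {
            *has_interp = 1;
            break;
        }
    }
    rc = 0;
out:
    fclose(f);
    return rc;
}
```

When has_interp is set, the address the kernel first jumps to is the dynamic linker's entry point rather than the e_entry value reported here; the dynamic linker transfers control to e_entry once it has finished its own work.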
