LAB6: Analyzing the process of creating a new process for the Linux kernel

Li Junfeng + Original work; please credit the source when reproducing + "Linux Kernel Analysis" MOOC course

I. Experimental principles

1. Definition of the process

A process is an operating-system concept. Whenever we run a program, the operating system creates a process for it, and resources are allocated and released over the lifetime of that process. A process can be thought of as one execution of a program.

2. The difference between a process and a program

A program is static: it is an ordered set of instructions stored on disk, with no notion of execution.

A process is dynamic: it is a program in execution, and its life cycle covers creation, scheduling, and termination.

3. Representation of processes in Linux

In Linux, a process is described by a structure called task_struct; every process corresponds to one task_struct. The structure records everything about the process. Let's take a look at its core fields.

    struct task_struct {
        /* Run state of the process: -1 unrunnable, 0 runnable, >0 stopped. */
        volatile long state;

        /*
         * Status flags of the process, for example:
         *   0x00000002  the process is being created (PF_STARTING)
         *   0x00000004  the process is preparing to exit (PF_EXITING)
         *   0x00000040  the process was forked but has not called exec (PF_FORKNOEXEC)
         *   0x00000400  the process was killed by a signal sent by another process (PF_SIGNALED)
         */
        unsigned int flags;

        /* Real-time priority of the process. */
        unsigned int rt_priority;

        /* Describes the memory usage (address space) of the process. */
        struct mm_struct *mm;

        /* Process ID, the unique identifier of the process. */
        pid_t pid;

        /* Thread group ID, shared by all threads of the same process. */
        pid_t tgid;

        /* real_parent is the "biological father": the process that created this one. */
        struct task_struct *real_parent;

        /* parent is the current parent, which may be a "stepfather". */
        struct task_struct *parent;

        /* List of this process's children; every child's descriptor can be reached from it. */
        struct list_head children;

        /* List of this process's siblings, i.e. all children of its parent. */
        struct list_head sibling;

        /*
         * Descriptor of the main thread. A thread is represented by a process
         * descriptor because Linux has no separate structure for threads: it
         * represents a thread with a process descriptor and applies some
         * special handling to it.
         */
        struct task_struct *group_leader;

        /* List of the threads that belong to this process. */
        struct list_head thread_group;

        /* CPU time used: utime is time spent in user mode, stime in kernel mode. */
        cputime_t utime, stime;

        /* Name of the process, at most 15 characters because TASK_COMM_LEN is 16. */
        char comm[TASK_COMM_LEN];

        /* Information about open files. */
        struct files_struct *files;

        /* Signal-related information. */
        struct signal_struct *signal;
        struct sighand_struct *sighand;
    };

task_struct is very large. We do not need to understand every field; it is enough to focus on the more important ones. From the analysis above, a process has at least the following:

(1) Process ID (PID). Like a person's ID number, it is different for every process and is its unique identifier.

(2) Process state, which tells whether the process is running, waiting, stopped, or dead:

    A. Running: the process is either running right now or ready to run.
    B. Waiting: the process is waiting for an event to occur or for a system resource.
    C. Stopped: the process has been stopped, usually because it received a signal.
    D. Dead (zombie): the process has terminated but still occupies a task_struct.

(3) Priority and time slice. Processes with different priorities are scheduled differently; typically, high-priority processes run first. The time slice determines how long the process may run on the processor.

(4) Virtual memory. Most processes have some virtual memory (kernel threads do not have a user address space), and Linux must track how that memory is mapped to physical memory.

(5) Processor-related context. A process can be thought of as the sum of the current state of the system. Whenever a process runs, it uses the processor's registers, stack, and so on; this is the context of the process. Whenever a process is suspended, all of its CPU-related context must be saved in its task_struct. When the scheduler restarts the process, its context is restored from there.

4. Files in a Linux process

Each process in Linux has two data structures that describe file-related information.

The first is fs_struct, which contains the process's current working directory, its root directory, and its umask. The umask is the mask applied to the permission bits of newly created files; it can be changed with a system call.

The second is files_struct, which contains information about all the files the process is using. Each open file is described by a file structure. Its f_mode field describes the mode in which the file was opened: read-only, read-write, or write-only. f_pos holds the position in the file where the next read or write will occur. f_inode points to the VFS inode of the file, and f_op is a pointer to a vector of routines, one for each operation that can be applied to the file.

Each time a file is opened, one of the free file pointers in files_struct is used to point to the new file structure. A Linux process starts with three file descriptors already open: standard input, standard output, and standard error. They are typically inherited from the parent process that created it. All file access goes through standard system calls that take or return a file descriptor. These descriptors are indexes into the process's fd vector, so standard input, standard output, and standard error correspond to file descriptors 0, 1, and 2, respectively.

5. Virtual memory in a process

In Linux, when we run a binary executable, the operating system creates a process. Loading all the code and data of the executable into physical memory at that point would be wasteful, because they are never all used at the same time. As the number of processes grows, the waste multiplies and the system runs very inefficiently.

Instead, Linux uses a technique called demand paging: data is loaded into physical memory only when the process actually uses the corresponding virtual memory. The code and data are therefore not loaded up front. The kernel only modifies the page tables of the process, marking the virtual pages as existing but not present in memory. When the process accesses such code or data, the hardware raises a page fault and hands control to the Linux kernel. For every memory area in the process address space, Linux therefore needs to know where that virtual memory comes from and how to load it into memory to resolve the fault.

When a process allocates virtual memory, Linux does not actually reserve physical memory for it. It simply creates a new vm_area_struct describing the virtual memory and links it into the process's list of virtual memory areas. A page fault occurs when the process first writes to an address in the newly allocated area: the processor tries to translate the virtual address, finds no page table entry for it, gives up, and raises a page fault exception for the kernel to handle. Linux checks whether the referenced virtual address lies in the address space of the current process. If it does, Linux creates the appropriate PTE and allocates a page of physical memory for the process. The code or data may need to be read into physical memory from the file system or from swap space on disk. The process is then restarted at the instruction that caused the page fault and, because the memory is now present, it continues to execute. If the address is not in the process's address space, the result is the "segmentation fault" that we often see.

II. Experimental steps

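1. Add the following code to test.c:

    pid_t fpid;
    printf("going to create a process.....\n");
    /* Invoke system call number 2 (fork on 32-bit x86) through int $0x80. */
    asm volatile(
        "mov $0x2, %%eax\n\t"   /* system call number goes in eax */
        "int $0x80\n\t"         /* trap into the kernel */
        "mov %%eax, %0\n\t"     /* the return value comes back in eax */
        : "=m"(fpid)
        :
        : "eax", "memory"       /* registers and memory the asm modifies */
    );
    printf("Have created a process\n");
    if (fpid < 0)
    {
        printf("Error in fork!\n");
    }
    else if (fpid == 0)
    {
        printf("I am child, process ID: %d.\n", getpid());
    }
    else
    {
        printf("I am parent, process ID: %d.\n", getpid());
    }
    return 0;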

2. Recompile and run the system.

3. Run the fork command and observe the result.

4. Debug the operating system with GDB.

5. Set breakpoints at the relevant kernel functions.

6. Continue the program under the debugger and observe how it runs.

III. Experimental summary

My understanding of the assignment topics is as follows:

  • Read and understand the task_struct data structure;

    1. state: running state
    2. stack: kernel stack
    3. tasks: process list
    4. mm: memory management
    5. task_state: state of the task
    6. pid: process ID
    7. real_parent, children: parent-child relationships between processes
    8. files: open file descriptor table
    9. signal: signal handling
    10. splice_pipe: pipe related
  • Analyze how the kernel handles sys_clone for the fork function, and understand how a new process is created and how its task_struct data structure is created and modified;

    1. fork, vfork, and clone are three system calls that can create a new process; all of them are implemented by calling do_fork.
    2. Creating a new process first requires copying a PCB, the task_struct.
    3. A new kernel stack is allocated for the new process:
      ti = alloc_thread_info_node(tsk, node); tsk->stack = ti; setup_thread_stack(tsk, orig); // this copies only the thread_info, not the kernel stack
    4. The copied process data, such as the PID and the process list, is then modified; see the body of copy_process.
    5. Seen from user-space code, fork() returns twice: once in the parent and once in the child.
      *childregs = *current_pt_regs(); // copy the kernel stack
      childregs->ax = 0; // this is why fork returns 0 in the child
      p->thread.sp = (unsigned long) childregs; // top of the child's kernel stack
      p->thread.ip = (unsigned long) ret_from_fork; // address of the first instruction the child executes when it is scheduled
    6. do_fork does most of the work of creation: it calls copy_process() and then lets the new process start running. copy_process() works as follows:

      • 1, Call dup_task_struct() to create a kernel stack, a thread_info structure, and a task_struct for the new process. Their values are the same as those of the current process.
      • 2, Check.
      • 3, The child begins to differentiate itself from the parent. Many members of the process descriptor are cleared to 0 or set to their initial values.
      • 4, The child's state is set to TASK_UNINTERRUPTIBLE to ensure that it does not run yet.
      • 5, copy_process() calls copy_flags() to update the flags member of the task_struct. The PF_SUPERPRIV flag, which indicates whether the process has used superuser privileges, is cleared to 0. The PF_FORKNOEXEC flag, which indicates that the process has not called exec(), is set.
      • 6, Call alloc_pid() to assign a valid PID to the new process.
      • 7, Depending on the flags passed to clone(), copy_process() copies or shares open files, file system information, signal handlers, the process address space, namespaces, and so on.
      • 8, Finally, copy_process() does the finishing work and returns a pointer to the child process.
  • Use GDB to trace sys_clone, the kernel handler for a fork system call, to verify your understanding of how Linux creates a new process;

    1. Breakpoints are set on these functions: sys_clone, do_fork, dup_task_struct, copy_struct, copy_process, copy_thread, ret_from_fork.
    2. dup_task_struct() creates a kernel stack for the new process.
    3. copy_process() mainly initializes the process data structures and the various resources.
  • Pay special attention to where the new process starts executing. Why can it continue smoothly from there? In other words, how is the execution starting point kept consistent with the kernel stack?

      1. ret_from_fork is the address of the first instruction of the new process.
      2. It is set by the statement p->thread.ip = (unsigned long) ret_from_fork; in copy_thread().
      3. Before that, the statement *childregs = *current_pt_regs(); in copy_thread() copies the parent's saved registers (regs) onto the child's kernel stack.
      4. *childregs has type pt_regs and holds the registers that SAVE_ALL pushed onto the stack, so the child can continue smoothly after RESTORE_ALL.

This experiment is mainly about debugging the fork system call. It is not very difficult, but there is a lot to remember. I hope I can understand all of it. Keep going! (*^__^*)
