Linux0.11 Kernel--fork Process Analysis

Last Update:2016-06-19 Source: Internet

Author: User


It is said that some Android applications fork a child process to keep the application from being killed. Presumably the principle is that when a child process is killed, a signal is sent to the parent process. We will not dig into that here.

First, fork() is a system call. Its handler is declared in sys.h:

extern int sys_fork();  // Create a process. (kernel/system_call.s, 208)
// System call function pointer table. Used as a jump table by the system call interrupt handler (int 0x80).
fn_ptr sys_call_table[] = { sys_setup, sys_exit, sys_fork, ... };

The previous article analyzed the system call mechanism in detail. In main.c:

static inline _syscall0(int, fork)

__NR_fork is 2. The _syscall0 macro issues interrupt 0x80 with this number, which selects the sys_fork entry of the array above. In system_call.s:

#### sys_fork() call, used to create a child process; it is system_call function 2. The prototype is in include/linux/sys.h.
# First call the C function find_empty_process() to get a process number (pid). A negative return value means the
# task array is full. Then call copy_process() to copy the process.
.align 2
_sys_fork:
    call _find_empty_process    # call find_empty_process() (kernel/fork.c, 135)
    testl %eax,%eax
    js 1f
    push %gs
    pushl %esi
    pushl %edi
    pushl %ebp
    pushl %eax
    call _copy_process          # call the C function copy_process() (kernel/fork.c, 68)
    addl $20,%esp               # discard everything pushed here
1:  ret
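The number-to-handler dispatch described above can be sketched in user-space C. This is only an illustration: the demo_ handlers and their return values are hypothetical stand-ins, and only the ordering of the first three table entries is taken from the article.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_ptr)(void);

/* Hypothetical stand-ins for the real handlers; each returns a marker value. */
static int demo_setup(void) { return 100; }
static int demo_exit(void)  { return 101; }
static int demo_fork(void)  { return 102; }

/* Same ordering as the first three entries of sys_call_table. */
static fn_ptr demo_call_table[] = { demo_setup, demo_exit, demo_fork };

#define DEMO_NR_FORK 2  /* mirrors __NR_fork */

/* Dispatch by number, as the int 0x80 handler does with the value in eax. */
int demo_dispatch(unsigned int nr)
{
    size_t count = sizeof(demo_call_table) / sizeof(demo_call_table[0]);

    if (nr >= count)
        return -1;                      /* unknown system call number */
    assert(demo_call_table[nr] != NULL);
    return demo_call_table[nr]();
}
```

Calling demo_dispatch(DEMO_NR_FORK) reaches demo_fork because the number is nothing more than an index into the table.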

sys_fork first calls find_empty_process to find an unused slot in the task array. In fork.c:

// Get a process number last_pid that is not already in use, and return the task number (index into the task array).
int find_empty_process(void)
{
    int i;

repeat:
    // If last_pid leaves the positive range after being incremented, start reusing pid numbers from 1.
    if ((++last_pid) < 0)
        last_pid = 1;
    // Search the task array to see whether the pid just chosen is already used by any task. If so, pick another pid.
    for (i = 0; i < NR_TASKS; i++)
        if (task[i] && task[i]->pid == last_pid)
            goto repeat;
    // Find a free entry in the task array for the new task and return its index.
    // last_pid is a global variable and does not need to be returned.
    for (i = 1; i < NR_TASKS; i++)    // Task 0 is excluded.
        if (!task[i])
            return i;
    // If all 64 entries in the task array are occupied, return an error code.
    return -EAGAIN;
}
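To see the two-phase search in isolation, here is a minimal user-space sketch of the same logic. All demo_ names and the value of DEMO_EAGAIN are assumptions made for this example. Note also that standard C does not define signed overflow, so the wraparound test is kept only to mirror the kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_TASKS 64
#define DEMO_EAGAIN   11   /* illustrative error number */

struct demo_task { long pid; };

static struct demo_task *demo_tasks[DEMO_NR_TASKS];
static long demo_last_pid = 0;

/* Phase 1: pick a pid no live task uses. Phase 2: pick a free slot (slot 0 is reserved). */
int demo_find_empty_process(void)
{
    int i;

repeat:
    if ((++demo_last_pid) < 0)
        demo_last_pid = 1;
    for (i = 0; i < DEMO_NR_TASKS; i++)
        if (demo_tasks[i] && demo_tasks[i]->pid == demo_last_pid)
            goto repeat;
    for (i = 1; i < DEMO_NR_TASKS; i++)
        if (!demo_tasks[i])
            return i;
    return -DEMO_EAGAIN;
}
```

The pid and the slot index are chosen independently: the pid travels through the global, the slot through the return value.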

find_empty_process is easy to understand. Its return value is left in eax; if it is negative, sys_fork returns immediately. Otherwise sys_fork pushes several registers, which become arguments of copy_process, also in fork.c:

/*
 * OK, this is the main fork routine. It copies the system process information (task[nr])
 * and sets up the necessary registers. It also copies the data segment in its entirety.
 */
// Copy the process. nr is the task array index allocated by find_empty_process(). none is the
// return address pushed on the stack when system_call.s calls through sys_call_table.
int copy_process(int nr, long ebp, long edi, long esi, long gs, long none,
        long ebx, long ecx, long edx,
        long fs, long es, long ds,
        long eip, long cs, long eflags, long esp, long ss)
{
    struct task_struct *p;
    int i;
    struct file *f;

    p = (struct task_struct *) get_free_page();  // Allocate memory for the new task data structure.
    if (!p)                                      // If the allocation fails, return an error code.
        return -EAGAIN;
    task[nr] = p;            // Store the new task structure pointer in the task array.
                             // nr is the task number returned earlier by find_empty_process().
    *p = *current;           /* NOTE! this doesn't copy the supervisor stack */
                             // (only the contents of the current task structure are copied).
    p->state = TASK_UNINTERRUPTIBLE;  // Put the new process in the uninterruptible wait state for now.
    p->pid = last_pid;                // New process number, obtained by find_empty_process().
    p->father = current->pid;         // Set the parent process number.
    p->counter = p->priority;
    p->signal = 0;                    // Clear the signal bitmap.
    p->alarm = 0;                     // Alarm timer value (tick count).
    p->leader = 0;                    /* process leadership doesn't inherit */
    p->utime = p->stime = 0;          // Initialize user-mode and kernel-mode time.
    p->cutime = p->cstime = 0;        // Initialize the children's user-mode and kernel-mode time.
    p->start_time = jiffies;          // Current tick count.
    // The following sets up the data required by the task state segment (TSS).
    p->tss.back_link = 0;
    // The task structure p was given one new page of memory, so esp0 points exactly at the top of
    // that page. ss0:esp0 is used as the stack when the process runs in kernel mode.
    p->tss.esp0 = PAGE_SIZE + (long) p;   // Kernel-mode stack pointer.
    p->tss.ss0 = 0x10;                    // Stack segment selector (same as the kernel data segment).
    p->tss.eip = eip;                     // Instruction pointer.
    p->tss.eflags = eflags;               // Flags register.
    p->tss.eax = 0;                       // This is why fork() returns 0 in the new process.
    p->tss.ecx = ecx;
    p->tss.edx = edx;
    p->tss.ebx = ebx;
    p->tss.esp = esp;                     // The new process copies the parent's stack contents completely,
    p->tss.ebp = ebp;                     // so the stack of task 0 is required to be "clean".
    p->tss.esi = esi;
    p->tss.edi = edi;
    p->tss.es = es & 0xffff;              // Only 16 bits of a segment register are valid.
    p->tss.cs = cs & 0xffff;
    p->tss.ss = ss & 0xffff;
    p->tss.ds = ds & 0xffff;
    p->tss.fs = fs & 0xffff;
    p->tss.gs = gs & 0xffff;
    p->tss.ldt = _LDT(nr);                // Selector of the new task's LDT descriptor (the descriptor is in the GDT).
    p->tss.trace_bitmap = 0x80000000;     // (The high 16 bits are significant.)
    // If the current task used the coprocessor, save its context. The instruction clts clears the
    // Task Switched (TS) flag in control register CR0. The CPU sets this flag on every task switch.
    // It is used to manage the math coprocessor: if the flag is set, every ESC instruction is trapped,
    // and if the coprocessor-present flag is also set, WAIT instructions are trapped as well. So if a
    // task switch happens after an ESC instruction has started, the coprocessor contents may need to be
    // saved before the next ESC instruction runs. The trap handler saves the coprocessor contents and
    // resets TS. fnsave stores the whole coprocessor state in the destination operand (tss.i387).
    if (last_task_used_math == current)
        __asm__("clts ; fnsave %0"::"m" (p->tss.i387));
    // Set the code and data segment base and limit for the new task and copy the page tables. On error
    // (non-zero return), clear the task array entry and free the page allocated for the new task.
    if (copy_mem(nr, p)) {
        task[nr] = NULL;
        free_page((long) p);
        return -EAGAIN;
    }
    // For every file open in the parent process, increase that file's reference count by 1.
    for (i = 0; i < NR_OPEN; i++)
        if (f = p->filp[i])
            f->f_count++;
    // Increase the reference counts of the current (parent) process's pwd, root and executable by 1.
    if (current->pwd)
        current->pwd->i_count++;
    if (current->root)
        current->root->i_count++;
    if (current->executable)
        current->executable->i_count++;
    // Set the TSS and LDT descriptor entries for the new task in the GDT, taking the data from the
    // task structure. The task register tr is loaded automatically by the CPU on a task switch.
    set_tss_desc(gdt + (nr << 1) + FIRST_TSS_ENTRY, &(p->tss));
    set_ldt_desc(gdt + (nr << 1) + FIRST_LDT_ENTRY, &(p->ldt));
    p->state = TASK_RUNNING;    /* do this last, just in case */
    return last_pid;            // Return the new process number (pid).
}

One question is worth noting: why does copy_process take so many parameters when sys_fork pushes only 5 registers? Under the system call mechanism, system_call runs before sys_fork and has already pushed a set of registers onto the stack. Together with the return address of the call into sys_fork and the registers the CPU itself pushed for int 0x80, these correspond to the remaining parameters.
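One way to picture this correspondence is to write the 17 parameters as a struct in ascending stack-address order (the first C argument sits at the lowest address, so it was pushed last). The sketch below is an illustration only; the struct, the enum and the helper are hypothetical names, and the grouping reflects the push order in Linux 0.11's system_call.s as I understand it.

```c
#include <assert.h>
#include <stddef.h>

/* copy_process() arguments in ascending stack-address order. */
struct demo_fork_frame {
    long nr, ebp, edi, esi, gs;       /* pushed by sys_fork (nr is the eax value)  */
    long none;                        /* return address of the call into sys_fork  */
    long ebx, ecx, edx, fs, es, ds;   /* pushed by system_call                     */
    long eip, cs, eflags, esp, ss;    /* pushed by the CPU on int 0x80             */
};

enum demo_pusher { DEMO_SYS_FORK, DEMO_CALL_INSN, DEMO_SYSTEM_CALL, DEMO_CPU };

/* Map a byte offset inside the frame to whoever pushed that slot. */
enum demo_pusher demo_who_pushed(size_t offset)
{
    assert(offset < sizeof(struct demo_fork_frame));
    if (offset < offsetof(struct demo_fork_frame, none))
        return DEMO_SYS_FORK;
    if (offset < offsetof(struct demo_fork_frame, ebx))
        return DEMO_CALL_INSN;
    if (offset < offsetof(struct demo_fork_frame, eip))
        return DEMO_SYSTEM_CALL;
    return DEMO_CPU;
}
```

On the 32-bit kernel each slot is 4 bytes, so the five slots pushed by sys_fork account for the 20 bytes that addl $20,%esp discards after copy_process returns.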

copy_process first allocates memory for the new task's data structure (note: for the data structure, not for the task itself). get_free_page will be analyzed in a later article on memory management; fork is closely tied to the memory management code in memory.c. For now it is enough to know that this function obtains one free page in the main memory area and returns its address.

The next part is easy to understand: the process descriptor of the current process is copied into the new task, and individual fields are then reassigned. Note that p->father = current->pid makes the current process the parent of the new task.

Next, esp0 is set to point to the top of the newly allocated page, and ss0 is set to the kernel data segment selector. Because the base address in the kernel data segment descriptor is 0, ss0:esp0 serves as the stack when the process executes in kernel mode.
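The layout this produces (task structure at the bottom of the page, kernel stack growing down from the top) can be sketched with plain address arithmetic. The names below are hypothetical, the 4096-byte page size is the usual x86 value, and the tiny demo_task struct only stands in for the much larger task_struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u

/* Hypothetical, much smaller stand-in for struct task_struct. */
struct demo_task { long state; long pid; };

/* Initial kernel stack pointer for a task structure stored at task_addr. */
uintptr_t demo_kernel_stack_top(uintptr_t task_addr)
{
    assert(task_addr % DEMO_PAGE_SIZE == 0);   /* page-aligned, as a page allocator would return */
    return task_addr + DEMO_PAGE_SIZE;
}

/* Bytes the stack can use before it would run into the task structure below it. */
size_t demo_kernel_stack_room(void)
{
    return DEMO_PAGE_SIZE - sizeof(struct demo_task);
}
```

Because the stack and the structure share one page, a kernel stack that grows too deep would overwrite the task structure itself.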

Next, p->tss.ldt = _LDT(nr); sets the LDT selector, that is, the selector of this task's LDT descriptor in the GDT.

The most critical function is copy_mem:
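A short sketch may help relate the selector to the GDT index used later in copy_process (gdt + (nr << 1) + FIRST_LDT_ENTRY). The entry numbers 4 and 5 are what I recall from Linux 0.11's include/linux/sched.h; they are not shown in this article, so treat them as an assumption. The demo_ function names are hypothetical.

```c
#include <assert.h>

/* Assumed values of FIRST_TSS_ENTRY and FIRST_LDT_ENTRY (from sched.h, not from this article). */
#define DEMO_FIRST_TSS_ENTRY 4
#define DEMO_FIRST_LDT_ENTRY 5

/* GDT index of task nr's TSS and LDT descriptors: each task owns two consecutive entries. */
unsigned long demo_tss_index(unsigned long nr) { return (nr << 1) + DEMO_FIRST_TSS_ENTRY; }
unsigned long demo_ldt_index(unsigned long nr) { return (nr << 1) + DEMO_FIRST_LDT_ENTRY; }

/* A GDT selector with RPL 0 is the index shifted left by 3, since each descriptor is 8 bytes. */
unsigned long demo_tss_selector(unsigned long nr) { return demo_tss_index(nr) << 3; }
unsigned long demo_ldt_selector(unsigned long nr) { return demo_ldt_index(nr) << 3; }
```

Under these assumptions task 0 gets LDT selector 0x28 and each further task adds 0x10, because two 8-byte descriptors are consumed per task.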

// Set the code and data segment base and limit of the new task, and copy the page tables.
// nr is the new task number; p is a pointer to the new task data structure.
int copy_mem(int nr, struct task_struct *p)
{
    unsigned long old_data_base, new_data_base, data_limit;
    unsigned long old_code_base, new_code_base, code_limit;

    // Get the segment limits (in bytes) from the descriptors in the current process's local descriptor table.
    code_limit = get_limit(0x0f);    // Limit of the code segment descriptor in the LDT.
    data_limit = get_limit(0x17);    // Limit of the data segment descriptor in the LDT.
    // Get the base addresses of the current process's code and data segments in the linear address space.
    old_code_base = get_base(current->ldt[1]);    // Original code segment base.
    old_data_base = get_base(current->ldt[2]);    // Original data segment base.
    if (old_data_base != old_code_base)           // Version 0.11 does not support separate code and data segments.
        panic("We don't support separate I&D");
    if (data_limit < code_limit)                  // A data segment shorter than the code segment is also an error.
        panic("Bad data_limit");
    // The base address of the new process in the linear address space is 64MB * its task number.
    new_data_base = new_code_base = nr * 0x4000000;    // New base = task number * 64MB (task size).
    p->start_code = new_code_base;
    // Set the base addresses in the descriptors of the new process's local descriptor table.
    set_base(p->ldt[1], new_code_base);    // Base address field of the code segment descriptor.
    set_base(p->ldt[2], new_data_base);    // Base address field of the data segment descriptor.
    // Set up the page directory entries and page table entries of the new process, so that the linear
    // address pages of the new process map to the actual physical memory pages.
    if (copy_page_tables(old_data_base, new_data_base, data_limit)) {    // Copy the code and data segments.
        free_page_tables(new_data_base, data_limit);    // On error, release the memory that was requested.
        return -ENOMEM;
    }
    return 0;
}

copy_mem first reads the segment limits from the code and data segment descriptors in the local descriptor table (LDT). get_limit is defined in sched.h:

// Get the segment limit of the segment selected by the selector segment.
// %0 - the segment length (in bytes); %1 - the segment selector.
#define get_limit(segment) ({ \
    unsigned long __limit; \
    __asm__("lsll %1,%0\n\tincl %0" : "=r" (__limit) : "r" (segment)); \
    __limit; })
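The incl after lsll is there because a segment limit is the last valid offset, not the length. As a hedged illustration of what the instruction pair computes, the sketch below derives the byte length from a descriptor's 20-bit limit field and its granularity bit. The function name and the example values in the usage note are my own and do not come from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Byte length of a segment, given its 20-bit limit field and granularity (G) bit. */
uint64_t demo_segment_length(uint32_t limit_field, int granularity)
{
    uint32_t last_offset;

    assert(limit_field <= 0xFFFFFu);
    if (granularity)
        last_offset = (limit_field << 12) | 0xFFFu;  /* limit counted in 4 KB pages */
    else
        last_offset = limit_field;                   /* limit counted in bytes */
    return (uint64_t)last_offset + 1;                /* the incl step: last offset + 1 = length */
}
```

For example, a page-granular limit field of 0x3FFF yields 0x4000000 bytes, which is exactly one 64MB task slot.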

The process descriptor structure contains:

struct desc_struct ldt[3];  // Local descriptor table of this task. 0 - empty, 1 - code segment (cs), 2 - data and stack segment (ds & ss).

This is the task's LDT itself. Entry 0 is empty, entry 1 (bytes 8-15 of the table) describes the code segment, and entry 2 (bytes 16-23) describes the data segment. The arguments 0x0f and 0x17 are segment selectors. A selector is laid out as (index << 3) | (TI << 2) | RPL, so 0x0f selects LDT entry 1 and 0x17 selects LDT entry 2, both with TI = 1 (use the LDT) and RPL = 3 (user privilege level). The lsl instruction then loads the limit of the selected segment.
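The values 0x0f and 0x17 are easier to read once split into the three fields of an x86 segment selector: bits 15-3 are the descriptor index, bit 2 is the table indicator (TI), and bits 1-0 are the requested privilege level (RPL). A minimal decoding sketch follows; the struct and function names are hypothetical.

```c
#include <assert.h>

struct demo_selector {
    unsigned index;  /* descriptor index within the table */
    unsigned ti;     /* table indicator: 0 = GDT, 1 = LDT */
    unsigned rpl;    /* requested privilege level, 0..3   */
};

/* Split a 16-bit segment selector into its three fields. */
struct demo_selector demo_decode_selector(unsigned sel)
{
    struct demo_selector s;

    assert(sel <= 0xFFFFu);
    s.index = sel >> 3;
    s.ti = (sel >> 2) & 1u;
    s.rpl = sel & 3u;
    return s;
}
```

Decoding 0x0f gives index 1 in the LDT at privilege level 3, and 0x17 gives index 2 in the LDT at privilege level 3. The 0x10 stored in tss.ss0 decodes to index 2 in the GDT at privilege level 0.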

Next, copy_mem reads the base address of the code segment from the current process's LDT:

// Get the base address from the descriptor at address addr. This is the exact opposite of _set_base().
// edx - the resulting base address (__base); %1 - addr + 2; %2 - addr + 4; %3 - addr + 7.
#define _get_base(addr) ({ \
    unsigned long __base; \
    __asm__("movb %3,%%dh\n\t"    /* [addr+7], base bits 31-24 -> dh */ \
            "movb %2,%%dl\n\t"    /* [addr+4], base bits 23-16 -> dl */ \
            "shll $16,%%edx\n\t"  /* move the high 16 bits of the base into the high half of edx */ \
            "movw %1,%%dx"        /* [addr+2], base bits 15-0 -> dx */ \
            : "=d" (__base)       /* edx now holds the 32-bit base address */ \
            : "m" (*((addr) + 2)), \
              "m" (*((addr) + 4)), \
              "m" (*((addr) + 7))); \
    __base; })
// Get the base address from the segment descriptor ldt in the local descriptor table.
#define get_base(ldt) _get_base(((char *) &(ldt)))

current->ldt[1] is the content of the code segment descriptor in the current process's LDT, so it is easy to see that the base address is extracted from the contents of that descriptor.
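The same extraction can be written in portable C, which makes the scattered layout of the base field explicit: bytes 2 and 3 hold bits 15-0, byte 4 holds bits 23-16, and byte 7 holds bits 31-24. The following is only an illustrative sketch with hypothetical names, not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Reassemble the 32-bit base from an 8-byte segment descriptor. */
uint32_t demo_get_base(const uint8_t desc[8])
{
    return (uint32_t)desc[2]             /* base bits 7..0   */
         | ((uint32_t)desc[3] << 8)      /* base bits 15..8  */
         | ((uint32_t)desc[4] << 16)     /* base bits 23..16 */
         | ((uint32_t)desc[7] << 24);    /* base bits 31..24 */
}

/* Inverse operation: scatter a base back into the same bytes, leaving the others untouched. */
void demo_set_base(uint8_t desc[8], uint32_t base)
{
    desc[2] = (uint8_t)(base & 0xFFu);
    desc[3] = (uint8_t)((base >> 8) & 0xFFu);
    desc[4] = (uint8_t)((base >> 16) & 0xFFu);
    desc[7] = (uint8_t)((base >> 24) & 0xFFu);
}
```

Bytes 0, 1, 5 and 6 carry the limit and attribute bits, which is why neither function touches them.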

Next, the base address of the new process in the linear address space is set. Linus gives each program (process) 64MB of virtual address space, so the new base is the task number * 64MB.
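The arithmetic is simple enough to check by hand, and it also shows why 64 task slots of 64MB each fill the 32-bit linear address space exactly. The sketch below uses hypothetical names; the constants 64 and 0x4000000 are the ones quoted in the article's code.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_TASKS  64u
#define DEMO_TASK_SIZE 0x4000000u   /* 64 MB per task */

/* Linear base address of task slot nr. */
uint32_t demo_task_base(unsigned nr)
{
    assert(nr < DEMO_NR_TASKS);
    return nr * DEMO_TASK_SIZE;
}

/* Which task slot owns a given linear address. */
unsigned demo_task_of_address(uint32_t linear)
{
    return linear / DEMO_TASK_SIZE;
}
```

Task 1 starts at 0x04000000 and task 63 at 0xFC000000, so the last slot ends exactly at the 4GB boundary.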

Then the base address is written into the segment descriptors in the new process's LDT; the principle is similar.

copy_page_tables and free_page_tables will be explained in a later article.

At the very end, the TSS and LDT descriptor entries of the new task are set in the GDT, as explained in the article on process scheduling initialization.

Finally, the new process number is returned.
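From user space, the two return paths look like this: the parent receives the value returned by copy_process (the new pid), while the child starts with eax = 0 from its TSS. The sketch below uses the standard POSIX fork, waitpid and _exit calls on a modern system rather than Linux 0.11; the function name and the exit code 42 are arbitrary choices for the example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_CHILD_EXIT_CODE 42   /* arbitrary marker chosen for this sketch */

/* Returns 1 if the child took the "fork() == 0" path and the parent saw the child's pid. */
int demo_fork_return_values(void)
{
    pid_t pid = fork();
    int status = 0;

    if (pid < 0)
        return -1;                       /* fork failed, e.g. with EAGAIN */
    if (pid == 0)
        _exit(DEMO_CHILD_EXIT_CODE);     /* child path: fork() returned 0 */

    /* Parent path: fork() returned the child's pid, so we can wait for it. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == DEMO_CHILD_EXIT_CODE;
}
```

Both processes resume at the same instruction after the system call; only the value in eax tells them apart.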

This concludes the analysis of the fork function.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
