This article was reproduced from: http://www.cnblogs.com/biyeymyhjob/archive/2012/07/20/2601655.html
A glimpse of copy-on-write (COW):
In Linux, fork() produces a child process that is an exact duplicate of the parent, but the child usually goes on to call exec right away. For efficiency, Linux therefore uses the copy-on-write (COW) technique: the parent's content is copied for the child only when the content of a segment in the process address space is about to be modified.
If the child has no physical memory of its own holding code, how does it fetch the instructions it needs to make the exec system call?
Between fork and exec, the two processes use the same physical memory: the child's text segment, data segment, and stack all point to the parent's physical pages. In other words, the two processes have separate virtual address spaces that map to the same physical memory. Physical memory is allocated for a segment of the child only when the parent or the child modifies that segment. If the modification is not caused by exec, the kernel allocates physical memory for the child's data segment and stack segment (so that each process has its own copy and neither affects the other), while the text segment continues to share the parent's physical memory, since the code is identical. If the modification is caused by exec, the child's text segment is also given separate physical memory, because the two processes now execute different code.
One detail discussed online is that after fork the kernel places the child at the front of the run queue so that the child runs first. Otherwise the parent could run first and write to shared pages, triggering copies, only for the child to call exec afterwards. Those copies would be pointless and would reduce efficiency.
COW details:
Suppose there is a parent process P1. Think of it as an entity that has both a soul and a body. Its virtual address space (represented by the corresponding data structures) has four parts: text segment, data segment, heap, and stack. The kernel allocates a physical block for each of these four parts: a text segment block, a data segment block, a heap block, and a stack block. How the kernel allocates them is its own business and is not covered in detail here.
1. P1 now calls fork() to create a child process P2.
Kernel:
(1) Copies P1's text segment, data segment, heap, and stack. Note that the contents of the copies are identical to the originals.
(2) Allocates physical blocks for these four parts of P2. P2's text segment -> P1's text segment block: no text block is actually allocated for P2, and P2's text segment simply points to P1's block. P2's data segment -> P2's own data segment block (a block is allocated for it), heap -> P2's own heap block, stack -> P2's own stack block. In the original figure, the arrows pointing from left to right represent copied content.
2. Copy-on-write: the kernel creates only the virtual address space structures for the new child process. They are copied from the parent's virtual structures, but no physical memory is allocated for the segments; the child shares the parent's physical memory. Physical memory is allocated for a segment of the child only when the parent or the child modifies that segment.
3. vfork(): this is an even more aggressive approach. The kernel does not even create a virtual address space structure for the child. The child directly shares the parent's virtual address space, and therefore naturally shares the parent's physical memory as well.
After the analysis above, I believe we have a deeper understanding of what a process is and how it presents itself layer by layer. A process is an entity with a soul and a body, so to implement it the system must create a corresponding entity for each: a soul entity and a physical entity. Both have corresponding data structures in the system, and the physical entity is what gives the process its physical meaning. The following is quoted from LKD (Linux Kernel Development):
The traditional fork() system call directly copies all resources to the newly created process. This implementation is simplistic and inefficient because much of the data it copies could have been shared, and worse, if the new process intends to execute a new image immediately, all of that copying is wasted. In Linux, fork() is implemented using copy-on-write pages. Copy-on-write is a technique that delays or altogether avoids copying data. Rather than duplicating the entire process address space, the kernel lets the parent and the child share a single copy. Data is copied only when it is written to, so that each process then has its own copy. In other words, a resource is duplicated only when it needs to be written; until then it is shared read-only. This technique defers the copying of each page in the address space until the page is actually written. If a page is never written, for example when exec() is called immediately after fork(), it never needs to be copied. The only real cost of fork() is copying the parent's page tables and creating a unique process descriptor for the child. In the common case where a process runs a new executable right after it is created, this optimization avoids copying large amounts of data that would never be used (the address space often contains tens of megabytes). The optimization matters because UNIX emphasizes fast process execution.
One thing to add: COW in Linux is not necessarily tied to exec.
PS: COW is not used only for Linux processes. For example, the C++ string class in some development environments also uses COW:
string str1 = "Hello World"; string str2 = str1;
Then execute the code:
str1[1] = 'q'; str2[1] = 'W';
After the first two statements, str1 and str2 store their data at the same address. After the contents are modified, the address of str1's data changes while str2's data stays at the original address. This is COW applied in C++. VS2005, however, does not seem to support COW.
Linux copy-on-write technology (repost)