Copy-on-write (COW) originated in UNIX systems as a replacement for the naive way of creating processes. In the naive scheme, when a fork() system call is made, the kernel copies the entire address space of the parent process as-is and assigns the copy to the child process. This is very time-consuming because it requires:
· Allocating page frames for the child process's page tables
· Allocating page frames for the child process's pages
· Initializing the child process's page tables
· Copying the parent process's pages into the corresponding pages of the child process
Creating an address space this way involves many memory accesses, consumes many CPU cycles, and completely trashes the cache contents. In most cases the work is pointless, because many child processes start by loading a new program, which discards the inherited address space entirely.
The Unix kernel (including Linux) now uses a more efficient method called copy-on-write (COW). The idea is quite simple: instead of copying the pages, the parent and the child share them. As long as the pages are shared, however, they cannot be modified. Whenever the parent or the child attempts to write to a shared page, a page fault occurs, and the kernel copies the page into a new page frame that it marks as writable. The original page remains write-protected: when the other process attempts to write to it, the kernel checks whether the writing process is now the only owner of the page, and if so, it simply marks the page as writable for that process.
1. Linux fork() uses copy-on-write
The traditional fork() system call directly copies all of the parent's resources to the newly created process. This implementation is naive and inefficient, because much of the data it copies could otherwise be shared. Even worse, if the new process immediately executes a new image, all of the copying is wasted. In Linux, fork() is implemented with copy-on-write pages. Copy-on-write is a technique that postpones or even avoids copying data. Instead of duplicating the whole address space at fork time, the kernel lets the parent and the child share a single copy. Data is copied only when it is written, and from then on each process has its own copy. In other words, a resource is duplicated only when it needs to be written; until then it is shared read-only. This defers the copying of each page in the address space until that page is actually written. If a page is never written, for example because exec() is called immediately after fork(), it never has to be copied. The only real cost of fork() is copying the parent's page tables and creating a process descriptor for the child. Since a process usually runs a new executable right after it is created, this optimization avoids copying large amounts of data that would never be used (an address space often holds tens of megabytes). The optimization matters because the UNIX philosophy encourages processes that start executing quickly.
A glimpse of COW technology:
In a Linux program, fork() produces a child process that is identical to the parent, but the child usually goes on to call exec. For efficiency, Linux therefore uses copy-on-write: the contents of a segment are copied for the child only when the contents of that segment in the process space change.
If the child has no physical memory of its own holding code, how does it fetch the instructions that execute the exec system call?
After fork and before exec, the two processes use the same physical memory. The child's code segment, data segment, and stack all point to the parent's physical memory; the two virtual address spaces are distinct, but they map to the same physical space. Physical memory is allocated for a segment of the child only when the parent or the child changes that segment. If the change is not caused by exec, the kernel allocates physical memory for the child's data segment and stack segment (so that each process has its own space and the two do not affect each other), while the code segment continues to share the parent's physical memory (the code is identical). If the change is caused by exec, the child's code segment also gets its own physical memory, because the two processes now execute different code.
A detail sometimes mentioned online is that after fork the kernel places the child at the front of the run queue so that the child runs first. If the parent ran first and wrote to shared pages, it would trigger copies that become pointless as soon as the child calls exec, which would reduce efficiency.
COW details:
Consider a parent process P1. A process is an entity that has both a soul and a body. Its virtual address space (represented by corresponding data structures) has four parts: the text segment, the data segment, the heap, and the stack. The kernel allocates physical blocks for each of the four: a text block, a data block, a heap block, and a stack block. How they are allocated is the kernel's business and is not covered here.
1. P1 now calls fork() to create a child process P2.
Kernel:
(1) Copies the four parts of P1 (text segment, data segment, heap, stack); note that their contents are identical.
(2) Allocates physical blocks for these four parts of P2. P2's text segment -> P1's text block: no text block is actually allocated for P2, and P2's text segment simply points to P1's text block. P2's data segment -> P2's own data block (a block is allocated for it), P2's heap -> P2's own heap block, P2's stack -> P2's own stack block. In the original figure, left-to-right arrows represent copied content.
2. Copy-on-write: the kernel creates only the virtual address space structures for the new child process. They are copied from the parent's, but no physical memory is allocated for the segments; the child shares the parent's physical memory. Physical memory is allocated for a segment of the child only when the parent or the child modifies that segment.
3. vfork(): this approach goes even further. The kernel does not even create virtual address space structures for the child; the child shares the parent's virtual address space directly, and therefore its physical memory as well.
After this analysis the nature of a process should be clearer. A process is an entity with a soul and a body, and to realize it the system must create both. Each is represented by data structures in the system, and the body is what gives the process its physical presence in memory.
Note: COW in Linux is not necessarily tied to exec.
PS: COW is not limited to Linux processes. For example, the C++ string class in some development environments also uses COW:
string str1 = "Hello World"; string str2 = str1;
Then execute:
str1[1] = 'q'; str2[1] = 'W';
Before these two statements run, str1 and str2 store their data at the same address. After the modification, the address of str1's data changes while str2 keeps the original address. This is COW applied in C++. VS2005, however, does not seem to support COW.
2. The fork() function
Header files
#include <unistd.h>
#include <sys/types.h>
Function prototype
pid_t fork(void);
(pid_t is a typedef for a signed integer type, declared in <sys/types.h>; on Linux it is an int.)
Return value: on success fork() returns twice: the child process gets 0 and the parent process gets the child's process ID. On failure it returns -1.
Mnemonic: the parent gets the child's PID, the child gets 0, and a failed fork() returns -1.
Sample code
#include <sys/types.h> // pid_t; optional here, since <unistd.h> also defines it
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
    // fork() returns twice, so the code below runs in two processes.
    // If the child process is created successfully:
    // 1. The parent gets the child's PID and takes branch 3.
    // 2. The child gets 0 and takes branch 2.
    pid_t pid = fork();
    if (pid < 0) { // branch 1
        fprintf(stderr, "error!\n");
        exit(1);
    } else if (0 == pid) { // branch 2
        printf("This is the child process!\n");
        fflush(stdout); // _exit() does not flush stdio buffers
        _exit(0);
    } else { // branch 3
        printf("This is the parent process! Child process id = %d\n", (int)pid);
    }
    // wait() or waitpid() could be called here to wait for the child and collect its exit status.
    exit(0);
}
Note: the sample code is for reference only. In it, the parent process may exit before the child process does. When necessary, use wait() or waitpid() to make the parent wait for the child to finish and to obtain the child's exit status.
Another feature of fork() is that all descriptors open in the parent are copied into the child. File descriptors with the same number in the parent and the child point to the same file structure in the kernel, which means the reference count of that file structure is incremented.
3. Linux fork() uses copy-on-write (details)
fork() creates a child process. It is called once but returns twice: the calling process gets the PID of the child, and the child gets 0. The order in which the two processes then run is not fixed. When fork() completes, the parent's virtual address space has been copied into the child's virtual address space, which is also how open files come to be shared. The mapping from virtual memory to physical memory, however, uses copy-on-write, with the page as the unit of operation. An object (data) used by several processes exists only once in physical memory. When one of the processes attempts to write to that region, the kernel allocates a new physical page, copies the contents of the page being written into it, and lets the write proceed on the new page. Each process can therefore modify its data without affecting the others, and a great deal of physical memory is saved.
C code
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
int main() {
    char p = 'p';
    int number = 11;
    if (fork() == 0) /* Child process */
    {
        p = 'c'; /* Child modifies its copy of the data */
        printf("p = %c, number = %d \n", p, number);
        exit(0);
    }
    /* Parent process */
    number = 14; /* Parent modifies its copy of the data */
    printf("p = %c, number = %d \n", p, number);
    exit(0);
}
$ gcc -g testwritecopytech.c -o testwritecopytech
$ ./testwritecopytech
p = p, number = 14 -----printed by the parent process
$ p = c, number = 11 -----printed by the child process
Cause analysis:
The child's attempt to write triggers the copy: when the child modifies the data, the kernel allocates a new physical page, copies the original contents into it, and the write is then applied to the new page. Modifying the new page does not change the original one, which keeps the two processes' data separate. Regions that are not modified remain shared between the processes.
The code segment of a process is essentially read-only: it is only read and executed at run time and its contents are not modified, so the parent and the child share the code segment. The data segment, BSS segment, stack segment, and so on are written at run time, which triggers copy-on-write for the affected pages and gives each process its own independent space.
File operations need special attention. A file is accessed through three linked structures: the file descriptor table, the file table, and the v-node table. The file table and the v-node table are shared by all processes, while each process has its own file descriptor table. After fork(), the child's file descriptor table has the same contents as the parent's, and matching entries refer to the same file table entries, but the two descriptor tables are independent: a change made in the child does not affect the parent's table. For example, if the child calls close() on a descriptor, only the child's entry is released. The parent's descriptor stays open, so closing a file in the child does not prevent the parent from accessing it.
Test function:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/wait.h>
int main() {
    int fd;
    char c[3];
    char *s = "Testfs";
    fd = open("foobar.txt", O_RDWR, 0);
    if (fd < 0) { perror("open"); exit(1); }
    if (fork() == 0) // Child process
    {
        fd = 1; // stdout
        write(fd, s, 6); // 6 bytes: the string without its terminating NUL
        exit(0);
    }
    // Parent process
    read(fd, c, 2);
    c[2] = '\0';
    printf("c = %s\n", c);
    exit(0);
}
Compile and run:
Shell code
$ gcc -g fileshare2.c -o fileshare2
$ ./fileshare2
c = fo -----content of foobar.txt
$ Testfs -----standard output
Cause analysis: after fork() the parent and the child hold the same value in fd. In the child, fd is then modified. fd is an ordinary variable in the process's memory, so the write triggers a copy: the kernel allocates a new physical page, copies into it the contents of the page that holds the child's fd, and then the write sets fd to 1, the standard output. The parent's fd does not change and still refers to the open file, so the parent can still operate on foobar.txt.
It is therefore important to note that the memory mappings fork() sets up are private copy-on-write mappings, not shared ones. Copy-on-write greatly reduces the demand for memory. For how copy-on-write is implemented, see page 622 of the classic "In-Depth Understanding of Computer Systems" (the Chinese edition of "Computer Systems: A Programmer's Perspective").
Linux process management: fork() and copy-on-write