Copy-on-write

Source: Internet
Author: User


COW Technology:

In Linux, fork() creates a child process that is an exact copy of the parent, but the child usually calls exec soon afterwards. For efficiency, Linux therefore uses copy-on-write (COW): the parent's contents are copied for the child only when a segment of the process address space is actually modified.

If the child has no code of its own in physical memory, where do the instructions that make the exec system call come from?

After fork and before exec, the two processes use the same physical memory. The child's code segment, data segment, and stack all point to the parent's physical memory. In other words, the two processes have separate virtual address spaces, but those address spaces map to the same physical memory. When the parent or the child modifies a segment, physical memory is allocated for the child's copy of that segment. If the modification is not caused by exec, the kernel allocates physical memory for the child's data and stack segments (each process then has its own copy, and the two no longer affect each other), while the code segment continues to share the parent's physical memory because both processes run the same code. If the modification is caused by exec, the two processes run different code, so the child's code segment is also given its own physical memory.
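The behaviour described above can be observed from user space. The following is a minimal sketch, assuming a POSIX system; the names `Report`, `ForkResult`, `counter`, and `fork_and_write` are made up for this illustration. The child writes to a variable in the data segment and reports the value and the variable's address back through a pipe. The parent should still see the old value at the same virtual address, because the child's write went to its own copy of the memory.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdint>

// Hypothetical names for this sketch; none of them come from the article.
struct Report {                 // what the child sends back through the pipe
    int value;
    std::uintptr_t address;
};

struct ForkResult {
    int parent_value;           // value the parent sees after the child has exited
    int child_value;            // value the child saw after its own write
    bool same_virtual_address;  // did both processes use the same virtual address?
};

static int counter = 1;         // lives in the data segment

ForkResult fork_and_write() {
    const ForkResult failed = {-1, -1, false};
    counter = 1;

    int fds[2];
    if (pipe(fds) != 0) return failed;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return failed;
    }

    if (pid == 0) {
        // Child: this write lands in the child's private copy of the memory.
        close(fds[0]);
        counter = 2;
        Report r = {counter, reinterpret_cast<std::uintptr_t>(&counter)};
        ssize_t n = write(fds[1], &r, sizeof r);
        _exit(n == static_cast<ssize_t>(sizeof r) ? 0 : 1);
    }

    // Parent: read the child's report, then reap the child.
    close(fds[1]);
    Report r = {0, 0};
    ssize_t n = read(fds[0], &r, sizeof r);
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    if (n != static_cast<ssize_t>(sizeof r)) return failed;

    return {counter, r.value,
            r.address == reinterpret_cast<std::uintptr_t>(&counter)};
}
```

Calling `fork_and_write()` from `main` should yield the parent's value 1, the child's value 2, and identical virtual addresses. The sketch cannot show when the kernel performs the physical copy; it only shows that the two processes stop affecting each other once one of them writes.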

After fork, the kernel places the child at the front of the run queue so that it runs first. If the parent ran first and wrote to memory, it would trigger copy-on-write, and those copies would be wasted as soon as the child called exec. Running the child first avoids this pointless copying. The run order after fork is a scheduling policy, however, and programs should not rely on it.

 

COW details:

Consider a parent process P1. As an entity, it has a "soul" and a "body". Its virtual address space (represented by corresponding data structures) has four parts: the text segment, the data segment, the heap, and the stack. The kernel allocates physical blocks for each of these four parts: a text block, a data block, a heap block, and a stack block. How the kernel allocates them is the kernel's business and is not covered here.

1. P1 now calls fork() to create a child process P2.

The kernel:

(1) Copies P1's text segment, data segment, heap, and stack for P2. Note that the contents are identical.

(2) Allocates physical blocks for these four parts of P2. Text segment -> P1's text block: no text block is actually allocated for P2, and P2's text segment simply points to P1's text block. Data segment -> P2's own data block (a block is allocated for it). Heap -> P2's own heap block. Stack -> P2's own stack block. (Figure omitted: arrows from left to right indicate copying of content.)

 

2. Copy-on-write: the kernel creates only the virtual address space structures for the new child. These structures are copied from the parent's, but no physical memory is allocated for the segments; they share the parent's physical memory. Only when the parent or the child modifies a segment is physical memory allocated for the child's copy of that segment.
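The idea can be modelled in user space. The following is a toy sketch, not how the kernel is implemented: `Page`, `AddressSpace`, and `fork_copy` are hypothetical names, a `std::vector` stands in for the page table, and a reference count check stands in for the write fault the kernel would take on a read-only shared page.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical model: a "page" is a small fixed-size buffer.
using Page = std::array<char, 16>;

// A toy address space. The vector plays the role of the page table and each
// shared_ptr plays the role of a mapping to a physical page.
class AddressSpace {
public:
    explicit AddressSpace(std::size_t pages) {
        for (std::size_t i = 0; i < pages; ++i)
            table_.push_back(std::make_shared<Page>());  // zero-filled page
    }

    // "fork": only the table is copied; every page stays shared.
    AddressSpace fork_copy() const { return *this; }

    char read(std::size_t page, std::size_t offset) const {
        return (*table_[page])[offset];
    }

    // Write: if the page is still shared, give this address space a private
    // copy first, then modify that copy.
    void write(std::size_t page, std::size_t offset, char value) {
        std::shared_ptr<Page>& entry = table_[page];
        if (entry.use_count() > 1)
            entry = std::make_shared<Page>(*entry);
        (*entry)[offset] = value;
    }

    bool shares_page_with(const AddressSpace& other, std::size_t page) const {
        return table_[page] == other.table_[page];
    }

private:
    std::vector<std::shared_ptr<Page>> table_;
};
```

After `AddressSpace child = parent.fork_copy();` both objects share every page. A `child.write(0, 0, 'c')` copies only page 0, and all other pages remain shared.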

 

 

3. vfork(): this approach is even more aggressive. The kernel does not even create a virtual address space structure for the child. The child directly shares the parent's virtual address space and, as a consequence, the parent's physical memory as well.
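Because the vfork() child borrows the parent's address space, and the parent is suspended until the child calls exec or _exit, the child should do nothing else. The following is a minimal sketch of that pattern, assuming a POSIX system where `/bin/sh` exists; `run_with_vfork` is a hypothetical helper name.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `sh -c <command>` in a vfork()ed child and
// returns the child's exit status, or -1 on failure.
int run_with_vfork(const char* command) {
    pid_t pid = vfork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // Child: it is running inside the parent's address space, so it
        // must not modify any data; it only calls exec or _exit.
        execl("/bin/sh", "sh", "-c", command, static_cast<char*>(nullptr));
        _exit(127);  // reached only if exec failed
    }

    // Parent: resumes once the child has called exec or _exit.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_with_vfork("exit 3")` should return 3. Note that `_exit` is used rather than `exit`, so that the child does not run the parent's exit handlers or flush its buffers.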

 

With the analysis above, the structure of a process should be clearer. How are its layers reflected in the system? If a process is an entity with a soul and a body, the system must create a corresponding entity for each in order to implement it: a soul entity and a body (physical) entity. Both are represented by data structures in the system, and the physical entity carries the process's physical meaning.

The following is from LKD (Linux Kernel Development):

The traditional fork() system call copies all of the parent's resources directly to the newly created process. This implementation is simplistic and inefficient because much of the data it copies could have been shared instead. Worse, if the new process immediately executes a new image, all of the copies are thrown away. In Linux, fork() is implemented with copy-on-write pages. Copy-on-write is a technique that delays or even avoids copying data. Instead of duplicating the entire process address space, the kernel lets the parent and child share a single copy. Data is copied only when it needs to be written, at which point each process gets its own copy. In other words, resources are duplicated only when they are written; until then they are shared read-only. This delays the copying of each page in the address space until it is actually written. If a page is never written, for example because exec() is called immediately after fork(), it never needs to be copied. The only real overhead of fork() is copying the parent's page tables and creating a unique process descriptor for the child. Because a process usually runs a new executable right after it is created, this optimization avoids copying large amounts of data that would never be used (an address space often contains tens of megabytes). Since Unix emphasizes fast process execution, this optimization is very important.

One more point: COW and exec in Linux are not inherently tied to each other.
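The common case described above, fork() followed immediately by exec(), can be sketched as follows. This is an illustration under assumptions: a POSIX system where `/bin/sh` exists, and a hypothetical helper named `fork_then_exec`. The parent fills a large buffer before forking, and the child never writes to it, so with copy-on-write none of those pages should need to be copied. The program cannot observe this directly; it only shows the call pattern.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstddef>
#include <vector>

// Hypothetical helper: the parent owns `parent_bytes` of initialized memory,
// forks, and the child immediately execs `sh -c <command>`.
// Returns the child's exit status, or -1 on failure.
int fork_then_exec(std::size_t parent_bytes, const char* command) {
    // Fill the buffer so the parent really has this much memory in use.
    std::vector<char> big(parent_bytes, 'x');

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // Child: it never writes to `big`; exec replaces the whole
        // address space with the new program image.
        execl("/bin/sh", "sh", "-c", command, static_cast<char*>(nullptr));
        _exit(127);  // reached only if exec failed
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;

    // The parent's buffer is unaffected by anything the child did.
    if (!big.empty() && big.front() != 'x') return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `fork_then_exec(8u * 1024 * 1024, "exit 5")` should return 5.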

 

 

PS: COW is not used only for Linux processes. The C++ string class also uses COW in some compiler environments (standard library implementations), for example:

string str1 = "hello world";
string str2 = str1;

Then run the following code:

str1[1] = 'q';
str2[1] = 'w';

After the first two statements, str1 and str2 store their data at the same address. After the modifications, the address of str1's data has changed, while str2's data is still at the original address. This is COW applied in C++. VS2005, however, does not appear to support COW.
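Whether `std::string` behaves this way depends on the standard library; since C++11 the standard's requirements effectively rule out a COW `std::string`, so the effect may not be reproducible with a current compiler. The following sketch reproduces the described behaviour with a hypothetical `CowString` class built on `std::shared_ptr`. It illustrates the technique and is not the implementation used by any particular library.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical copy-on-write string wrapper for illustration only.
class CowString {
public:
    explicit CowString(std::string s)
        : data_(std::make_shared<std::string>(std::move(s))) {}

    // The implicit copy constructor copies only the shared_ptr, so a copy
    // shares the same underlying buffer until one side writes.

    char get(std::size_t i) const { return (*data_)[i]; }

    // Write access: detach from the shared buffer first if anyone else
    // still refers to it, then modify the private copy.
    void set(std::size_t i, char c) {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
        (*data_)[i] = c;
    }

    const char* address() const { return data_->data(); }
    const std::string& str() const { return *data_; }

private:
    std::shared_ptr<std::string> data_;
};
```

With `CowString str1("hello world"); CowString str2 = str1;`, both objects report the same `address()`. After `str1.set(1, 'q')` and `str2.set(1, 'w')`, str1's data lives at a new address while str2 still uses the original one, which matches the description above. The sketch is not thread-safe.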

 

 

 

 
