More on Zero Copy (I): A Discussion Starter

Source: Internet
Author: User

Reprint Address: http://blog.csdn.net/linuxdrivers/article/details/7487618

First, if the reader is unfamiliar with zero copy, please refer to the links below:

Zero-Copy Technology in Linux, Part 1

http://www.ibm.com/developerworks/cn/linux/l-cn-zerocopy1/index.html

Zero-Copy Technology in Linux, Part 2

http://www.ibm.com/developerworks/cn/linux/l-cn-zerocopy2/index.html

Efficient Data Transfer Through Zero Copy

http://www.ibm.com/developerworks/cn/java/j-zerocopy/

The linked articles give a good overview of the concept and purpose of zero copy (namely, more efficient data transfer) and of how to achieve it. Here, let's take a step-by-step look at the difference between the ordinary process of sending and receiving data and the process that uses zero-copy technology, to better understand the meaning and benefits of zero copy.

I. Data Transmission: The Traditional Method

Consider the scenario of reading data from a disk file and transferring it over the network to another program (this describes the behavior of many server applications, including web servers serving static content, FTP servers, mail servers, and so on).

Figure 1

The data transfer starts at the disk file being read and ends at the network card. The whole process is as follows:

Figure 2

Step 1: The read() call triggers the first context switch, from user mode to kernel mode. Internally, a sys_read() (or equivalent) is issued to read data from the file. The DMA engine performs the first copy: it reads the file contents from disk and stores them in a buffer in kernel address space.

Step 2: The required data is copied from the read buffer to the user buffer, and the read() call returns. The return of the call triggers the second context switch, from kernel mode back to user mode. This is the second copy; the data is now stored in a buffer in the user's address space.

Step 3: The send() socket call (or a write() call) triggers the third context switch, from user mode to kernel mode. The data is copied a third time and placed once again in a kernel address space buffer. This time, however, the buffer is different: it is associated with the target socket.

Step 4: The send() call returns, causing the fourth context switch. Independently and asynchronously, the DMA engine performs the fourth copy, passing the data from the kernel buffer to the protocol engine.

Figure 3

We can see very clearly that the data is copied several times between the disk, the intermediate kernel buffers, and the user buffer, and that the transfer involves multiple context switches. Clearly, data transmission done this way is inefficient.

Zero copy improves performance by eliminating these redundant data copies. Let's take a step-by-step look at how data is transmitted using zero copy:

II. Data Transmission: The Zero-Copy Method

The links given at the beginning of the article briefly introduce several ways of implementing zero copy; here we analyze the mmap-based approach further. Looking back at the traditional data transfer, we notice that some of the copy operations are unnecessary: the application merely buffers the data and passes it back to the socket, doing nothing else with it. We can instead perform the transfer in the way shown in the following illustration:

Figure 4

Note the difference between Figure 4 and Figure 1: in Figure 4, mmap replaces the read of Figure 1. What difference does this make, and how do we realize this idea? In addition, using an intermediate kernel buffer (the read buffer in Figure 2, or the page cache in Figure 4) rather than transferring data directly to the user buffer may seem inefficient. But the intermediate kernel buffer is introduced precisely to improve performance. When used for reading, it can play the role of a read-ahead cache, holding data the application has not yet requested; this greatly improves performance when the amount of data requested is smaller than the kernel buffer. When used for writing, it allows the write to complete asynchronously. Unfortunately, this approach can itself become a performance bottleneck when the amount of data involved is much larger than the kernel buffer.
