TCP_NODELAY and TCP_CORK in Network Programming

Source: Internet
Author: User
Tags: sendfile

The TCP_NODELAY and TCP_CORK socket options both play an important role in how a network connection behaves. Many Unix systems implement TCP_NODELAY. TCP_CORK, however, is specific to Linux and relatively new; the tcp(7) manual page lists it as available since Linux 2.2, and it is present in the 2.4 kernels. Other Unix flavors offer similar functionality. In particular, the TCP_NOPUSH option on BSD-derived systems behaves much like TCP_CORK.
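A portable program might choose between the two options at compile time. The following is a minimal sketch, assuming the platform's <netinet/tcp.h> defines either TCP_CORK (Linux) or TCP_NOPUSH (BSD-derived systems). The macro HOLD_OPTION and the helper hold_partial_frames() are names invented here for illustration, and the two options are similar rather than identical in behavior.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* HOLD_OPTION is a hypothetical name, not part of any system header. */
#if defined(TCP_CORK)
#define HOLD_OPTION TCP_CORK      /* Linux */
#elif defined(TCP_NOPUSH)
#define HOLD_OPTION TCP_NOPUSH    /* BSD-derived systems */
#endif

/* Ask the kernel to hold back (on = 1) or release (on = 0) partial frames.
 * Returns 0 on success and -1 on error, like setsockopt(). */
int hold_partial_frames(int sock, int on)
{
#ifdef HOLD_OPTION
    return setsockopt(sock, IPPROTO_TCP, HOLD_OPTION, &on, sizeof(on));
#else
    (void)sock;
    (void)on;
    errno = ENOPROTOOPT;          /* neither option exists on this platform */
    return -1;
#endif
}
```

Hiding the choice behind one function keeps the rest of the program free of platform checks.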
TCP_NODELAY and TCP_CORK both control the "Nagling" of packets, that is, whether the Nagle algorithm is used to coalesce small packets into larger frames. The algorithm is named after its inventor, John Nagle, who first used it in 1984 to relieve network congestion at Ford Aerospace (for details, see IETF RFC 896). The problem he addressed, which RFC 896 calls the small-packet problem and which is often discussed together with silly window syndrome, arises when an interactive terminal application sends a packet for every keystroke. A typical packet then carries one byte of payload and a 40-byte header, a 4000% overhead that can easily congest the network. Nagling became a standard and was quickly adopted across the Internet. It is now enabled by default, but in some situations it is better to turn it off.
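The overhead figure and the coalescing rule can be made concrete with a short sketch. It assumes the simplified rule described in RFC 896: send at once if a full-size segment is ready or if no previously sent data is awaiting acknowledgment, otherwise keep buffering. The function names and parameters are illustrative and do not correspond to any kernel interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Header bytes as a percentage of payload bytes. payload_len must be nonzero. */
double header_overhead_percent(size_t payload_len, size_t header_len)
{
    return 100.0 * (double)header_len / (double)payload_len;
}

/* Simplified Nagle decision (an illustration, not kernel code).
 * queued_len:      bytes the application has written but TCP has not sent
 * max_segment_len: largest payload that fits in one segment
 * unacked_len:     bytes already sent and not yet acknowledged by the peer */
bool nagle_should_send_now(size_t queued_len, size_t max_segment_len,
                           size_t unacked_len)
{
    if (queued_len == 0)
        return false;                 /* nothing to send */
    if (queued_len >= max_segment_len)
        return true;                  /* a full-size segment is ready */
    return unacked_len == 0;          /* small data goes out only on an idle connection */
}
```

With a 1-byte payload and a 40-byte header, header_overhead_percent(1, 40) evaluates to 4000.0. With a 1460-byte payload the same header costs less than 3%, which is why coalescing keystrokes into fuller segments pays off.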
Now suppose an application issues a request to send a small chunk of data. We can either send the data immediately or wait for more data to accumulate and send it all together. Interactive and client/server applications benefit greatly from sending immediately. For example, when we send a short request and wait for a large response, the overhead is small relative to the total amount of data transferred, and the response arrives sooner if the request goes out at once. Setting the TCP_NODELAY option on the socket disables the Nagle algorithm and achieves this.
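Setting the option takes a single setsockopt() call. The sketch below wraps it, together with a getsockopt() query, in two small helpers; the helper names are invented here, and sock is assumed to be a TCP socket created elsewhere.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable (enabled = 1) or disable (enabled = 0) TCP_NODELAY on a TCP socket.
 * Enabling TCP_NODELAY turns the Nagle algorithm off for that socket.
 * Returns 0 on success and -1 on error, with errno set by setsockopt(). */
int set_tcp_nodelay(int sock, int enabled)
{
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &enabled, sizeof(enabled));
}

/* Report whether TCP_NODELAY is set: 1 = set, 0 = clear, -1 = error. */
int get_tcp_nodelay(int sock)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &value, &len) < 0)
        return -1;
    return value != 0;
}
```

A client that sends short requests would typically call set_tcp_nodelay(sock, 1) once, right after creating or connecting the socket.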
In the other case, we want to wait until enough data has accumulated to fill a maximum-size packet before sending it over the network. This mode benefits bulk data transfer; a typical example is a file server. The Nagle algorithm causes problems in this case. When sending bulk data, however, you can set the TCP_CORK option, which is set in the same way as TCP_NODELAY but has exactly the opposite effect (the two options were originally mutually exclusive; according to tcp(7), they can be combined since Linux 2.5.71). Let's take a closer look at how it works.
Suppose the application uses the sendfile() function to transfer bulk data. Application protocols usually require some information to be sent first to describe the data, namely a header. The header is typically small, and TCP_NODELAY is set on the socket, so the packet carrying the header is transmitted immediately. In some cases (depending on an internal packet counter), the sender then waits for the peer to acknowledge that packet before continuing. The bulk transfer is thus delayed, and unnecessary network traffic is exchanged.
If we instead set the TCP_CORK option on the socket (think of it as putting a cork in the pipe), the packet carrying the header is filled up with bulk data, and all data is sent in packets filled to their full size. When the bulk transfer is complete, it is advisable to clear TCP_CORK and "pull the cork" so that any remaining partial frames can go out. Uncorking is just as important as corking.
In short, if you know you will be sending several pieces of data together (such as an HTTP response header and body), we recommend setting TCP_CORK so that no delay is introduced between them. This can greatly improve the performance of WWW, FTP, and file servers, and it also simplifies your work. Sample code follows:
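#include <netinet/tcp.h>    /* TCP_CORK, SOL_TCP */
#include <stdio.h>          /* dprintf */
#include <sys/sendfile.h>   /* sendfile */
#include <sys/socket.h>     /* setsockopt */
#include <unistd.h>         /* write */

int fd, on = 1;
...
/* Create the socket and so on; omitted for brevity */
...
setsockopt(fd, SOL_TCP, TCP_CORK, &on, sizeof(on)); /* cork */
write(fd, ...);
dprintf(fd, ...); /* formatted output to a descriptor; fprintf() would need a FILE * */
sendfile(fd, ...);
write(fd, ...);
sendfile(fd, ...);
...
on = 0;
setsockopt(fd, SOL_TCP, TCP_CORK, &on, sizeof(on)); /* pull the cork */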
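A fuller version of this pattern might look like the following sketch. It is Linux-specific (TCP_CORK and sendfile() from <sys/sendfile.h>), assumes sock is an already connected TCP socket, and uses helper names (set_tcp_cork, send_header_and_file) invented here for illustration. Handling of EINTR and of non-blocking sockets is left out for brevity.

```c
#include <fcntl.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Set (on = 1) or clear (on = 0) TCP_CORK. Returns 0 on success, -1 on error. */
int set_tcp_cork(int sock, int on)
{
    return setsockopt(sock, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Hypothetical helper: send header_len bytes of header followed by the whole
 * file at path, corked so that the header and the start of the file can share
 * a packet. Returns 0 on success and -1 on error. */
int send_header_and_file(int sock, const char *header, size_t header_len,
                         const char *path)
{
    struct stat st;
    off_t offset = 0;
    int rc = -1;
    int file_fd = open(path, O_RDONLY);

    if (file_fd < 0)
        return -1;
    if (fstat(file_fd, &st) < 0 || set_tcp_cork(sock, 1) < 0) {
        close(file_fd);
        return -1;
    }

    /* While corked, the kernel holds partial frames instead of sending them. */
    while (header_len > 0) {
        ssize_t n = write(sock, header, header_len);
        if (n <= 0)
            goto uncork;
        header += n;
        header_len -= (size_t)n;
    }

    /* sendfile() advances offset by the number of bytes it sent. */
    while (offset < st.st_size) {
        ssize_t n = sendfile(sock, file_fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n <= 0)
            goto uncork;
    }
    rc = 0;

uncork:
    /* Always pull the cork so that a trailing partial frame is not held back. */
    if (set_tcp_cork(sock, 0) < 0)
        rc = -1;
    close(file_fd);
    return rc;
}
```

The cork is removed on every exit path after it has been set, including the error paths. According to tcp(7), output stays corked for at most 200 milliseconds, after which queued data is sent automatically, so the explicit uncork mainly avoids that delay for the last partial frame.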
Unfortunately, many widely used programs do not take these issues into account. For example, Sendmail, written by Eric Allman, does not set any options on its sockets.
Apache httpd, the most popular web server on the Internet, sets TCP_NODELAY on all of its sockets, and most users are satisfied with its performance. Why? The answer lies in implementation differences. BSD-derived TCP/IP stacks (notably FreeBSD) behave differently in this situation. When many small blocks of data are submitted for transmission in TCP_NODELAY mode, a large amount of information is sent with a single call to write(). However, because the counters that trigger a request for delivery acknowledgment are byte-oriented rather than packet-oriented (as they are on Linux), delays are much less likely to be introduced; the outcome depends only on the total size of the data. Linux requires an acknowledgment after the first packet arrives, whereas FreeBSD waits for several hundred packets before doing so.
On Linux, the effect of TCP_NODELAY differs considerably from what developers accustomed to the BSD TCP/IP stack expect, and Apache performs worse on Linux as a result. Other applications that use TCP_NODELAY heavily on Linux suffer from the same problem.
