Send in the Linux TCP Protocol

Source: Internet
Author: User
Tags ack

The fact that the TCP protocol itself is reliable does not mean that an application sending data over TCP is necessarily reliable. Whether or not the socket is blocking, the size returned by send does not represent the amount of data the peer has received with recv.

In blocking mode, send copies the data the application asks to send into the send buffer and returns once that copy is complete. Because of the send buffer, if the free space in the buffer is larger than the requested size, send returns immediately while the data is still being sent over the network. Otherwise, send transmits the part that does not fit in the buffer, waits for the peer's acknowledgments to free buffer space, and only then returns. (The receiving end acknowledges data as soon as it arrives in its receive buffer; it does not wait for the application to call recv.)

In non-blocking mode, send only copies the data into the protocol stack's buffer. If the free space in the buffer is not enough, it copies as much as fits and returns the number of bytes actually copied. If the buffer has no free space at all, it returns -1 and sets errno to EAGAIN.
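The non-blocking behaviour described above can be observed with a small sketch. It uses a local socketpair() as a stand-in for a TCP connection (an assumption, made so the example runs without a network), and the helper name fill_until_eagain is hypothetical. The peer never reads, so send keeps accepting data until the buffer is full and then fails.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write to a non-blocking socket whose peer never reads.
 * Returns the total number of bytes send() accepted before it failed and
 * stores the errno of the failing call in *err. Returns -1 on setup failure. */
long fill_until_eagain(int *err)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Switch the sending end to non-blocking mode. */
    int flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    char chunk[4096];
    memset(chunk, 'x', sizeof(chunk));

    long total = 0;
    for (;;) {
        ssize_t n = send(sv[0], chunk, sizeof(chunk), 0);
        if (n < 0) {            /* buffer full: expect EAGAIN */
            *err = errno;
            break;
        }
        total += n;             /* n may be smaller than sizeof(chunk) */
    }

    close(sv[0]);
    close(sv[1]);
    return total;
}
```

A caller would typically print the returned total and check that the stored errno is EAGAIN; the exact total depends on the buffer sizes of the system.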


Under Linux, sysctl -a | grep net.ipv4.tcp_wmem shows the system default send buffer sizes:
net.ipv4.tcp_wmem = 4096 16384 81920
There are three values. The first is the minimum number of bytes allocated for a socket's send buffer. The second is the default, which overrides net.core.wmem_default for TCP sockets; the buffer can grow to this value when the system is not heavily loaded. The third is the maximum number of bytes of send buffer space; it does not override net.core.wmem_max.
According to actual tests, if the value of net.ipv4.tcp_wmem is changed manually, the stack runs with the changed values; otherwise, by default, the protocol stack usually allocates memory according to net.core.wmem_default and net.core.wmem_max.
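The same three values can also be read from inside a program. The following is a minimal sketch, assuming the values are exposed at /proc/sys/net/ipv4/tcp_wmem as three whitespace-separated numbers; the function names parse_tcp_wmem and read_tcp_wmem are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse a "min default max" triple such as
 * "4096 16384 81920". Returns 0 on success, -1 on malformed input. */
int parse_tcp_wmem(const char *text, long *min, long *def, long *max)
{
    return sscanf(text, "%ld %ld %ld", min, def, max) == 3 ? 0 : -1;
}

/* Hypothetical helper: read the triple from procfs (Linux only).
 * Returns 0 on success, -1 if the file is missing or malformed. */
int read_tcp_wmem(long *min, long *def, long *max)
{
    char line[128];
    FILE *f = fopen("/proc/sys/net/ipv4/tcp_wmem", "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(line, sizeof(line), f);
    fclose(f);
    return ok != NULL ? parse_tcp_wmem(line, min, def, max) : -1;
}
```

Splitting parsing from file access keeps the parsing step usable on the sysctl output shown above as well.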

The application should adjust the send buffer size in the program to suit its own characteristics:

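/* Needs <stdio.h> and <sys/socket.h>; clientsocket is an already opened TCP socket. */
int sendbuflen = 0;   /* SO_SNDBUF takes an int, not a socklen_t */
socklen_t len = sizeof(sendbuflen);
getsockopt(clientsocket, SOL_SOCKET, SO_SNDBUF, (void *)&sendbuflen, &len);
printf("default,sendbuf:%d\n", sendbuflen);

/* Request a 10k send buffer, then read back the value actually in effect. */
sendbuflen = 10240;
setsockopt(clientsocket, SOL_SOCKET, SO_SNDBUF, (void *)&sendbuflen, len);
getsockopt(clientsocket, SOL_SOCKET, SO_SNDBUF, (void *)&sendbuflen, &len);
printf("now,sendbuf:%d\n", sendbuflen);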


Note that although the send buffer is set to 10k, the protocol stack actually doubles the value and sets it to 20k.
-------------------Case Analysis---------------

In real-world applications, if the sender is non-blocking, network congestion or a slow receiver often leads to the following situation: the sending application appears to have sent 10k of data, but only 2k has reached the peer's buffer, and 8k is still in the local send buffer (not yet sent, or sent but not yet acknowledged by the receiver). At this point the receiving application can receive 2k of data. Suppose the receiving application calls recv and gets 1k of that data to process. If one of the following situations occurs at that instant, the two sides behave as follows:

A. The sending application believes it has finished sending the 10k of data and closes the socket:
The sending host, as the side that actively closes the TCP connection, enters the half-closed FIN_WAIT1 state (waiting for the peer's ACK). The 8k of data in the send buffer is not discarded and will still be sent to the peer. If the receiving application is still calling recv, it will receive the remaining 8k of data (provided that the receiver gets the remaining 8k before the sender's FIN_WAIT1 state times out) and then learn that the peer socket has been closed (recv returns 0). At that point it should close its own socket.
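Since case A relies on the remaining data arriving before the sender's state times out, a sender that wants stronger assurance can half-close the connection and wait for the peer to close in turn. This is a common pattern rather than something the text above prescribes; the sketch below is a minimal version of it, and the name close_after_peer_eof is hypothetical. It has no timeout, which a real program would need to add.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: announce end of sending with shutdown(SHUT_WR),
 * then read and discard incoming data until the peer closes its side
 * (recv returns 0), and finally close the socket.
 * Returns 0 on a clean close, -1 on error. */
int close_after_peer_eof(int fd)
{
    char buf[1024];

    if (shutdown(fd, SHUT_WR) < 0) {
        close(fd);
        return -1;
    }
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n == 0)
            break;              /* peer has closed its sending side */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            close(fd);
            return -1;
        }
        /* n > 0: late data from the peer, discarded in this sketch */
    }
    return close(fd);
}
```

The peer seeing recv return 0 and then closing its own socket is what lets the loop finish.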

B. The sending application calls send again to send 8k of data:
If the send buffer size is 20k, the free space is 20-8=12k, which is larger than the requested 8k, so send copies the data and immediately returns 8192.

If the send buffer size is 12k, the free space is only 12-8=4k, and send() returns 4096. When the application sees that the return value is smaller than the requested size, it can conclude that the buffer is full and must wait (for example, by using select to wait for the next socket-writable notification). If the application ignores this and immediately calls send again, it gets -1, and on Linux errno is EAGAIN.
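The handling described for case B (a short return value means the buffer is full, so wait for writability before retrying) can be sketched as a loop. This version uses poll() instead of select, which is a substitution on my part; the name send_all and the timeout parameter are hypothetical, and MSG_NOSIGNAL is Linux-specific.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send exactly len bytes on a non-blocking socket.
 * Whenever the send buffer is full, wait up to timeout_ms for the socket
 * to become writable again. Returns 0 on success, -1 on error or timeout
 * (errno is ETIMEDOUT in the timeout case). */
int send_all(int fd, const char *buf, size_t len, int timeout_ms)
{
    size_t off = 0;

    while (off < len) {
        /* MSG_NOSIGNAL: report a closed peer as EPIPE instead of SIGPIPE. */
        ssize_t n = send(fd, buf + off, len - off, MSG_NOSIGNAL);
        if (n > 0) {
            off += (size_t)n;   /* partial send: continue with the rest */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;          /* real error, e.g. EPIPE or ECONNRESET */

        /* Buffer full: wait for the writable notification. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int r = poll(&pfd, 1, timeout_ms);
        if (r == 0) {
            errno = ETIMEDOUT;
            return -1;
        }
        if (r < 0 && errno != EINTR)
            return -1;
    }
    return 0;
}
```

Even when send_all returns 0, the data has only been copied into the local send buffer, which is the point of the whole analysis.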

C. The receiving application closes the socket after processing the 1k of data:
The receiving host, as the side that actively closes, enters the half-closed FIN_WAIT1 state (waiting for the peer's ACK). The sending application then gets a socket-readable notification (usually select returns with the socket marked readable), but when it reads, it finds that recv returns 0. It should then call close to close the socket (which sends its own FIN to the peer);

If the sending application does not handle this readable notification and keeps calling send, there are two cases to consider. If send is called after the sender has received an RST, send returns -1 and errno is set to ECONNRESET, indicating that the peer has disconnected. However, it is also said that the process receives a SIGPIPE signal, whose default action is to terminate the process; if the signal is ignored, send returns -1 with errno set to EPIPE (not confirmed). If send is called before the RST arrives, it works as usual;

The above describes non-blocking send. If send is a blocking call and the peer socket is closed right while it is blocked (for example, while sending a huge buffer that exceeds the send buffer), send returns the number of bytes successfully sent; if send is called again, the behavior is the same as described above.
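The SIGPIPE/EPIPE behaviour in case C can be checked locally with a small sketch. It again uses socketpair() as a stand-in for TCP (an assumption), where the error shows up on the very first send after the peer closes; over a real TCP connection the error generally appears only after the peer's RST has arrived. The name errno_after_peer_close is hypothetical, and MSG_NOSIGNAL is Linux-specific.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: close one end of a socket pair, send on the other
 * end, and return the errno reported by send(). Returns 0 if send()
 * unexpectedly succeeded and -1 on setup failure. */
int errno_after_peer_close(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    close(sv[1]);               /* the peer goes away */

    /* MSG_NOSIGNAL suppresses SIGPIPE, so the failure is visible as a
     * return value instead of terminating the process. */
    ssize_t n = send(sv[0], "x", 1, MSG_NOSIGNAL);
    int err = (n < 0) ? errno : 0;

    close(sv[0]);
    return err;
}
```

Without MSG_NOSIGNAL (and without ignoring SIGPIPE), the same send would deliver SIGPIPE, whose default action terminates the process.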

D. The network is cut at a switch or router:
After the receiving application has processed the 1k of data it received, it goes on to read the remaining 1k from the buffer, and from then on there is never any more data to read. The application has to handle this with a timeout. The usual practice is to set a maximum wait time for select; if there is still no data to read after that time, the socket is considered unusable.
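The select-with-timeout practice for the receiving side can be sketched as follows. The function name recv_with_timeout and the return code -2 for a timeout are hypothetical choices for this example.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: wait up to timeout_sec seconds for data, then read.
 * Returns the number of bytes read, 0 if the peer closed the connection,
 * -1 on error, and -2 if nothing arrived within the timeout. */
ssize_t recv_with_timeout(int fd, void *buf, size_t len, int timeout_sec)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };

    int r = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (r < 0)
        return -1;
    if (r == 0)
        return -2;              /* timed out: treat the socket as unusable */

    return recv(fd, buf, len, 0);
}
```

An application would typically close the socket and reconnect when it sees the timeout code.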

The sending application keeps sending the remaining data to the network but never receives an acknowledgment, so the free space in its buffer stays at 0. The application has to handle this case as well.

If the application does not handle this case, the TCP protocol itself can also deal with it through keepalive; see the following sysctl keys:
net.ipv4.tcp_keepalive_intvl
net.ipv4.tcp_keepalive_probes
net.ipv4.tcp_keepalive_time
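The sysctl keys are system-wide defaults, and keepalive probing only applies to sockets that enable SO_KEEPALIVE. On Linux the three values can also be set per socket. The following is a minimal sketch; the function name enable_keepalive and the example values are hypothetical, and TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific options.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn on keepalive for one TCP socket and override
 * the system defaults for it.
 *   idle_sec  : idle time before the first probe (tcp_keepalive_time)
 *   intvl_sec : interval between probes          (tcp_keepalive_intvl)
 *   probes    : unanswered probes before giving up (tcp_keepalive_probes)
 * Returns 0 on success, -1 on error. */
int enable_keepalive(int fd, int idle_sec, int intvl_sec, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_sec, sizeof(idle_sec)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_sec, sizeof(intvl_sec)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}
```

For example, enable_keepalive(fd, 60, 10, 5) would start probing after 60 seconds of idle time and give up after 5 unanswered probes sent 10 seconds apart.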
