Implementation of TCP ACK and temporary storage of out-of-order packets in Linux: immediate ACK, delayed ACK, and piggybacked ACK

Source: Internet
Author: User

TCP requires acknowledgments, but for efficiency it does not wait for an ACK after every piece of data sent; wherever possible it uses the window mechanism to accumulate acknowledgments. In some special circumstances, however, an ACK still has to be sent immediately. For example, when out-of-order data is received, the receiver can buffer the out-of-order packets temporarily, but it must immediately send the sender an ACK carrying the sequence number it expects next. Likewise, when the receiver needs to adjust its receive window, it must send an ACK immediately. In all other cases the ACK can be delayed. Let's take a look at the Linux code in this regard:
static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
{
    struct tcp_sock *tp = tcp_sk(sk);

    // rcv_mss is the estimated MSS of the peer; it also matters a great deal when computing the current receive window.
    if (((tp->rcv_nxt - tp->rcv_wup) > inet_csk(sk)->icsk_ack.rcv_mss   // more than one full packet received: send one ACK that acknowledges two packets at once
         && __tcp_select_window(sk) >= tp->rcv_wnd) ||                  // and the newly computed window is not smaller than the advertised one, so throughput can be maximized
        tcp_in_quickack_mode(sk) ||                                     // the connection is in quick-ACK mode
        (ofo_possible &&                                                // out-of-order packets have been received
         skb_peek(&tp->out_of_order_queue))) {
        tcp_send_ack(sk);
    } else {
        tcp_send_delayed_ack(sk);
    }
}
In general, the function above implements the ACK recommendations of RFC 1122 and RFC 2581.
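The decision made by that function can be modelled outside the kernel. The following is only a minimal sketch, not kernel code: the struct `rx_state`, its fields, and the function `ack_now` are hypothetical names introduced for illustration, and the quick-ACK state and the newly computed window are passed in as plain values instead of being derived the way the kernel derives them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified receiver state; not the kernel's struct tcp_sock. */
struct rx_state {
    uint32_t rcv_nxt;     /* next sequence number expected */
    uint32_t rcv_wup;     /* value of rcv_nxt when the last window update was sent */
    uint32_t rcv_wnd;     /* window currently advertised */
    uint32_t rcv_mss;     /* estimated MSS of the peer */
    uint32_t new_wnd;     /* window that would be advertised now (assumed input) */
    bool quickack;        /* connection is in quick-ACK mode */
    bool ofo_pending;     /* the out-of-order queue is not empty */
};

/* Returns true if an ACK should go out immediately, false if it may be delayed. */
bool ack_now(const struct rx_state *rx, bool ofo_possible)
{
    assert(rx->rcv_mss > 0);

    /* More than one full-sized packet is unacknowledged and the window does not shrink. */
    bool full_frames = (uint32_t)(rx->rcv_nxt - rx->rcv_wup) > rx->rcv_mss &&
                       rx->new_wnd >= rx->rcv_wnd;

    return full_frames || rx->quickack || (ofo_possible && rx->ofo_pending);
}
```

With 3000 unacknowledged bytes, an rcv_mss of 1460, and an unchanged window, `ack_now` returns true; with only 1000 unacknowledged bytes and no other condition set, it returns false and the ACK would be delayed.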
Can't the ACK simply be piggybacked on outgoing data? Indeed, an ACK is carried whenever data is sent, but data is sent only when the application layer has something to send. What if the application layer sends nothing? Would it then be impossible to acknowledge anything? For this reason the transport layer must have its own mechanism for sending ACKs. Piggybacking is only a supplement; the ACKs that the transport layer sends by itself come in immediate mode and delayed mode, as shown in __tcp_ack_snd_check above.
Sending an ACK is actually very simple: a TCP segment is filled in with the ACK field set to the sequence number of the next byte expected, that is, the last in-order byte received plus 1, which is the left edge of the receive window. A delayed ACK does not risk duplicating a piggybacked one. RFC 2581 requires that no more than one ACK be generated for each incoming segment, so a duplicate ACK is only ever sent in response to a further incoming segment, such as a retransmission by the sender. If TCP is waiting to send a delayed ACK and the receiver then has data of its own to send, the ACK is carried to the sender along with that data and the pending flag of the delayed-ACK timer is cleared, so that no ACK is sent when the timer expires.
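The interplay between the delayed-ACK timer and piggybacking described above can be sketched as a tiny state machine. This is an illustration under stated assumptions, not the kernel's implementation: `ack_state` and the three `on_*` functions are hypothetical names, and timers are reduced to an explicit function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one connection's ACK state. */
struct ack_state {
    uint32_t rcv_nxt;       /* next byte expected; the value placed in the ACK field */
    bool delack_pending;    /* a delayed ACK is scheduled */
    unsigned acks_sent;     /* segments sent that carried an ACK (for illustration) */
};

/* In-order data of len bytes arrived: advance rcv_nxt and schedule a delayed ACK. */
void on_data_received(struct ack_state *s, uint32_t len)
{
    assert(len > 0);
    s->rcv_nxt += len;
    s->delack_pending = true;
}

/* The application sends data: the segment carries ACK = rcv_nxt, so the
 * pending delayed ACK is cancelled. Returns the ACK number carried. */
uint32_t on_data_sent(struct ack_state *s)
{
    s->delack_pending = false;
    s->acks_sent++;
    return s->rcv_nxt;
}

/* The delayed-ACK timer fired: send a pure ACK only if one is still pending.
 * Returns true if an ACK was sent. */
bool on_delack_timer(struct ack_state *s)
{
    if (!s->delack_pending)
        return false;
    s->delack_pending = false;
    s->acks_sent++;
    return true;
}
```

If data is sent between the arrival of a packet and the expiry of the timer, `on_delack_timer` finds nothing pending and sends nothing, which is the behaviour the paragraph above describes.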
In an ideal, stable situation the receiver's window is stable too and needs no adjustment. If the receiver only receives data and sends none, almost all ACKs go back to the sender in delayed mode; if the receiver also sends data, the ACKs are piggybacked on that data. An ACK is sent immediately only in a few special cases.
The first is when more than one full TCP segment has been received and the window can possibly be enlarged. To maximize throughput, the enlarged window must not be wasted, so an ACK has to be sent to the sender immediately; on receiving it, the sender can go on sending the data that follows its send window. The sender's MSS is tied to the size of the receiver's window: the receiver's window is set to an integer multiple of the sender's MSS, which gives the best memory utilization. Once the receiver has determined the window it can offer, if that window is not smaller than the current one, it sends an ACK immediately so that the sender can transmit data as soon as possible.
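The rounding to a whole number of MSS-sized segments can be written down in a few lines. This is only a sketch of that one idea: `window_in_mss_units` and `window_not_shrinking` are hypothetical helpers, and the kernel's __tcp_select_window applies further rules that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round the free receive-buffer space down to an integer multiple of the MSS. */
uint32_t window_in_mss_units(uint32_t free_space, uint32_t mss)
{
    assert(mss > 0);
    return (free_space / mss) * mss;
}

/* True if the window that could be offered now is not smaller than the
 * window currently advertised (the second half of the first condition). */
bool window_not_shrinking(uint32_t free_space, uint32_t mss, uint32_t cur_wnd)
{
    return window_in_mss_units(free_space, mss) >= cur_wnd;
}
```

For example, 10000 bytes of free space with an MSS of 1460 gives a window of 6 segments, that is 8760 bytes; the remaining 1240 bytes are not advertised.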
The second is the so-called quick mode. Quick mode is not used often. Only non-interactive TCP connections can enter it, because on an interactive connection ACKs already flow back fast enough and there is no need to send them immediately; normally an ACK is either piggybacked on data or delayed. How can we tell whether a connection is interactive? In the kernel, the tcp_opt struct contains an ack sub-struct with two fields, quick and pingpong; pingpong is used to mark an interactive connection. The kernel decides in many places, based on many parameters such as the interval between sends and receives or user configuration, whether a connection is interactive. If it is not, two problems can arise: 1. the user process does not fetch the received data for a long time, so the protocol stack has to reply with ACKs immediately on its own; 2. ACKs pile up without being sent, which lowers the sender's sending rate. In that case a certain number is assigned to quick. Each ACK sent consumes part of this quick budget, and once quick is used up the connection enters delayed mode. The value of quick is related to the window, because the size of the receive window is the only thing the receiver can go by.
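The quick budget can be pictured as a counter that is filled from the window size and drained by every ACK. The sketch below is loosely modelled on that description and is not the kernel source: `ack_mode`, `enter_quickack`, `in_quickack_mode`, and `ack_sent` are hypothetical names, and both the formula (one quick ACK per two segments' worth of window) and the cap of 16 are assumptions of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_QUICKACKS 16u   /* cap chosen for this sketch (assumption) */

struct ack_mode {
    uint32_t quick;     /* remaining quick-ACK budget */
    bool pingpong;      /* true if the connection is judged interactive */
};

/* Grant a budget derived from the receive window and mark the
 * connection as non-interactive. */
void enter_quickack(struct ack_mode *m, uint32_t rcv_wnd, uint32_t rcv_mss)
{
    assert(rcv_mss > 0);
    uint32_t n = rcv_wnd / (2 * rcv_mss);
    if (n == 0)
        n = 2;                    /* always allow a couple of quick ACKs */
    if (n > MAX_QUICKACKS)
        n = MAX_QUICKACKS;
    if (n > m->quick)
        m->quick = n;             /* never reduce an existing budget */
    m->pingpong = false;
}

/* Quick mode holds while budget remains and the connection is not interactive. */
bool in_quickack_mode(const struct ack_mode *m)
{
    return m->quick > 0 && !m->pingpong;
}

/* Each ACK sent consumes one unit; at zero the connection falls back to delayed mode. */
void ack_sent(struct ack_mode *m)
{
    if (m->quick > 0)
        m->quick--;
}
```

With a 14600-byte window and an MSS of 1460 this grants five quick ACKs; after five calls to `ack_sent` the connection is back in delayed mode.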
The last case in which an ACK is sent immediately is the arrival of out-of-order packets, which suggests that data may have been lost. Recovery should then start as soon as possible, that is, fast retransmit should be triggered as early as possible, so the ACK must be sent immediately. (If the kernel finds that a packet lies outside the receive window, which is not the same as being out of order, it calls tcp_send_dupack to send an ACK and then returns.) On receiving an out-of-order packet, the kernel stores it in the out_of_order_queue and finally calls tcp_ack_snd_check to send an ACK. This ACK acknowledges the last in-order packet and has therefore already been sent once, so the receiver emits one duplicate ACK for the out-of-order packet. If the next packet received is again out of order, the same ACK is sent once more, and the sender may by then have received three identical ACKs from the receiver. When the third packet received is still out of order, the duplicate ACK is sent yet again, and only when this fourth ACK arrives does the sender perform fast retransmit. The detail worth noting is that the trigger for fast retransmit is four identical ACKs, that is, three duplicate ACKs. Linux implements it this way in line with the RFC recommendation, and there is a line of reasoning behind the choice.
A single packet has no order; the notion of order needs at least two packets. It is like byte order: UTF-8 uses a single byte as its code unit, so it has no byte-order problem. In the same way, when only one packet has arrived, it cannot be called out of order relative to the data already in order. Only when a second packet arrives, and the in-order data, the first packet, and the second packet still cannot be arranged into a contiguous sequence, can the packets be said to be out of order. Of course, this is also a trade-off, much like the question of why the three-way handshake takes three steps: even when the receiver gets a third out-of-order packet, the gap may still be filled by a fourth, putting everything back in order. But the sender cannot wait forever; if it required a large number of duplicate ACKs, the retransmission would no longer be fast, so three duplicate ACKs were chosen. Of course, this number can be configured.
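Seen from the sender, the rule "four identical ACKs, of which three are duplicates" amounts to a small counter. The following is a simplified sketch and not the kernel's congestion-control code: `tx_state` and `on_ack` are hypothetical names, the threshold is an ordinary field so that it can be configured, and sequence-number wrap-around and stale ACKs are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sender-side state for duplicate-ACK counting. */
struct tx_state {
    uint32_t last_ack;      /* most recent ACK number seen */
    unsigned dup_acks;      /* duplicates of last_ack seen so far */
    unsigned dup_thresh;    /* duplicates needed to trigger fast retransmit */
};

/* Process one incoming ACK. Returns true exactly when the duplicate
 * count reaches the threshold, i.e. when fast retransmit should start. */
bool on_ack(struct tx_state *tx, uint32_t ack)
{
    assert(tx->dup_thresh > 0);

    if (ack != tx->last_ack) {      /* a different ACK number: reset the count */
        tx->last_ack = ack;
        tx->dup_acks = 0;
        return false;
    }
    tx->dup_acks++;                 /* same ACK number again: one more duplicate */
    return tx->dup_acks == tx->dup_thresh;
}
```

With `dup_thresh` set to 3, the first ACK for a given number and the two duplicates after it return false, and the fourth identical ACK returns true.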
Finally, let's look at how out-of-order packets are put back in order. In the Linux protocol stack, out-of-order packets are inserted into a queue sorted by sequence number; there is one such queue per connection. If this out-of-order queue holds packets, then every time a packet is received the stack tries to call tcp_ofo_queue. Its purpose is to restore order among the out-of-order packets whenever it can. As with the idea behind duplicate ACKs described above, every new packet may fill the gap between the in-order data and the out-of-order packets. In other words, each new packet may be appended directly to the tail of the in-order receive queue, and may at the same time allow one or more packets at the head of the out-of-order queue to be spliced on, forming one run of in-order packets:
static void tcp_ofo_queue(struct sock *sk)
{
    struct tcp_opt *tp = tcp_sk(sk);
    __u32 dsack_high = tp->rcv_nxt;   // used by the D-SACK handling, which this excerpt omits
    struct sk_buff *skb;

    while ((skb = skb_peek(&tp->out_of_order_queue)) != NULL) {
        if (after(TCP_SKB_CB(skb)->seq, tp->rcv_nxt))   // the first skb cannot be spliced yet: a gap remains
            break;
        if (!after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt)) {
            __skb_unlink(skb, skb->list);   // data that has already been received: drop it and continue
            __kfree_skb(skb);
            continue;
        }
        __skb_unlink(skb, skb->list);   // this skb can be spliced: move it over and update the rcv_nxt field of tp
        __skb_queue_tail(&sk->sk_receive_queue, skb);
        tp->rcv_nxt = TCP_SKB_CB(skb)->end_seq;
    }
}
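The same splice logic can be tried out in user space. The sketch below is a self-contained model and not kernel code: the `seg` and `rx` structs, the fixed-size array standing in for the skb queue, and the functions `seq_after`, `ofo_insert`, `ofo_drain`, and `on_in_order` are all hypothetical. It keeps the queue sorted by sequence number on insertion and, whenever in-order data arrives, drains every queued segment that has become contiguous with rcv_nxt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define OFO_MAX 16   /* queue capacity chosen for this sketch */

struct seg { uint32_t seq, end_seq; };   /* covers bytes [seq, end_seq) */

struct rx {
    uint32_t rcv_nxt;            /* next sequence number expected */
    struct seg ofo[OFO_MAX];     /* out-of-order queue, sorted by seq */
    int n;                       /* number of queued segments */
};

/* Wrap-safe test that sequence number a comes after b. */
bool seq_after(uint32_t a, uint32_t b) { return (int32_t)(b - a) < 0; }

/* Insert a segment in sequence order. Returns false if the queue is full. */
bool ofo_insert(struct rx *r, struct seg s)
{
    int i = r->n;
    if (r->n == OFO_MAX)
        return false;
    while (i > 0 && seq_after(r->ofo[i - 1].seq, s.seq)) {
        r->ofo[i] = r->ofo[i - 1];   /* shift later segments up by one slot */
        i--;
    }
    r->ofo[i] = s;
    r->n++;
    return true;
}

/* Splice every queued segment that is now contiguous with rcv_nxt. */
void ofo_drain(struct rx *r)
{
    int used = 0;
    while (used < r->n && !seq_after(r->ofo[used].seq, r->rcv_nxt)) {
        if (seq_after(r->ofo[used].end_seq, r->rcv_nxt))
            r->rcv_nxt = r->ofo[used].end_seq;   /* segment delivers new bytes */
        used++;                                  /* otherwise it is a duplicate and is dropped */
    }
    memmove(r->ofo, r->ofo + used, (size_t)(r->n - used) * sizeof r->ofo[0]);
    r->n -= used;
}

/* In-order data arrived: advance rcv_nxt, then try to drain the queue. */
void on_in_order(struct rx *r, struct seg s)
{
    assert(s.seq == r->rcv_nxt);
    r->rcv_nxt = s.end_seq;
    ofo_drain(r);
}
```

For instance, with rcv_nxt at 1000 and the segments [3000, 4000) and [2000, 3000) queued in that arrival order, the arrival of [1000, 2000) advances rcv_nxt to 4000 in one step and leaves the queue empty.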
