One important field in the TCP header is the 16-bit window size. It appears in every TCP segment and works together with the 32-bit acknowledgment number to tell the peer how large the receive window of the local socket is. For example, if the local socket sends a TCP segment whose 32-bit acknowledgment number is 5 and whose window size is 5840, it tells the peer that the four bytes it sent have been received and acknowledged, and that the local socket can now accept up to 5840 more bytes starting from byte 5. This is flow control implemented by the receiver: by telling the sender how much data it can accept, the receiver controls the sender's transmission rate.
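The example above can be written out as a small calculation. This is only a sketch: the struct and function names are hypothetical, and it assumes the window field is used unscaled.

```c
#include <stdint.h>

/* Hypothetical helper type, not a kernel structure. */
struct adv_range {
    uint32_t first; /* first sequence number the receiver will accept */
    uint32_t last;  /* last sequence number the receiver will accept */
};

/*
 * ack_seq: the 32-bit acknowledgment number from the TCP header.
 * window:  the 16-bit window field (assumed unscaled, and non-zero).
 * Unsigned arithmetic wraps modulo 2^32, as sequence numbers do.
 */
struct adv_range advertised_range(uint32_t ack_seq, uint16_t window)
{
    struct adv_range r;

    r.first = ack_seq;
    r.last = ack_seq + window - 1;
    return r;
}
```

With an acknowledgment number of 5 and a window of 5840, the peer may send sequence numbers 5 through 5844.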
The structure struct tcp_sock has many members related to the sliding window protocol. Note that the sliding window here refers to the receive window of the local socket.
The member window_clamp is the maximum value of the sliding window; the window can never grow beyond it. It is initialized when the TCP connection is established and is set to the largest 16-bit integer shifted left by the window scale factor. The reason is that the window is carried in a 16-bit field in the TCP header: if window_clamp were any larger, the window could not be expressed in the header.
The member rx_opt is a struct tcp_options_received. Two of its members, snd_wscale and rcv_wscale, hold respectively the window scale factor advertised by the peer (which the local side must honor when it sends data) and the scale factor of the local receive window. snd_wscale is taken from the first SYN received from the peer. rcv_wscale is initialized when the local socket establishes a connection. It is chosen so that the largest 16-bit integer shifted left by rcv_wscale bits is at least as large as the maximum size of the receive buffer. In the protocol stack, the maximum receive buffer size is given by the global variable mysysctl_rmem_max, whose value is 256*(256 + sizeof(struct sk_buff)), that is, 107520. However, sysctl_tcp_rmem[2] (the third element of the array) gives a larger upper limit for the receive buffer, 174760. When the latter is used, rcv_wscale is almost always 2, so window_clamp is 65535 << 2 = 262140. window_clamp therefore exceeds the maximum receive buffer size, but this does not matter, because the receive buffer size is taken into account whenever the sliding window grows.
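The choice of rcv_wscale and window_clamp described above can be sketched as follows. The function names are hypothetical, and the cap of 14 on the scale factor is an assumption taken from the TCP window scale option (RFC 1323), not from the text.

```c
#include <stdint.h>

/* Assumed upper bound on the scale factor (RFC 1323); not from the text. */
#define SKETCH_MAX_WSCALE 14u

/*
 * Hypothetical helper: smallest scale factor such that
 * 65535 << wscale is at least max_rcv_buf.
 */
unsigned int pick_rcv_wscale(uint32_t max_rcv_buf)
{
    unsigned int wscale = 0;

    while (((uint32_t)65535 << wscale) < max_rcv_buf &&
           wscale < SKETCH_MAX_WSCALE)
        wscale++;
    return wscale;
}

/* Largest window that fits the 16-bit header field for a given scale. */
uint32_t window_clamp_for(unsigned int wscale)
{
    return (uint32_t)65535 << wscale;
}
```

For a maximum receive buffer of 174760 bytes this yields a scale factor of 2 and a clamp of 262140, matching the numbers in the text.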
rcv_wnd is the current size of the receive window. It changes as data arrives from the peer. Its initial value is the smaller of 3/4 of the receive buffer size and max_tcp_window, which is defined as 32767u in the system. It is then adjusted according to the MSS: if the MSS is greater than 3*1460 and rcv_wnd is greater than twice the MSS, rcv_wnd is set to twice the MSS; otherwise, if the MSS is greater than 1460 and rcv_wnd is greater than three times the MSS, rcv_wnd is set to three times the MSS; otherwise, if rcv_wnd is greater than four times the MSS, rcv_wnd is set to four times the MSS. In our experiment the MSS is 1448 (because the TCP header carries a 12-byte timestamp option), so rcv_wnd is adjusted to 1448*4 = 5792.
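The initialization rule above can be traced with a short sketch. The function name and the macro are hypothetical; only the constants 32767, 1460, and the 2x/3x/4x factors come from the text.

```c
#include <stdint.h>

/* The value the text gives for max_tcp_window. */
#define SKETCH_MAX_TCP_WINDOW 32767u

/* Hypothetical helper: initial receive window for a buffer size and MSS. */
uint32_t initial_rcv_wnd(uint32_t rcv_buf, uint32_t mss)
{
    uint32_t wnd = rcv_buf - rcv_buf / 4; /* 3/4 of the receive buffer */
    uint32_t factor;

    if (wnd > SKETCH_MAX_TCP_WINDOW)
        wnd = SKETCH_MAX_TCP_WINDOW;

    /* The larger the MSS, the fewer segments the initial window covers. */
    if (mss > 3 * 1460)
        factor = 2;
    else if (mss > 1460)
        factor = 3;
    else
        factor = 4;

    if (wnd > factor * mss)
        wnd = factor * mss;
    return wnd;
}
```

With an MSS of 1448 and a receive buffer large enough to hit the 32767 cap, the result is 4*1448 = 5792; with an MSS of 1460 it is 5840.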
rcv_ssthresh is a threshold for the current receive window size; its initial value is rcv_wnd, and it works together with rcv_wnd. When the local socket receives a segment and certain conditions are met, rcv_ssthresh is increased. The next time a segment is sent and its TCP header is built, rcv_wnd is updated so that the current receive window size can be advertised to the peer. rcv_wnd can never exceed rcv_ssthresh. Together the two make the sliding window grow slowly.
rcv_wup records the left edge of the sliding window, that is, the smallest sequence number that falls within the window. Its initial value is 0, and it is updated whenever the window moves. Accordingly, rcv_wup + rcv_wnd is the right edge of the sliding window, and rcv_wup + rcv_wnd - rcv_nxt is the part of the window that is still unused. Those are the fields that describe the receive window. Next, let's look at how they are used in TCP communication.
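The two expressions for the right edge and the unused part of the window can be sketched as below. The struct is a hypothetical stand-in that holds only the three fields involved, and clamping a negative result to zero is an added safeguard, not something the text states.

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of struct tcp_sock. */
struct rcv_window_sketch {
    uint32_t rcv_wup; /* left edge of the window */
    uint32_t rcv_wnd; /* current window size */
    uint32_t rcv_nxt; /* next sequence number expected from the peer */
};

/* Right edge: rcv_wup + rcv_wnd (wraps modulo 2^32 like sequence numbers). */
uint32_t rcv_right_edge(const struct rcv_window_sketch *w)
{
    return w->rcv_wup + w->rcv_wnd;
}

/* Unused part of the window: rcv_wup + rcv_wnd - rcv_nxt, never negative. */
uint32_t rcv_window_free(const struct rcv_window_sketch *w)
{
    /* Interpreting the difference as signed handles sequence wraparound. */
    int32_t free_bytes = (int32_t)(w->rcv_wup + w->rcv_wnd - w->rcv_nxt);

    return free_bytes < 0 ? 0 : (uint32_t)free_bytes;
}
```

For example, with rcv_wup = 1000, rcv_wnd = 5792, and rcv_nxt = 2448, the right edge is 6792 and 4344 bytes of the window are still unused.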
Every time a TCP segment is sent, its TCP header is built, and mytcp_select_window is called to choose the window size. The basic idea is to take 3/4 of the remaining receive buffer space, but no more than rcv_ssthresh. However, if the window chosen this way is smaller than what remains of the current window, the remainder of the current window is used as the new window, so that the window already advertised is not taken back. At the same time, the left edge is moved: rcv_wup = rcv_nxt. Because the new window is limited by rcv_ssthresh, it normally causes no problem, but the code still performs an upper-limit check: if the scale factor is 0, the window cannot exceed 32767u; otherwise, it cannot exceed 65535 shifted left by the scale factor.
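The selection steps above can be sketched as a single function. This is a simplified illustration, not the code of mytcp_select_window: the struct and function names are hypothetical, the free buffer space is passed in as a parameter, and the function returns the window in bytes without encoding it into the 16-bit header field.

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of struct tcp_sock. */
struct win_state_sketch {
    uint32_t rcv_wup;
    uint32_t rcv_wnd;
    uint32_t rcv_nxt;
    uint32_t rcv_ssthresh;
    unsigned int rcv_wscale;
};

uint32_t select_window_sketch(struct win_state_sketch *s, uint32_t free_buf)
{
    /* What is left of the window that was advertised last time. */
    int32_t rem = (int32_t)(s->rcv_wup + s->rcv_wnd - s->rcv_nxt);
    uint32_t cur_win = rem < 0 ? 0 : (uint32_t)rem;

    /* Start from 3/4 of the free buffer space, capped by rcv_ssthresh. */
    uint32_t new_win = free_buf - free_buf / 4;
    if (new_win > s->rcv_ssthresh)
        new_win = s->rcv_ssthresh;

    /* Never offer less than the remainder of the current window. */
    if (new_win < cur_win)
        new_win = cur_win;

    /* Upper limit: 32767 without scaling, else 65535 << scale factor. */
    uint32_t limit = s->rcv_wscale == 0
        ? 32767u
        : (uint32_t)65535 << s->rcv_wscale;
    if (new_win > limit)
        new_win = limit;

    /* Record the new window and move the left edge. */
    s->rcv_wnd = new_win;
    s->rcv_wup = s->rcv_nxt;
    return new_win;
}
```

With plenty of free buffer space the result is bounded by rcv_ssthresh; with very little free space it falls back to the remainder of the current window.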
Every time a TCP segment longer than 128 bytes arrives from the peer, mytcp_grow_window is called to increase rcv_ssthresh. In general, rcv_ssthresh grows by twice the MSS each time, provided that rcv_ssthresh is smaller than window_clamp, rcv_ssthresh is less than 3/4 of the remaining space in the receive buffer, and mytcp_memory_pressure is not set (a flag that reflects the volume of data held in the receive buffer). mytcp_grow_window also places some restrictions on the length of the newly received skb, so it does not always increase rcv_ssthresh. For details, see the function code.
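The growth conditions listed above can be sketched as follows. This is a simplified model, not the code of mytcp_grow_window: names are hypothetical, the additional checks on the skb length are left out, and capping the result at window_clamp is an assumption that follows from the rule that the window may not exceed window_clamp.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of struct tcp_sock. */
struct grow_state_sketch {
    uint32_t rcv_ssthresh;
    uint32_t window_clamp;
};

/* Returns true if rcv_ssthresh was increased. */
bool grow_window_sketch(struct grow_state_sketch *s, uint32_t skb_len,
                        uint32_t mss, uint32_t free_buf,
                        bool memory_pressure)
{
    uint32_t space = free_buf - free_buf / 4; /* 3/4 of the free space */

    if (skb_len <= 128)          /* only segments longer than 128 bytes */
        return false;
    if (memory_pressure)         /* no growth under memory pressure */
        return false;
    if (s->rcv_ssthresh >= s->window_clamp)
        return false;
    if (s->rcv_ssthresh >= space)
        return false;

    s->rcv_ssthresh += 2 * mss;
    if (s->rcv_ssthresh > s->window_clamp) /* assumed cap, see lead-in */
        s->rcv_ssthresh = s->window_clamp;
    return true;
}
```

Starting from rcv_ssthresh = 5792 with an MSS of 1448, one qualifying segment raises the threshold to 8688.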
That covers the receive window. Now let's look at the send window. struct tcp_sock also has several members related to the send window.
snd_wl1 records the sequence number of the ACK segment that last updated the send window. It is mainly used to decide whether the send window should be updated the next time an ACK arrives.
snd_wnd is the size of the send window. It is taken directly from the TCP header of the segments received from the peer.
max_window records the largest window the peer has ever advertised.
snd_una is the first sequence number still waiting for an ACK. The send window is updated every time an ACK arrives from the peer, so snd_una is in effect the left edge of the send window.
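How these four fields might cooperate when an ACK arrives can be sketched as below. The text does not spell out the update condition; the one used here is an assumption modeled on the check commonly found in the Linux kernel (update if the ACK advances snd_una, if the segment is newer than snd_wl1, or if it has the same sequence number but advertises a larger window). All names other than the four fields are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of struct tcp_sock. */
struct snd_window_sketch {
    uint32_t snd_una;    /* left edge: first unacknowledged sequence number */
    uint32_t snd_wnd;    /* window advertised by the peer */
    uint32_t snd_wl1;    /* sequence number of the segment that last updated it */
    uint32_t max_window; /* largest window the peer has advertised */
};

/* True if sequence number a comes after b, modulo 2^32. */
static bool seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/*
 * ack:     acknowledgment number of the incoming segment
 * ack_seq: sequence number of the incoming segment
 * nwin:    window advertised by the incoming segment, in bytes
 * Returns true if the send window was updated.
 */
bool snd_window_update(struct snd_window_sketch *s, uint32_t ack,
                       uint32_t ack_seq, uint32_t nwin)
{
    /* Assumed update rule, see the lead-in. */
    bool may_update = seq_after(ack, s->snd_una) ||
                      seq_after(ack_seq, s->snd_wl1) ||
                      (ack_seq == s->snd_wl1 && nwin > s->snd_wnd);

    if (seq_after(ack, s->snd_una))
        s->snd_una = ack; /* the left edge slides forward */

    if (!may_update)
        return false;

    s->snd_wl1 = ack_seq;
    s->snd_wnd = nwin;
    if (nwin > s->max_window)
        s->max_window = nwin;
    return true;
}
```

An old segment that neither advances snd_una nor carries a newer sequence number leaves snd_wnd untouched, which is the purpose of remembering snd_wl1.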