TCP and UDP both have socket send and receive buffers; what should we pay attention to when using them?


Question: TCP and UDP each have socket send and receive buffers. What should we pay attention to when sending and receiving data through them?

(i) Basics

1. TCP is a reliable, connection-oriented protocol: a connection is established with a three-way handshake and torn down with a four-way handshake.

2. UDP is an unreliable, connectionless protocol.

(ii) TCP and UDP output
Each TCP socket has a send buffer, whose size can be changed with the SO_SNDBUF socket option. When the application calls write, the kernel copies the data from the application's buffer into the socket's send buffer. If the send buffer cannot hold all of the application's data (because the application's buffer is larger than the send buffer, or because the send buffer already contains other data), the application process is put to sleep: the kernel does not return from write until every byte of the application's buffer has been copied into the socket send buffer. So a successful return from write on a TCP socket only means that we may reuse the application's buffer; it does not tell us that the peer has received the data. TCP must keep the data until the peer acknowledges it; only when the acknowledgment arrives does TCP remove the data from its send buffer.
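A common way to make this behaviour explicit in application code is a helper that keeps calling write until the whole buffer has been handed to the kernel (useful, for example, when the socket is non-blocking or a write is interrupted by a signal). The following is a minimal sketch under those assumptions; the name writen is illustrative and not taken from the text above:

#include <errno.h>
#include <unistd.h>

/* Keep calling write() until all n bytes have been copied into the
 * kernel send buffer (or an error occurs). A successful return still
 * only means the data is queued locally, not that the peer received it. */
ssize_t writen(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t nw = write(fd, p, left);
        if (nw < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error */
        }
        left -= (size_t)nw;
        p    += nw;
    }
    return (ssize_t)n;
}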
Because UDP is unreliable, it does not have to keep a copy of the application's data. The data is passed down the protocol stack and copied in some form into kernel buffers; once the data link layer has sent it, that copy is discarded. UDP therefore needs no real send buffer. A successful return from write on a UDP socket means that the datagram (or all of its fragments) has entered the link layer's output queue; if the output queue does not have enough room for the data, the kernel returns ENOBUFS.

(iii) TCP socket send and receive buffers

The application hands data to the network by calling send (or write, sendmsg, etc.), and the TCP/IP protocol stack passes the data, already organized into struct sk_buff units (TCP segments), to the network device interface. Because the rate at which the application calls send differs from the rate at which the network medium can transmit, some of the application data is organized into TCP segments and cached in the TCP socket's send buffer queue, waiting to be transmitted when the network is free. In addition, TCP requires the peer to acknowledge the sequence number of each segment it receives; only after the ACK for a segment arrives can that segment (in the form of a struct sk_buff) be removed from the socket's send buffer queue.
The send buffer of a TCP socket is really a queue of struct sk_buff, which we can call the send buffer queue; it is represented by the member sk_write_queue of struct sock. sk_write_queue has type struct sk_buff_head, a doubly linked list of struct sk_buff, defined as follows:


struct sk_buff_head {
    struct sk_buff *next;   /* pointer to the next sk_buff in the list */
    struct sk_buff *prev;   /* pointer to the previous sk_buff */
    __u32          qlen;    /* queue length: how many sk_buffs the list holds */
    spinlock_t     lock;    /* lock protecting the list */
};

(1)

In the kernel code, a struct sk_buff large enough to hold the data is allocated and placed on this queue, and the application data is then copied into it.
The member sk_wmem_queued of struct sock records how many bytes have been allocated for the send buffer queue. In general, one struct sk_buff is allocated to hold one TCP segment, and the number of bytes allocated for it is roughly the MSS plus the maximum protocol header length. In my test environment the MSS is 1448 and the maximum header length MAX_TCP_HEADER is 224; after alignment, the truesize of the resulting struct sk_buff is 1956. That is, every time a struct sk_buff is added to the queue, sk_wmem_queued grows by 1956.
The member sk_forward_alloc of struct sock is the pre-allocated length. When the first struct sk_buff is allocated for the send buffer queue, memory is not charged in exactly the requested size; instead it is pre-allocated in whole memory pages.
The function TCP uses to allocate a struct sk_buff is sk_stream_alloc_pskb. It first allocates a struct sk_buff of the size given by its parameter; if that succeeds, the size is rounded up to an integer multiple of a page (4096 bytes) and recorded in sk_forward_alloc. The same amount is also added, via the member sk_prot of struct sock (here the struct mytcp_prot representing the TCP protocol), to the counter memory_allocated, which points to the variable tcp_memory_allocated: the total amount of buffer memory (including receive buffer queues) currently allocated by the whole TCP protocol.
When the newly allocated struct sk_buff is placed on the send buffer queue sk_write_queue, its truesize is subtracted from sk_forward_alloc. When the second struct sk_buff is allocated, its truesize is again subtracted from sk_forward_alloc; if sk_forward_alloc is smaller than the new truesize, the shortfall is again rounded up to a whole number of pages and added both to sk_forward_alloc and to tcp_memory_allocated.
In other words, through sk_forward_alloc, the global variable tcp_memory_allocated keeps track of the total buffer memory currently allocated by the TCP protocol, and that total is always aligned to a page boundary.
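To make this bookkeeping concrete, here is a minimal user-space sketch of the accounting just described. It is only an illustration of the rules above, not kernel code; the page size, the two counters and the truesize value of 1956 are taken from the text.

#include <stdio.h>

#define PAGE_SIZE 4096

static long tcp_memory_allocated;   /* protocol-wide total, charged in whole pages */
static long sk_forward_alloc;       /* per-socket pre-allocated bytes */

static void charge_skb(long truesize)
{
    if (sk_forward_alloc < truesize) {
        /* round the shortfall up to a whole number of pages */
        long pages = (truesize - sk_forward_alloc + PAGE_SIZE - 1) / PAGE_SIZE;
        sk_forward_alloc     += pages * PAGE_SIZE;
        tcp_memory_allocated += pages * PAGE_SIZE;
    }
    sk_forward_alloc -= truesize;   /* the queued skb consumes its truesize */
}

int main(void)
{
    charge_skb(1956);   /* first skb: one page (4096) is charged           */
    charge_skb(1956);   /* second skb: 2140 bytes still pre-allocated, ok  */
    charge_skb(1956);   /* third skb: shortfall, another page is charged   */
    printf("forward_alloc=%ld, memory_allocated=%ld\n",
           sk_forward_alloc, tcp_memory_allocated);
    return 0;
}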
(2)

As mentioned earlier, the member sk_forward_alloc of struct sock is the pre-allocated memory size, and through it the total buffer memory currently allocated by the whole TCP protocol is accumulated into the global variable mytcp_memory_allocated. The reason for keeping this running total is to limit the total buffer memory the TCP protocol may use. The struct mytcp_prot that represents the TCP protocol has several other members related to buffers.
mysysctl_tcp_mem is an array pointed to by the member sysctl_mem of mytcp_prot, and it has three elements. mysysctl_tcp_mem[0] is the lower limit on the total buffer memory: as long as the total currently allocated stays below this value, there is no problem and allocations succeed. mysysctl_tcp_mem[2] is the hard upper limit: once the total allocated buffer memory exceeds it, the default send buffer size of the TCP socket, sk_sndbuf, is reduced to half the size of the already allocated send buffer queue, but never below SOCK_MIN_SNDBUF (2K); the current allocation is still allowed to succeed. mysysctl_tcp_mem[1] lies between the other two values and is a warning level: once it is exceeded, the protocol enters a warning ("memory pressure") state, in which whether an allocation succeeds depends on the parameters of the call.
The three values are determined at initialization from the amount of system memory. In my test environment, with 256 MB of memory, they are 96K, 128K and 192K. They can be changed through the /proc file system at /proc/sys/net/ipv4/tcp_mem. Of course, you do not need to change these defaults unless you have a specific reason to.
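If you want to inspect the current limits from a program rather than from the shell, a small sketch like the following can read them (the path is the one given above; error handling is minimal):

#include <stdio.h>

/* Print the three tcp_mem limits (low, pressure, high) as exposed in /proc. */
int main(void)
{
    long low, pressure, high;
    FILE *f = fopen("/proc/sys/net/ipv4/tcp_mem", "r");

    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    if (fscanf(f, "%ld %ld %ld", &low, &pressure, &high) == 3)
        printf("tcp_mem: low=%ld pressure=%ld high=%ld\n", low, pressure, high);
    fclose(f);
    return 0;
}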
mysysctl_tcp_wmem is an array of the same form, giving the size limits of the send buffer; it is pointed to by the member sysctl_wmem of mytcp_prot, and its default values are 4K, 16K and 128K. It can be changed through /proc/sys/net/ipv4/tcp_wmem. The member sk_sndbuf of struct sock is the actual default size of the send buffer queue; its initial value is the middle value, 16K. While TCP segments are being sent, once sk_wmem_queued exceeds sk_sndbuf, sending stops and waits for send buffer space to become available. Some already transmitted data may simply not have been acknowledged yet, while the data still sitting in the buffer queue can all be transmitted, so the queue will eventually drain; as long as the network is not so bad that ACKs cannot get through, this wait succeeds after a while.
The global variable mytcp_memory_pressure is a flag: it is set to 1 when TCP buffer allocation enters the warning state, and 0 otherwise.

(3)

mytcp_sockets_allocated is the number of sockets the TCP protocol has created so far; it is pointed to by the member sockets_allocated of mytcp_prot and can be viewed in /proc/net/sockstat. It is purely statistical and imposes no limit.
mytcp_orphan_count is the number of sockets in the TCP protocol that are waiting to be destroyed (orphaned, no longer useful sockets); it is pointed to by the member orphan_count of mytcp_prot and can also be viewed in /proc/net/sockstat.
mysysctl_tcp_rmem is an array of the same form as mysysctl_tcp_wmem; it gives the size limits of the receive buffer and is pointed to by the member sysctl_rmem of mytcp_prot. Its default values are 4096, 87380 and 174760 bytes, and they can be changed through /proc/sys/net/ipv4/tcp_rmem. The member sk_rcvbuf of struct sock is the size of the receive buffer queue; its initial value is mysysctl_tcp_rmem[1]. The member sk_receive_queue is the receive buffer queue itself, with the same structure as sk_write_queue.
The sizes of a TCP socket's send and receive buffer queues can be changed either through the /proc file system or through socket options. The socket-level option SO_RCVBUF gets and sets the size of the receive buffer queue (that is, the value of struct sock->sk_rcvbuf). For example, the following code reads the current receive buffer size:

int rcvbuf_len;
socklen_t len = sizeof(rcvbuf_len);

if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, (void *)&rcvbuf_len, &len) < 0) {
    perror("getsockopt");
    return -1;
}
printf("The receive buffer length: %d\n", rcvbuf_len);

The socket-level option SO_SNDBUF gets and sets the size of the send buffer queue (that is, the value of struct sock->sk_sndbuf); the code is the same as above, with SO_RCVBUF replaced by SO_SNDBUF.
Reading the send and receive buffer sizes is simple; setting them involves slightly more work in the kernel, and there is also a quirk in the interface: the value passed to setsockopt represents half of the resulting buffer size. In other words, if you want the send buffer to be 20K, you call setsockopt like this:

int sndbuf_len = 10 * 1024;   /* half of the desired buffer size */
socklen_t len = sizeof(sndbuf_len);

if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, (void *)&sndbuf_len, len) < 0) {
    perror("setsockopt");
    return -1;
}

In the kernel, the new value is first checked against an upper limit; if it exceeds the limit, the limit is used as the new value. The upper limits for the send and receive buffer sizes are twice sysctl_wmem_max and sysctl_rmem_max, respectively. These two global variables have the same value, (sizeof(struct sk_buff) + 256) * 256, which corresponds to roughly 64K of payload; because of the struct sk_buff overhead, the send and receive buffers can actually be set to about 210K. Their lower limit is 2K, meaning the buffer size cannot be made smaller than 2K.
In addition, SO_SNDBUF and SO_RCVBUF have privileged variants, SO_SNDBUFFORCE and SO_RCVBUFFORCE, which are not subject to the maximum send and receive buffer limits and can set any buffer size of at least 2K.

The same points can also be illustrated with a diagram (not reproduced here).

Concepts:

MTU: the maximum amount of data that a link-layer frame can carry, i.e. the maximum size of the whole IP datagram. See the encapsulation figure in TCP/IP Illustrated, which shows how data is encapsulated as it moves down the protocol stack.

MSS: the maximum amount of data carried in a TCP segment. The MSS option can only appear in SYN segments.


TCP Output:

Each TCP socket has a send buffer, and we can change its size with the SO_SNDBUF socket option. When the application calls write, the kernel copies the data from the application process's buffer into the socket's send buffer. If the send buffer cannot hold all of the application's data (either because the application's buffer is larger than the send buffer, or because the send buffer already contains other data), the application process is put to sleep, assuming the write is blocking. The kernel does not return from the write system call until the last byte of the application's buffer has been copied into the socket send buffer. Therefore, a successful return from write on a TCP socket only means we may reuse the application buffer; it does not tell us that the peer TCP, let alone the peer application, has received the data.
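If the socket is put into non-blocking mode instead, write does not sleep when the send buffer is full; it fails with EAGAIN/EWOULDBLOCK, and the application must retry later. A minimal sketch under that assumption (fd is assumed to be a connected TCP socket; the helper name is illustrative):

#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Try to hand len bytes to the kernel without ever blocking.
 * Returns bytes accepted, 0 if the send buffer is currently full,
 * or -1 on a real error. */
ssize_t try_send(int fd, const void *buf, size_t len)
{
    /* make sure the socket is in non-blocking mode */
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    ssize_t n = write(fd, buf, len);
    if (n >= 0)
        return n;                               /* some or all bytes were queued */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;                               /* send buffer full: retry later */
    return -1;                                  /* real error */
}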

UDP output:

This time the socket send buffer is shown as a dashed box, because it does not really exist. A UDP socket does have a send buffer size (changeable with SO_SNDBUF), but it is only an upper limit on the size of a datagram that may be written to the socket. If the application writes a datagram larger than this limit, the kernel returns EMSGSIZE. Because UDP is unreliable, it does not have to keep a copy of the application's data, so no true send buffer is needed (the application's data is passed down the protocol stack and copied in some form into kernel buffers, but the data link layer discards that copy after the data has been sent).
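A quick way to check this behaviour on your own system is to shrink the limit and then try to exceed it. This is only a sketch: the sizes are illustrative, and whether the failure is reported as EMSGSIZE or as some other error depends on the platform.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Shrink the UDP "send buffer" limit, then try to send a datagram that
 * is larger than it.  On systems that behave as described above, the
 * kernel rejects the datagram with EMSGSIZE. */
void probe_sndbuf_limit(int udpfd, const struct sockaddr *dest, socklen_t destlen)
{
    int sndbuf = 4096;                      /* illustrative small limit */
    char msg[32 * 1024];                    /* datagram larger than the limit */

    memset(msg, 'x', sizeof(msg));
    setsockopt(udpfd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));

    if (sendto(udpfd, msg, sizeof(msg), 0, dest, destlen) < 0)
        printf("sendto failed: %s\n", strerror(errno));
}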

Note also that UDP has no concept of MSS, so if a UDP application sends large datagrams, it is more likely than a TCP application to cause fragmentation. A successful return from write on a UDP socket only indicates that the datagram, or all of its fragments, has been added to the data link layer's output queue. If that queue does not have enough room for the datagram or one of its fragments, the kernel usually returns ENOBUFS to the application (though some systems do not return this error).

Both TCP and UDP sockets have receive buffers. A TCP socket's receive buffer cannot overflow, because TCP has flow control (the advertised window). With UDP, however, when a received datagram does not fit in the socket receive buffer, the datagram is discarded. UDP has no flow control: a fast sender can easily overwhelm a slower receiver, causing datagrams to be dropped at the receiving end.

We can verify this with a small client program like the following:

#define NDG    2000     /* number of datagrams to send (value illustrative) */
#define DGLEN  1400     /* length of each datagram in bytes */

void client(int sockfd, char *sendline, const struct sockaddr *pservaddr, socklen_t servlen)
{
    for (int i = 0; i < NDG; i++)
        sendto(sockfd, sendline, DGLEN, 0, pservaddr, servlen);
}
The client sends large datagrams as fast as it can, and on a slow receiving host (FreeBSD in this experiment) we see many drops. The default UDP socket receive buffer on FreeBSD is 42080 bytes, which is room for only 30 datagrams of 1400 bytes. If we enlarge the receive buffer, the server can be expected to receive more datagrams: setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &n, sizeof(n)) with n = 220*1024. Running the test again shows that the drops improve, although this is not a real solution.
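For completeness, here is a sketch of the receiver-side change described above. The buffer size 220*1024 is the value from the text; the call should be made before the heavy traffic starts, typically right after the socket is created.

#include <stdio.h>
#include <sys/socket.h>

/* Enlarge the UDP socket receive buffer so that more datagrams can be
 * queued while the (slow) receiving process catches up. */
int enlarge_rcvbuf(int sockfd)
{
    int n = 220 * 1024;     /* value used in the experiment above */

    if (setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &n, sizeof(n)) < 0) {
        perror("setsockopt(SO_RCVBUF)");
        return -1;
    }
    return 0;
}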

SO_RCVBUF and SO_SNDBUF set the sizes of the receive buffer and the send buffer, respectively.

