int setsockopt(
SOCKET s,
int level,
int optname,
const char *optval,
int optlen
);
s (socket): descriptor identifying an open socket.
level: the protocol level at which the option is defined.
SOL_SOCKET: generic socket level
IPPROTO_IP: IPv4 level
IPPROTO_IPV6: IPv6 level
IPPROTO_TCP: TCP level
optname (option name): the option to set.
optval (option value): pointer to a variable holding the value: an int, a socket-related structure, or another structure type such as struct linger or struct timeval.
optlen (option length): size in bytes of the buffer that optval points to.
Return value: 0 on success; on failure Winsock returns SOCKET_ERROR and POSIX systems return -1. Many of the options below are binary flags that enable or disable a feature.
[/Code: 1: 59df4ce128]
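As a minimal sketch of the calling convention described above, the following helpers wrap setsockopt() and getsockopt() for int-valued options. It assumes a POSIX system (int descriptors, socklen_t) rather than Winsock, and the helper names set_int_opt, get_int_opt and demo_broadcast_flag are illustrative, not part of any library.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Set an int-valued option; returns 0 on success, -1 on failure (errno set). */
int set_int_opt(int fd, int level, int optname, int value)
{
    return setsockopt(fd, level, optname, &value, sizeof(value));
}

/* Read an int-valued option into *value; returns 0 on success, -1 on failure. */
int get_int_opt(int fd, int level, int optname, int *value)
{
    socklen_t len = sizeof(*value);
    return getsockopt(fd, level, optname, value, &len);
}

/* Hypothetical demo: enable SO_BROADCAST on a UDP socket and read it back.
 * Returns 1 if the flag reads back as enabled, 0 if not, -1 on error. */
int demo_broadcast_flag(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int flag = 0;

    if (fd < 0)
        return -1;
    if (set_int_opt(fd, SOL_SOCKET, SO_BROADCAST, 1) < 0 ||
        get_int_opt(fd, SOL_SOCKET, SO_BROADCAST, &flag) < 0) {
        perror("SO_BROADCAST");
        close(fd);
        return -1;
    }
    close(fd);
    return flag != 0;
}
```

Checking the return value of every call matters here, because an unsupported option or a wrong optlen is reported only through that value.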
========================================================== ======================================
SOL_SOCKET
------------------------------------------------------------------------
SO_BROADCAST: allow sending broadcast data (int)
Applies to UDP sockets; it allows the socket to send broadcast datagrams onto the network.
SO_DEBUG: enable debugging (int)
SO_DONTROUTE: send without consulting the routing table (int)
SO_ERROR: get the pending socket error (int)
SO_KEEPALIVE: keep the connection alive (int)
Detects whether the peer host has crashed, so that a server does not block forever waiting for input on a dead TCP connection.
When this option is set and no data has been exchanged in either direction on the socket for two hours, TCP automatically sends a keepalive probe to the peer. The probe is a TCP segment to which the peer must respond. One of three things then happens:
1. The peer responds with the expected ACK. Everything is fine, and TCP sends another probe after two more hours of inactivity.
2. The peer has crashed and rebooted, and responds with an RST. The socket's pending error is set to ECONNRESET and the socket is closed.
3. The peer does not respond. Berkeley-derived TCPs send eight additional probes, 75 seconds apart, trying to elicit a response, and give up if nothing comes back within 11 minutes and 15 seconds of the first probe. The socket's pending error is then set to ETIMEDOUT and the socket is closed. If an ICMP "host unreachable" error is received instead, the peer has not necessarily crashed but cannot be reached, and the pending error is set to EHOSTUNREACH.
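The two-hour idle time, 75-second interval and probe count described above are system defaults; on Linux they can be overridden per socket. The sketch below assumes the Linux option names TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT; other systems use different names or none at all, and enable_keepalive is a hypothetical helper.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable keepalive probing on a TCP socket and override its timers.
 * Assumes Linux option names. Returns 0 on success, -1 on failure. */
int enable_keepalive(int fd, int idle_secs, int interval_secs, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    /* idle time before the first probe */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
    /* spacing between unanswered probes */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_secs, sizeof(interval_secs)) < 0)
        return -1;
    /* number of unanswered probes before the connection is dropped */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}
```

Calling enable_keepalive(fd, 7200, 75, 8) would mirror the Berkeley timings quoted above; a server that needs to notice dead peers sooner would pass smaller values.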
SO_DONTLINGER: if true, the SO_LINGER option is disabled
SO_LINGER: linger on close (struct linger)
These two options affect the behaviour of closesocket():
Option          Interval      Type of close   Wait for close?
SO_DONTLINGER   don't care    graceful        no
SO_LINGER       zero          hard            no
SO_LINGER       non-zero      graceful        yes
If SO_LINGER is set (that is, the l_onoff field of the linger structure is non-zero; see sections 2.4, 4.1.7 and 4.1.21) with a zero timeout, closesocket() does not block and completes immediately, whether or not queued data has been sent or acknowledged. This is called a "hard" or "abortive" close, because the socket's virtual circuit is reset at once and any unsent data is lost. A recv() call on the remote side fails with WSAECONNRESET.
If SO_LINGER is set with a non-zero timeout, closesocket() blocks until all remaining data has been sent or the timeout expires. This is called a "graceful" close. Note that if the socket is non-blocking and SO_LINGER is set with a non-zero timeout, closesocket() fails with WSAEWOULDBLOCK.
If SO_DONTLINGER is set (that is, the l_onoff field of the linger structure is zero; see sections 2.4, 4.1.7 and 4.1.21), closesocket() returns immediately, but any queued data is sent, if possible, before the socket is closed. Note that in this case the Windows Sockets implementation may hold on to the socket and other resources for an indeterminate period, which can affect applications that expect to use all available sockets.
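The three rows of the table above map onto the two fields of struct linger. The following sketch assumes POSIX sockets, where SO_DONTLINGER does not exist and the same effect is obtained with l_onoff = 0; set_linger and get_linger are hypothetical helper names.

```c
#include <sys/socket.h>

/* onoff = 0               : like SO_DONTLINGER, close returns at once and
 *                           queued data is still sent in the background
 * onoff = 1, seconds = 0  : hard close, the connection is reset
 * onoff = 1, seconds > 0  : graceful close, close blocks until the data is
 *                           sent or the timeout expires
 * Returns 0 on success, -1 on failure. */
int set_linger(int fd, int onoff, int seconds)
{
    struct linger lg;

    lg.l_onoff = onoff;
    lg.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* Read the current linger setting into *out; returns 0 on success. */
int get_linger(int fd, struct linger *out)
{
    socklen_t len = sizeof(*out);
    return getsockopt(fd, SOL_SOCKET, SO_LINGER, out, &len);
}
```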
SO_OOBINLINE: receive out-of-band data in the normal data stream (int)
SO_RCVBUF: receive buffer size (int)
Sets the size reserved for the receive buffer.
It is unrelated to SO_MAX_MSG_SIZE. For TCP, the free space in this buffer bounds the window advertised to the peer. Consider this option if packets arrive very frequently.
SO_SNDBUF: send buffer size (int)
Sets the size reserved for the send buffer.
It is unrelated to SO_MAX_MSG_SIZE and does not by itself change the window the peer advertises. Consider this option if packets are sent very frequently.
Every socket has a send buffer and a receive buffer. The receive buffer is used by TCP and UDP to hold received data until the application reads it.
TCP: the free space in the receive buffer is the window that TCP advertises to the peer. A TCP socket's receive buffer cannot overflow, because the peer is not allowed to send more data than the advertised window. This is TCP's flow control. If the peer ignores the advertised window and sends more than the window size, the receiving TCP discards the excess.
UDP: when a datagram arrives that does not fit in the socket's receive buffer, it is discarded. UDP has no flow control; a fast sender can easily overwhelm a slow receiver, causing the receiver's UDP to drop datagrams.
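Because the kernel is free to adjust a requested buffer size, it is worth reading the value back after setting it. A minimal sketch, assuming POSIX sockets; resize_rcvbuf is a hypothetical helper name. On Linux the reported value is typically double the request (the kernel reserves room for bookkeeping) and is capped by a system-wide limit.

```c
#include <sys/socket.h>

/* Ask for a receive buffer of `bytes` and return the size the kernel
 * reports afterwards, or -1 on error. */
int resize_rcvbuf(int fd, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        return -1;
    return actual;
}
```

For TCP the buffer should be sized before connect() or listen(), since the window scale is negotiated during the handshake.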
SO_RCVLOWAT: receive buffer low-water mark (int)
SO_SNDLOWAT: send buffer low-water mark (int)
Every socket has a receive low-water mark and a send low-water mark, both used by the select() function. The receive low-water mark is the amount of data that must be in the socket's receive buffer for select() to return "readable"; it defaults to 1 for both TCP and UDP sockets. The send low-water mark is the amount of free space that must be available in the socket's send buffer for select() to return "writable"; for TCP sockets it defaults to 2048. For UDP, the number of bytes of free space in the send buffer never changes, so a UDP socket is always writable as long as its send buffer size is greater than its low-water mark. UDP has no send buffer as such, only a send buffer size.
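SO_RCVTIMEO: receive timeout (struct timeval; Winsock takes a DWORD number of milliseconds instead)
SO_SNDTIMEO: send timeout (struct timeval; Winsock takes a DWORD number of milliseconds instead)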
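A short sketch of the struct timeval form of these options, assuming POSIX sockets. The helper names set_recv_timeout_ms and recv_timed_out are hypothetical. When the timeout expires, recv() fails with EAGAIN or EWOULDBLOCK.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Set the receive timeout from a millisecond count. Returns 0 on success. */
int set_recv_timeout_ms(int fd, long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Returns 1 if recv() timed out, 0 if it returned data or end of stream,
 * -1 on any other error. */
int recv_timed_out(int fd, void *buf, size_t len)
{
    ssize_t n = recv(fd, buf, len, 0);

    if (n >= 0)
        return 0;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? 1 : -1;
}
```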
SO_REUSEADDR: allow reuse of the local address and port (int)
Allows bind() to an address (or port number) that is already in use. See the bind man page for details.
SO_EXCLUSIVEADDRUSE
Binds the port in exclusive mode, so that it cannot be shared with other programs that use SO_REUSEADDR.
When several sockets are bound to the same port, a packet is delivered to the socket with the most specific binding, and no privilege check is made. A low-privilege user can therefore rebind a port already held by a high-privilege process, such as the port a service listens on. This is a serious security risk.
If you do not want your program's port to be taken over in this way, use this option.
SO_TYPE: get the socket type (int)
SO_BSDCOMPAT: BSD compatibility (int)
========================================================== ========================================
IPPROTO_IP
--------------------------------------------------------------------------
IP_HDRINCL: the application supplies the IP header with the data (int)
Because the application builds the IP header itself, this option is often abused to forge the source IP address.
IP_OPTIONS: IP header options (int)
IP_TOS: type of service
IP_TTL: time to live (int)
The following IPv4 options are used for multicast:
IPv4 option          Data type        Description
IP_ADD_MEMBERSHIP    struct ip_mreq   join a multicast group
IP_DROP_MEMBERSHIP   struct ip_mreq   leave a multicast group
IP_MULTICAST_IF      struct ip_mreq   specify the interface for outgoing multicast packets
IP_MULTICAST_TTL     u_char           specify the TTL of outgoing multicast packets
IP_MULTICAST_LOOP    u_char           enable or disable multicast loopback
The ip_mreq structure is defined in the header file:
[Code: 1: 63724de67f]
struct ip_mreq {
    struct in_addr imr_multiaddr;   /* IP multicast address of group */
    struct in_addr imr_interface;   /* local IP address of interface */
};
[/Code: 1: 63724de67f]
To join a multicast group, pass IP_ADD_MEMBERSHIP to the socket's setsockopt() call. The option value is an ip_mreq structure: its first field, imr_multiaddr, gives the address of the multicast group, and its second field, imr_interface, gives the IPv4 address of the local interface.
IP_DROP_MEMBERSHIP
This option leaves a multicast group. The ip_mreq structure is used in the same way as above.
IP_MULTICAST_IF
This option changes the network interface used for multicast; the new interface is given in an ip_mreq structure.
IP_MULTICAST_TTL
Sets the time to live (TTL) of outgoing multicast packets. The default is 1, which means the packets can travel only within the local subnet.
IP_MULTICAST_LOOP
A member of a multicast group also receives the packets it sends to that group. This option selects whether that loopback is active.
IPPROTO_TCP
--------------------------------------------------------------------------
TCP_MAXSEG: TCP maximum segment size (int)
Gets or sets the maximum segment size (MSS) of a TCP connection. The value returned is the maximum amount of data our TCP will send to the peer in one segment. It is often the MSS the peer announced in its SYN, unless our TCP chooses a smaller value than the peer's announced MSS. If the value is fetched before the socket is connected, it is the default that will be used if no MSS option is received from the peer. A value smaller than the one returned may actually be used on the connection: the timestamp option, for example, occupies 12 bytes of TCP option space in every segment. The maximum amount of data our TCP sends per segment can also change during the life of the connection if TCP supports path MTU discovery; if the route to the peer changes, the value can go up or down.
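Reading the option is a one-call affair. A minimal sketch, assuming POSIX sockets; current_mss is a hypothetical helper name. Called on an unconnected socket it reports the default described above, and called after connect() it reports the value in use for that connection.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Return the MSS currently associated with a TCP socket, or -1 on error. */
int current_mss(int fd)
{
    int mss = 0;
    socklen_t len = sizeof(mss);

    if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) < 0)
        return -1;
    return mss;
}
```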
TCP_NODELAY: do not use the Nagle algorithm (int)
TCP_KEEPALIVE (TCP_KEEPIDLE on Linux): the idle time of the connection, in seconds, before TCP starts sending keepalive probes. The default must be at least 7200 seconds, that is, 2 hours. This option takes effect only when the SO_KEEPALIVE socket option is enabled.
TCP_NODELAY and TCP_CORK
Both options play an important role for network connections. Many Unix systems implement TCP_NODELAY, whereas TCP_CORK is specific to Linux and relatively new: it first appeared in kernel 2.4. Other Unix flavours offer similar functionality; notably, the TCP_NOPUSH option on BSD-derived systems is in effect one part of what TCP_CORK does.
TCP_NODELAY and TCP_CORK both control how packets are "Nagled", that is, whether the Nagle algorithm is used to assemble small packets into larger frames. The algorithm is named after its inventor, John Nagle, who first used it in 1984 to solve network congestion problems at Ford Aerospace (see IETF RFC 896 for details). The problem he addressed is the so-called silly window syndrome: a typical terminal application sends a packet for every keystroke, so each packet carries one byte of payload and a 40-byte header, an overhead of 4000% that can easily congest a network.
Nagling became a standard and was quickly implemented across the Internet. It is now the default, but in our view there are situations in which it is better to turn it off.
Suppose an application issues a request to send a small chunk of data. We can either send it immediately or wait for more data to accumulate and send it all together. Interactive and client/server applications benefit greatly from sending immediately. For example, when we send a short request and wait for a large response, the overhead is small relative to the total amount of data transferred, and the response arrives sooner if the request goes out at once. This is achieved by setting the TCP_NODELAY option on the socket, which disables the Nagle algorithm.
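Disabling the Nagle algorithm as described is a single int-valued option at the IPPROTO_TCP level. A minimal sketch, assuming POSIX sockets; set_nodelay and get_nodelay are hypothetical helper names.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* on = 1 disables the Nagle algorithm, on = 0 re-enables it.
 * Returns 0 on success, -1 on failure. */
int set_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Returns 1 if TCP_NODELAY is set, 0 if not, -1 on error. */
int get_nodelay(int fd)
{
    int on = 0;
    socklen_t len = sizeof(on);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, &len) < 0)
        return -1;
    return on != 0;
}
```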
In the other case, we want to wait until the amount of data reaches the maximum before sending it all across the network in one go. This benefits the throughput of bulk transfers; a file server is the typical example. Here the Nagle algorithm causes problems. When sending bulk data, however, you can set the TCP_CORK option, which disables Nagling in a way exactly opposite to TCP_NODELAY (the two options were originally mutually exclusive; Linux has allowed combining them since 2.5.71). Let us look more closely at how it works.
Suppose an application uses sendfile() to transfer bulk data. Application protocols usually require some information to be sent first to describe the data, in other words a header. Typically the header is small and TCP_NODELAY is set on the socket, so the packet carrying the header is transmitted immediately. In some cases (depending on an internal packet counter), the peer must acknowledge that packet before anything else is sent, so the bulk transfer is delayed and an unnecessary exchange of network traffic takes place.
If instead we set TCP_CORK on the socket (think of it as putting a cork in the pipe), the packet with the header is padded out with bulk data, and everything is transmitted automatically in packets sized appropriately. When the transfer is complete, it is best to clear TCP_CORK, "pulling the cork", so that any partial frame still queued can go out. This matters just as much as corking the connection in the first place.
In short, if you know you will send several pieces of data together (such as an HTTP response header and body), we recommend setting TCP_CORK so that no delay is introduced between them. This can greatly improve the performance of WWW, FTP and file servers while simplifying your work. Sample code:
int fd, on = 1;
...
/* socket creation and other setup omitted for brevity */
...
setsockopt(fd, SOL_TCP, TCP_CORK, &on, sizeof(on)); /* cork */
write(fd, ...);
dprintf(fd, ...); /* fprintf() needs a FILE *; dprintf() takes a descriptor */
sendfile(fd, ...);
write(fd, ...);
sendfile(fd, ...);
...
on = 0;
setsockopt(fd, SOL_TCP, TCP_CORK, &on, sizeof(on)); /* pull the cork */
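For a version that can be compiled and run, the sketch below corks a loopback TCP connection, writes a header and a body, then pulls the cork and counts what the other side receives. It assumes Linux (TCP_CORK is Linux-specific); all function names are hypothetical, and error paths leak descriptors for brevity.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Cork (on = 1) or uncork (on = 0) a TCP socket. */
int set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Send a header and a body as one corked burst. Returns 0 on success. */
int send_corked(int fd, const char *header, const char *body)
{
    size_t hlen = strlen(header), blen = strlen(body);

    if (set_cork(fd, 1) < 0)
        return -1;
    if (write(fd, header, hlen) != (ssize_t)hlen ||
        write(fd, body, blen) != (ssize_t)blen) {
        set_cork(fd, 0);
        return -1;
    }
    return set_cork(fd, 0);   /* pull the cork: flush any partial frame */
}

/* Loop a corked message over 127.0.0.1 and return the number of bytes
 * the receiving side read, or -1 on error. */
int cork_loopback_demo(void)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    char buf[64];
    int total = 0, lfd, cfd, sfd = -1;
    ssize_t n;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;   /* let the kernel pick a free port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0)
        return -1;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        (sfd = accept(lfd, NULL, NULL)) < 0)
        return -1;
    if (send_corked(cfd, "HDR:", "payload") < 0)
        return -1;
    close(cfd);   /* end of stream lets the read loop below terminate */
    while ((n = read(sfd, buf, sizeof(buf))) > 0)
        total += (int)n;
    close(sfd);
    close(lfd);
    return n < 0 ? -1 : total;
}
```

The byte count is the same with or without the cork; what changes is how the bytes are grouped into segments on the wire, which a packet capture would show.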
Unfortunately, many widely used programs do not take these issues into account. For example, Eric Allman's Sendmail does not set any options on its sockets.
Apache httpd, the most popular web server on the Internet, sets TCP_NODELAY on all of its sockets, and most users are happy with its performance. Why? The answer lies in differences between implementations. BSD-derived TCP/IP stacks (FreeBSD notably) behave differently in this situation. When many small chunks of data are submitted for transmission in TCP_NODELAY mode, a great deal of information is sent for each write() call. However, because the counter responsible for requesting delivery acknowledgement is byte-oriented rather than packet-oriented (as it is on Linux), delays are introduced far less often; the outcome depends only on the total amount of data. Linux asks for an acknowledgement after the first packet arrives, whereas FreeBSD waits for several hundred packets before doing so.
On Linux, the effect of TCP_NODELAY therefore differs considerably from what developers used to the BSD TCP/IP stack expect, and Apache performs worse on Linux. Other applications that use TCP_NODELAY heavily on Linux have the same problem.
TCP_DEFER_ACCEPT
The first option to consider is TCP_DEFER_ACCEPT (this is its Linux name; some other operating systems offer the same option under a different name). To understand the idea behind it, we need to sketch a typical HTTP client/server exchange. Recall how TCP establishes a connection with the destination of a data transfer. On the network, information travels between separate units called IP packets (or IP datagrams). A packet always has a header carrying service information used for internal protocol handling, and it may also carry a data payload. A typical example of service information is the set of so-called flags, which mark a packet as having a special meaning to the TCP/IP stack, such as acknowledging successful receipt of a packet. In general it is perfectly possible to carry payload in a flagged packet, but sometimes internal logic forces the TCP/IP stack to send an IP packet containing only a header. Such packets often introduce annoying network delays and add to system load, reducing overall network performance.
Now the server creates a socket and waits for connections. TCP/IP connection establishment is known as the "three-way handshake". First, the client sends a TCP packet with the SYN flag set and no payload (a SYN packet). The server replies with a packet that has the SYN and ACK flags set (a SYN/ACK packet) to acknowledge the packet it just received. The client then sends an ACK packet to acknowledge the second packet, which completes connection establishment. On receiving this ACK from the client, the server wakes up a receiver process that waits for data to arrive. Once the three-way handshake is complete, the client starts sending "useful" data to the server. An HTTP request is usually small and fits entirely in one packet, yet in the scenario above at least four packets travel back and forth, which adds latency. Note also that the receiver is already waiting for data before any "useful" data has been sent.
To mitigate these problems, Linux (and some other operating systems) includes the TCP_DEFER_ACCEPT option in its TCP implementation. It is set on the server's listening socket. The option tells the kernel not to wait for the final ACK packet and to wake the listening process only when the first packet carrying real data arrives. After sending the SYN/ACK, the server waits for the client to send an IP packet containing data. Now only three packets need to cross the network, and connection setup latency drops noticeably, especially for HTTP.
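On Linux the option value is an int giving the number of seconds the kernel may wait for data before handing the connection to accept() anyway. A minimal sketch under that assumption; set_defer_accept and get_defer_accept are hypothetical helper names, and the value read back may differ from the one requested because the kernel rounds it to its retransmission schedule.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Ask the kernel to wake the accepting process only once data has arrived,
 * waiting up to timeout_secs. Call on the listening socket.
 * Returns 0 on success, -1 on failure. */
int set_defer_accept(int listen_fd, int timeout_secs)
{
    return setsockopt(listen_fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                      &timeout_secs, sizeof(timeout_secs));
}

/* Return the deferral time the kernel reports, in seconds, or -1 on error. */
int get_defer_accept(int listen_fd)
{
    int secs = 0;
    socklen_t len = sizeof(secs);

    if (getsockopt(listen_fd, IPPROTO_TCP, TCP_DEFER_ACCEPT, &secs, &len) < 0)
        return -1;
    return secs;
}
```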
Many operating systems have an equivalent of this option. On FreeBSD, for example, the same behaviour can be obtained with the following code:
/* irrelevant code omitted for clarity */
struct accept_filter_arg af = { "dataready", "" };
setsockopt(s, SOL_SOCKET, SO_ACCEPTFILTER, &af, sizeof(af));
On FreeBSD this feature is called an "accept filter" and has several uses. In almost all cases, however, the effect is the same as TCP_DEFER_ACCEPT: the server does not wait for the final ACK packet, only for a packet carrying a data payload. For more on this option and its significance for high-performance web servers, see the Apache documentation.
For HTTP client/server interaction, it may be necessary to change the behaviour of the client as well. Why does the client send this "useless" ACK packet? Because the TCP stack has no way of knowing the state of the ACK packet. If FTP were used instead of HTTP, the client would not send any data until it had received the FTP server's prompt. In that case a delayed ACK would delay the client/server exchange. To decide whether the ACK is necessary, the client must know the application protocol and its current state, which is why the client's behaviour has to be modified.
For Linux clients there is another option, also called TCP_DEFER_ACCEPT. Sockets come in two kinds, listening sockets and connected sockets, and each kind has its own set of TCP options; this is why two options that are often used together can share a name. When the option is set on a connected socket, the client no longer sends an ACK packet after receiving the SYN/ACK; instead it waits for the user program's next request to send data, so the number of packets sent to the server is reduced accordingly.
TCP_QUICKACK
Another way to avoid delays caused by sending useless packets is the TCP_QUICKACK option. Unlike TCP_DEFER_ACCEPT, it can be used not only to manage connection establishment but also during normal data transfer, and it can be set on either side of a client/server connection. If you know that data is about to be sent, delaying the ACK is useful: the ACK flag can then be set on a packet that carries data, which keeps network load to a minimum. When the sender is sure that data will be sent immediately (in multiple packets), TCP_QUICKACK can be set to 0. For sockets in the "connected" state the default value of this option is 1, and the kernel resets it to 1 immediately after the first use (it is a one-shot option).
In other cases it is useful to send the ACK packet right away. The ACK confirms receipt of a data block, and no delay is introduced when the next block is processed. This mode of transfer is typical of interactive sessions, where the timing of user input cannot be predicted. On Linux this is the default socket behaviour.
In the scenario above, the client sends an HTTP request to the server; the request is known in advance to be short, so it should be sent as soon as the connection is established. This is typical of how HTTP works. Since there is no need to send a pure ACK packet, TCP_QUICKACK can be set to 0 to improve performance. On the server side, both options need to be set only once, on the listening socket. All other sockets, that is, those created indirectly by accept(), inherit all options of the original socket.
Combining TCP_CORK, TCP_DEFER_ACCEPT and TCP_QUICKACK reduces the number of packets in each HTTP exchange to the minimum acceptable level (as dictated by TCP protocol requirements and security considerations). The result is faster data transfer and request processing, and minimal round-trip latency between client and server.
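A minimal sketch of toggling the option, assuming Linux (TCP_QUICKACK is Linux-specific); set_quickack is a hypothetical helper name. Because the setting is not permanent, code that relies on it generally has to set it again after later socket operations.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* on = 0 lets ACKs be delayed so they can ride on outgoing data;
 * on = 1 asks for ACKs to be sent immediately.
 * Returns 0 on success, -1 on failure. */
int set_quickack(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &on, sizeof(on));
}
```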
II. Examples
1. To reuse a socket address after closesocket() (the socket is normally not closed immediately but passes through the TIME_WAIT state):
BOOL bReuseaddr = TRUE;
setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (const char *)&bReuseaddr, sizeof(BOOL));
2. To force-close a connected socket so that it does not pass through the TIME_WAIT state after closesocket() is called:
BOOL bDontLinger = FALSE;
setsockopt(s, SOL_SOCKET, SO_DONTLINGER, (const char *)&bDontLinger, sizeof(BOOL));
3. During send() and recv(), network conditions and other factors can prevent data from being sent or received as expected, so set send and receive time limits:
int nNetTimeout = 1000; // 1 second
// send time limit
setsockopt(s, SOL_SOCKET, SO_SNDTIMEO, (char *)&nNetTimeout, sizeof(int));
// receive time limit
setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, (char *)&nNetTimeout, sizeof(int));
4. The value returned by send() is the number of bytes actually sent (synchronous) or the number of bytes copied into the socket buffer (asynchronous). By default the system sends and receives 8688 bytes (about 8.5 KB) at a time. When transferring large amounts of data, you can enlarge the socket buffers to avoid looping repeatedly over send() and recv():
// receive buffer
int nRecvBuf = 32 * 1024; // set to 32 KB
setsockopt(s, SOL_SOCKET, SO_RCVBUF, (const char *)&nRecvBuf, sizeof(int));
// send buffer
int nSendBuf = 32 * 1024; // set to 32 KB
setsockopt(s, SOL_SOCKET, SO_SNDBUF, (const char *)&nSendBuf, sizeof(int));
5. To keep the copy from the system buffer to the socket buffer from affecting program performance when sending data:
int nZero = 0;
setsockopt(s, SOL_SOCKET, SO_SNDBUF, (char *)&nZero, sizeof(nZero));
6. To do the same for recv() (by default the contents of the socket buffer are copied to the system buffer):
int nZero = 0;
setsockopt(s, SOL_SOCKET, SO_RCVBUF, (char *)&nZero, sizeof(int));
7. When sending UDP datagrams, to give the data sent through the socket broadcast capability:
BOOL bBroadcast = TRUE;
setsockopt(s, SOL_SOCKET, SO_BROADCAST, (const char *)&bBroadcast, sizeof(BOOL));
8. When a client connects to the server and a non-blocking socket is in the middle of connect(), the connection can be delayed until accept() is called (this setting matters only for non-blocking operation and has little effect on blocking calls):
BOOL bConditionalAccept = TRUE;
setsockopt(s, SOL_SOCKET, SO_CONDITIONAL_ACCEPT, (const char *)&bConditionalAccept, sizeof(BOOL));
9. If closesocket() is called while data is still being sent (send() has not completed and data remains unsent), the usual measure is a "graceful close" with shutdown(s, SD_BOTH), but data is still lost. How can the program be set up to meet the application's requirement (that is, close the socket only after the unsent data has been sent)?
/* shown for reference; the Winsock headers already define this structure */
struct linger {
    u_short l_onoff;
    u_short l_linger;
};
struct linger m_sLinger;
m_sLinger.l_onoff = 1; // linger when closesocket() is called while data remains unsent
// with m_sLinger.l_onoff = 0 the effect is the same as in example 2
m_sLinger.l_linger = 5; // linger for at most 5 seconds
setsockopt(s, SOL_SOCKET, SO_LINGER, (const char *)&m_sLinger, sizeof(m_sLinger));