From: http://lydnkj.bokee.com/viewdiary.10417372.html
getsockopt and setsockopt
Get a socket option:
Code:
int getsockopt(int sockfd, int level, int optname, void *optval, socklen_t *optlen);
Set a socket option:
Code:
int setsockopt(int sockfd, int level, int optname, const void *optval, socklen_t optlen);
sockfd (socket): descriptor of an open socket.
level: the protocol level whose code interprets the option.
SOL_SOCKET: generic socket level
IPPROTO_IP: IPv4
IPPROTO_IPV6: IPv6
IPPROTO_TCP: TCP
optname (option name): name of the option.
optval (option value): pointer to a variable holding the option value: an integer, a socket structure, or another structure type such as linger{} or timeval{}.
optlen (option length): size of optval. getsockopt takes a pointer to it (the size is updated on return); setsockopt takes the size by value.
Return value: 0 on success, -1 on error with errno set. For a binary (flag) option, the value stored in optval indicates whether the feature is enabled (non-zero) or disabled (zero).
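The calling convention described above can be sketched as follows. This is a minimal sketch for a POSIX system; the helper names set_int_option, get_int_option and demo_reuseaddr are hypothetical and are not part of the sockets API.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Set an int-valued option. Returns 0 on success, -1 on error. */
static int set_int_option(int fd, int level, int optname, int value)
{
    return setsockopt(fd, level, optname, &value, sizeof(value));
}

/* Fetch an int-valued option into *value. Returns 0 on success, -1 on error. */
static int get_int_option(int fd, int level, int optname, int *value)
{
    socklen_t len = sizeof(*value);   /* in: buffer size, out: bytes written */
    return getsockopt(fd, level, optname, value, &len);
}

/* Hypothetical demo: enable the SO_REUSEADDR flag and read it back.
 * Returns 1 if the flag reads back as enabled, 0 if disabled, -1 on error. */
static int demo_reuseaddr(void)
{
    int flag = 0;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (set_int_option(fd, SOL_SOCKET, SO_REUSEADDR, 1) < 0 ||
        get_int_option(fd, SOL_SOCKET, SO_REUSEADDR, &flag) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return flag != 0;   /* flag options read back as zero or non-zero */
}
```

Note the asymmetry: setsockopt receives the length by value, while getsockopt receives a pointer so that the kernel can report how many bytes it stored.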
The options are described below:
Option name        Brief description        Data type
Notes
========================================================== ======================================
SOL_SOCKET
------------------------------------------------------------------------
SO_BROADCAST        allow sending broadcast datagrams        int
Applies to UDP sockets: it lets a UDP socket send broadcast messages onto the network.
SO_DEBUG        enable debugging        int
SO_DONTROUTE
SO_ERROR        get the pending socket error        int
SO_KEEPALIVE
Detects whether the peer host has crashed, so that a server does not block forever on input from a TCP connection. With this option set, if no data is exchanged in either direction on the socket for 2 hours, TCP automatically sends a keepalive probe to the peer. The probe is a TCP segment that the peer must answer, which leads to one of three outcomes:
The peer responds normally with the expected ACK. Everything is fine, and TCP sends another probe after a further 2 hours of inactivity.
The peer has crashed and rebooted, and responds with an RST. The socket's pending error is set to ECONNRESET and the socket is closed.
The peer does not respond. Berkeley-derived TCPs send 8 additional probes, 75 seconds apart, trying to elicit a response. If nothing arrives within 11 minutes and 15 seconds of the first probe (8 more probes at 75-second intervals plus a final 75-second wait, 675 seconds in all), TCP gives up: the pending error is set to ETIMEDOUT and the socket is closed. If an ICMP error such as "host unreachable" comes back instead, the peer has not crashed but cannot be reached, and the pending error is set to EHOSTUNREACH.
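As a sketch of how the timings above map onto code: SO_KEEPALIVE itself is portable, while the per-socket tunables TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-style names that not every system defines, so they are guarded with #if. The helper name enable_keepalive is hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable keepalive probing on a TCP socket.
 * idle_s:  seconds of silence before the first probe
 * intvl_s: seconds between unanswered probes
 * count:   number of unanswered probes before giving up
 * Returns 0 on success, -1 on error. */
static int enable_keepalive(int fd, int idle_s, int intvl_s, int count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;

#if defined(TCP_KEEPIDLE) && defined(TCP_KEEPINTVL) && defined(TCP_KEEPCNT)
    /* Assumption: Linux-style option names are available. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
        return -1;
#else
    /* No per-socket tunables here: the system-wide defaults apply. */
    (void)idle_s;
    (void)intvl_s;
    (void)count;
#endif
    return 0;
}
```

Calling enable_keepalive(fd, 7200, 75, 9) would reproduce the schedule described above: a first probe after 2 hours, then up to 9 unanswered probes 75 seconds apart.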
SO_DONTLINGER        if true, the SO_LINGER option is disabled
SO_LINGER        linger on close        struct linger
These two options affect the behavior of close.
Option        Interval        Type of close        Wait for close?
SO_DONTLINGER        don't care        graceful        no
SO_LINGER        zero        hard        no
SO_LINGER        non-zero        graceful        yes
If SO_LINGER is set (that is, the l_onoff field of the linger structure is non-zero; see Sections 2.4, 4.1.7 and 4.1.21) with a zero timeout, closesocket() returns immediately without blocking, whether or not queued data remains unsent or unacknowledged. This is called a "hard" or "abortive" close, because the socket's virtual circuit is reset at once and any unsent data is lost. A recv() call on the remote side fails with WSAECONNRESET.
If SO_LINGER is set with a non-zero timeout, closesocket() blocks until all data has been sent or the timeout expires. This is called a "graceful" close. Note that if the socket is non-blocking and SO_LINGER has a non-zero timeout, closesocket() returns with a WSAEWOULDBLOCK error.
If SO_DONTLINGER is set on a stream socket (that is, the l_onoff field of the linger structure is zero; see Sections 2.4, 4.1.7 and 4.1.21), closesocket() returns immediately. Where possible, however, queued data is still sent before the socket is closed. Note that in this case the Windows Sockets implementation may hold on to the socket and other resources for an indeterminate time, which can affect applications that expect to use all available sockets.
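The three close modes above can be selected with a single helper. This is a sketch for a POSIX system, where close() plays the role of closesocket() and the error names lack the WSA prefix; Winsock behavior may differ in detail. The helper names set_linger and get_linger are hypothetical.

```c
#include <sys/socket.h>

/* Select the close behavior of a socket.
 * onoff == 0               : default close, queued data sent in background
 * onoff != 0, seconds == 0 : hard (abortive) close, unsent data discarded
 * onoff != 0, seconds  > 0 : graceful close, blocks up to `seconds`
 * Returns 0 on success, -1 on error. */
static int set_linger(int fd, int onoff, int seconds)
{
    struct linger lg;

    lg.l_onoff = onoff;
    lg.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* Read the current linger setting into *lg. Returns 0 or -1. */
static int get_linger(int fd, struct linger *lg)
{
    socklen_t len = sizeof(*lg);

    return getsockopt(fd, SOL_SOCKET, SO_LINGER, lg, &len);
}
```

For example, set_linger(fd, 1, 0) requests the hard close, and set_linger(fd, 1, 5) requests a graceful close that waits up to 5 seconds.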
SO_OOBINLINE        receive out-of-band data inline in the normal data stream        int
SO_RCVBUF        receive buffer size        int
Sets the reserved size of the receive buffer.
It is unrelated to SO_MAX_MSG_SIZE and to the TCP sliding window. Consider this option if packets are sent frequently.
SO_SNDBUF        send buffer size        int
Sets the reserved size of the send buffer.
It is unrelated to SO_MAX_MSG_SIZE and to the TCP sliding window. Consider this option if packets are sent frequently.
Every socket has a send buffer and a receive buffer. The receive buffer is used by TCP and UDP to hold received data until the application reads it.
TCP: the free space in the receive buffer determines the window that TCP advertises to the other end. A TCP receive buffer cannot overflow, because the peer is not allowed to send more data than the advertised window. This is TCP's flow control. If the peer ignores the advertised window and sends more data than it allows, the receiving TCP discards the excess.
UDP: when an arriving datagram does not fit in the socket receive buffer, the datagram is discarded. UDP has no flow control, so a fast sender can easily overwhelm a slow receiver and cause the receiver's UDP to drop datagrams.
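A sketch of resizing either buffer and checking what the kernel actually granted. The helper name resize_buffer is hypothetical. The value read back need not equal the request: systems may round or clamp it, and Linux reports double the requested size to account for bookkeeping overhead.

```c
#include <sys/socket.h>

/* Request a new buffer size and return the size the kernel reports.
 * which: SO_RCVBUF or SO_SNDBUF
 * Returns the effective size in bytes, or -1 on error. */
static int resize_buffer(int fd, int which, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, which, &bytes, sizeof(bytes)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, which, &actual, &len) < 0)
        return -1;
    return actual;   /* may differ from `bytes` */
}
```

For a TCP socket the receive buffer is usually resized before connect() or listen(), since the window scale option is negotiated during connection setup.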
SO_RCVLOWAT        receive buffer low-water mark        int
SO_SNDLOWAT        send buffer low-water mark        int
Every socket has a receive low-water mark and a send low-water mark, both used by the select function. The receive low-water mark is the amount of data that must be in the socket receive buffer for select to return "readable"; it defaults to 1 for TCP and UDP sockets. The send low-water mark is the amount of free space that must be available in the socket send buffer for select to return "writable"; for TCP sockets it normally defaults to 2048. UDP also uses the send low-water mark, but because the number of bytes of free space in a UDP send buffer never changes, a UDP socket is always writable as long as its send buffer size is greater than its low-water mark. UDP has no send buffer as such, only a send buffer size.
SO_RCVTIMEO        receive timeout        struct timeval
SO_SNDTIMEO        send timeout        struct timeval
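The low-water mark takes an int, whereas the timeouts take a struct timeval. A minimal sketch of both, with hypothetical helper names set_recv_lowat and set_recv_timeout; not every system lets every one of these options be changed, so the return value should always be checked.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Require at least `bytes` of queued data before select() reports the
 * socket readable. Returns 0 on success, -1 on error. */
static int set_recv_lowat(int fd, int bytes)
{
    return setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &bytes, sizeof(bytes));
}

/* Make blocking receive calls give up after sec + usec/1e6 seconds.
 * A zero timeval means "no timeout". Returns 0 on success, -1 on error. */
static int set_recv_timeout(int fd, long sec, long usec)
{
    struct timeval tv;

    tv.tv_sec = sec;
    tv.tv_usec = usec;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}
```

SO_SNDTIMEO is set the same way as SO_RCVTIMEO, with the option name swapped.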
SO_REUSEADDR        allow reuse of the local address and port        int
Allows bind to an address (or port number) that is already in use. See the bind man page for details.
SO_EXCLUSIVEADDRUSE
Uses a port in exclusive mode, so that it cannot be shared with other programs that use SO_REUSEADDR.
When several sockets are bound to the same port, packets are delivered to the socket whose binding is most specific, and no privilege check is made. A low-privileged user can therefore rebind a port held by a high-privileged one, such as a port a service listens on. This is a major security risk.
If you do not want another program to be able to listen on your program's port, use this option.
SO_TYPE        get the socket type        int
SO_BSDCOMPAT        BSD compatibility        int
========================================================== ========================================
IPPROTO_IP
--------------------------------------------------------------------------
IP_HDRINCL        the data includes the IP header        int
Because the application supplies the IP header itself, attackers often use this option to hide their real IP address.
IP_OPTIONS        IP header options        int
IP_TOS        type of service
IP_TTL        time to live        int
The following IPv4 options are used for multicast:
IPv4 option        Data type        Description
IP_ADD_MEMBERSHIP        struct ip_mreq        join a multicast group
IP_DROP_MEMBERSHIP        struct ip_mreq        leave a multicast group
IP_MULTICAST_IF        struct ip_mreq        specify the interface for outgoing multicast packets
IP_MULTICAST_TTL        u_char        specify the TTL of outgoing multicast packets
IP_MULTICAST_LOOP        u_char        enable or disable multicast loopback
The ip_mreq structure is defined in the header file:
Code:
struct ip_mreq {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_interface; /* local IP address of interface */
};
To join a multicast group, pass the IP_ADD_MEMBERSHIP option to setsockopt() on the socket. The option value is an ip_mreq structure: its first field, imr_multiaddr, gives the address of the multicast group, and its second field, imr_interface, gives the IPv4 address of the local interface.
IP_DROP_MEMBERSHIP
Leaves a multicast group. The ip_mreq structure is used in the same way as above.
IP_MULTICAST_IF
Changes the network interface used for multicast; the new interface is given in the ip_mreq structure.
IP_MULTICAST_TTL
Sets the TTL (time to live) of outgoing multicast packets. The default is 1, which means the packets can travel only within the local subnet.
IP_MULTICAST_LOOP
A member of a multicast group also receives the packets it sends to that group. This option selects whether that loopback is enabled.
========================================================== ========================================
IPPROTO_TCP
--------------------------------------------------------------------------
TCP_MAXSEG        maximum TCP segment size        int
Gets or sets the maximum segment size (MSS) of a TCP connection. The value returned is the maximum amount of data that our TCP will send to the other end in one segment. It is often the MSS that the peer announced in its SYN, unless our TCP chooses a smaller value than the peer's announced MSS. If the value is fetched before the socket is connected, it is the default that is used when no MSS option is received from the other end. A value smaller than the one returned may actually be used on the connection: if the timestamp option is in use, for example, it takes up 12 bytes of TCP option space in every segment. The maximum amount of data per segment can also change during the life of the connection if TCP supports path MTU discovery: if the route to the peer changes, the value can go up or down.
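Reading the MSS is an ordinary int-valued getsockopt call at the IPPROTO_TCP level. A minimal sketch, with the hypothetical helper name get_mss:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Return the current MSS of a TCP socket in bytes, or -1 on error.
 * Before the socket is connected this is the system's default value. */
static int get_mss(int fd)
{
    int mss = 0;
    socklen_t len = sizeof(mss);

    if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) < 0)
        return -1;
    return mss;
}
```

Calling it once before connect() and once afterwards shows the difference between the default and the value negotiated with the peer.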
TCP_NODELAY        disable the Nagle algorithm        int
TCP_KEEPALIVE        idle time before keepalive probes start        int
Specifies the idle time of the connection, in seconds, before TCP starts sending keepalive probes. The default value must be at least 7200 seconds, that is, 2 hours. This option takes effect only when the SO_KEEPALIVE socket option is enabled. (Linux calls the corresponding option TCP_KEEPIDLE.)
TCP_NODELAY and TCP_CORK
Both options play an important role in network connections. Many Unix systems implement the TCP_NODELAY option. TCP_CORK, however, is specific to Linux and relatively new; it first appeared in kernel version 2.4. Other Unix versions offer similar functionality: notably, the TCP_NOPUSH option on BSD-derived systems is in effect a partial implementation of TCP_CORK.
TCP_NODELAY and TCP_CORK both control how packets are "Nagled", that is, whether the Nagle algorithm is used to assemble small packets into larger frames. The algorithm is named after its inventor, John Nagle, who first used it in 1984 to solve network congestion problems at Ford Aerospace (for details, see IETF RFC 896). The problem he solved is the small-packet problem, closely related to what is called silly window syndrome: a typical terminal application sends a packet for every keystroke, and such a packet typically carries 1 byte of data behind a 40-byte header, which is 4000% overhead and can easily congest the network. Nagling became a standard and was quickly implemented across the Internet. It is now the default, but in some situations it is still necessary to turn it off.
Now suppose an application issues a request to send a small piece of data. We can either send the data immediately or wait for more data to accumulate and send it all later. Interactive and client/server applications benefit greatly from sending immediately. For example, when we send a short request and wait for a large response, the relative overhead is low compared with the total amount of data transferred, and the response arrives sooner if the request goes out at once. Setting the TCP_NODELAY option on the socket disables the Nagle algorithm and achieves this.
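Disabling the Nagle algorithm as just described is a single setsockopt call at the IPPROTO_TCP level. A minimal sketch, with the hypothetical helper name set_nodelay:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn the Nagle algorithm off (on != 0) or back on (on == 0).
 * Returns 0 on success, -1 on error. */
static int set_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```

With the option set, each small write is handed to the network without waiting for outstanding data to be acknowledged, which suits short request/response exchanges.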
In the other case, we want to wait until the data fills a maximum-size packet and then send it all over the network in one go. This way of transmitting favors bulk data transfer, and the file server is the typical application. Applying the Nagle algorithm causes problems in this case. If you are sending bulk data, however, you can set the TCP_CORK option to disable Nagling in a way that is exactly the opposite of TCP_NODELAY (TCP_CORK and TCP_NODELAY are mutually exclusive). Let's take a closer look at how it works.
Suppose the application uses the sendfile() function to transfer bulk data. Application protocols usually require some information to be sent first to describe the data, in other words a header. The header is typically small, and TCP_NODELAY is set on the socket, so the packet carrying the header is transmitted immediately. In some cases (depending on an internal packet counter), the peer must acknowledge this packet once it has been received. The bulk data transfer is then held up, and unnecessary network traffic is exchanged.
If we set the TCP_CORK option on the socket instead (rather like putting a cork in a pipe), the packet carrying the header is filled up with the bulk data, and all of the data is sent automatically in packets of the appropriate size. When the transfer is complete, it is best to clear the TCP_CORK option, "pulling the cork" on the connection, so that any partial frame still queued can go out. This is just as important as "corking" the connection in the first place.
In short, if you know you will be sending several pieces of data together (such as an HTTP response header and body), we recommend setting the TCP_CORK option so that no delay is introduced between them. It can greatly improve the performance of WWW, FTP and file servers, and it simplifies your work as well. Sample code follows:
int fd, on = 1;
...
/* Create the socket and so on; omitted for brevity */
...
setsockopt(fd, SOL_TCP, TCP_CORK, &on, sizeof(on)); /* insert the cork */
write(fd, ...);
dprintf(fd, ...); /* formatted output to a descriptor; fprintf() needs a FILE * */
sendfile(fd, ...);
write(fd, ...);
sendfile(fd, ...);
...
on = 0;
setsockopt(fd, SOL_TCP, TCP_CORK, &on, sizeof(on)); /* pull the cork */
Unfortunately, many widely used programs do not take these issues into account. For example, Sendmail, written by Eric Allman, sets no options on its sockets.
Apache httpd is the most popular web server on the Internet. It sets the TCP_NODELAY option on all of its sockets, and its performance satisfies most users. Why? The answer lies in differences between implementations. TCP/IP stacks derived from BSD (notably FreeBSD) behave differently in this situation. When a large number of small data blocks are submitted for transmission in TCP_NODELAY mode, each with its own write() call, a lot of information is sent. However, because the counter responsible for requesting delivery acknowledgment is byte-oriented there, rather than packet-oriented as on Linux, delay is much less likely to be introduced. The outcome depends only on the total size of the data. Linux requires an acknowledgment after the first packet arrives, whereas FreeBSD waits for several hundred packets before doing so.
On Linux, the effect of TCP_NODELAY is very different from what developers used to the BSD TCP/IP stack expect, and Apache performs worse on Linux as a result. Other applications that use TCP_NODELAY heavily on Linux have the same problem.
TCP_DEFER_ACCEPT
The first option we will consider is TCP_DEFER_ACCEPT (this is its name on Linux; some other operating systems have the same option under a different name). To understand the idea behind TCP_DEFER_ACCEPT, we need a general picture of a typical HTTP client/server exchange. Recall how TCP establishes a connection to the destination before transferring data. On the network, the unit of information exchanged is called an IP packet (or IP datagram). A packet always has a header containing service information, which is used for internal protocol processing, and it may also carry a data payload. A typical example of service information is the set of so-called flags, which carry special meaning for the TCP/IP stack, such as acknowledging that a packet was received successfully. It is usually perfectly possible to carry a payload in a flagged packet, but sometimes internal logic forces the TCP/IP stack to send an IP packet that consists of a header only. Such packets often introduce annoying network delays and increase system load, which lowers overall network performance.
Now suppose the server has created a socket and is waiting for a connection. TCP/IP connection setup is called the "three-way handshake". First, the client sends a TCP packet with the SYN flag set and no data payload (a SYN packet). The server replies with a packet that has the SYN and ACK flags set (a SYN/ACK packet) to acknowledge the packet it has just received. The client then sends an ACK packet to acknowledge the second packet, which completes connection setup. On receiving the client's ACK packet, the server wakes up a receiving process that is waiting for data to arrive. Once the three-way handshake is complete, the client starts sending "useful" data to the server. An HTTP request is usually small and fits entirely in one packet. Even so, at least four packets travel in both directions in the scenario above, which adds latency. Note also that the receiver is already waiting before any "useful" data has been sent.
To lessen the impact of these problems, Linux (and some other operating systems) includes the TCP_DEFER_ACCEPT option in its TCP implementation. The option is set on the server's listening socket. With it, the kernel does not wake the listening process until the final ACK packet and the first packet carrying real data have arrived. After sending the SYN/ACK packet, the server waits for the client to send an IP packet that contains data. Now only three packets need to cross the network, and the connection setup delay is reduced significantly, especially for HTTP communication.
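A sketch of setting the option on a listening socket, assuming Linux, where TCP_DEFER_ACCEPT takes an int number of seconds to wait for data. The helper name defer_accept is hypothetical, and the code is guarded so that it still compiles where the option does not exist.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Ask the kernel to wake the accepting process only once data has arrived,
 * waiting up to about `seconds` for it. Set this on the listening socket.
 * Returns 0 on success, -1 on error. */
static int defer_accept(int listen_fd, int seconds)
{
#ifdef TCP_DEFER_ACCEPT
    return setsockopt(listen_fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                      &seconds, sizeof(seconds));
#else
    /* Assumption: treat the missing option as "protocol option not available". */
    (void)listen_fd;
    (void)seconds;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```

The call normally sits between socket() and listen() in the server's startup code. The value read back with getsockopt may differ from the one set, because the kernel converts the timeout to a retransmission count internally.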