setsockopt() usage (parameter details)


Function: sets the value of an option on a socket.

int setsockopt(
SOCKET s,
int level,
int optname,
const char *optval,
int optlen
);

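s (socket): a descriptor identifying an open socket
level: the level at which the option is defined
SOL_SOCKET: generic socket level
IPPROTO_IP: IPv4 level
IPPROTO_IPV6: IPv6 level
IPPROTO_TCP: TCP level
optname (option name): the option to set
optval (option value): a pointer to a buffer holding the option value; depending on the option this is an integer, a socket address structure, or another structure such as struct linger or struct timeval
optlen (option length): the size of the optval buffer in bytes

Return value: 0 on success; on failure, SOCKET_ERROR on Winsock (-1 with errno set on POSIX systems). In the option lists below, the last item of each entry is the data type of the option value; an int option is usually a Boolean flag that turns a feature on or off.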
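
As a minimal sketch of the calling convention described above, the following helpers set and read back a Boolean socket-level option. It assumes a POSIX system (int descriptors, socklen_t) rather than the Winsock prototype shown above, and the helper names set_bool_option and get_bool_option are made up for this illustration.

```c
#include <sys/socket.h>

/* Turn a Boolean SOL_SOCKET option on or off.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int set_bool_option(int fd, int optname, int enabled)
{
    int value = enabled ? 1 : 0;
    return setsockopt(fd, SOL_SOCKET, optname, &value, sizeof(value));
}

/* Read a Boolean SOL_SOCKET option back.
 * Returns 1 if set, 0 if clear, -1 on error. */
int get_bool_option(int fd, int optname)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, SOL_SOCKET, optname, &value, &len) < 0)
        return -1;
    return value != 0;
}
```

A caller would typically write set_bool_option(fd, SO_REUSEADDR, 1) before bind() and check the result against -1.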

========================================================================
SOL_SOCKET
------------------------------------------------------------------------
SO_BROADCAST: allow sending broadcast data (int)
Applies to UDP sockets. It allows a UDP socket to send broadcast messages to the network.

SO_DEBUG: enable debugging (int)

SO_DONTROUTE: bypass routing table lookup (int)

SO_ERROR: get the pending socket error (int)

SO_KEEPALIVE: keep connections alive (int)
Detects whether the peer host has crashed, so that a server does not block forever waiting for input on a dead TCP connection. When this option is set and no data has been exchanged in either direction on the socket for 2 hours, TCP automatically sends a keepalive probe to the peer. This is a TCP segment to which the peer must respond. One of three things then happens:
1. The peer responds with the expected ACK. Everything is fine, and TCP sends another probe after a further 2 hours of inactivity.
2. The peer has crashed and rebooted: it responds with an RST. The socket's pending error is set to ECONNRESET and the socket is closed.
3. The peer does not respond: Berkeley-derived TCP sends 8 more probes, 75 seconds apart, trying to elicit a response, and gives up if there is still no response 11 minutes and 15 seconds after the first probe. The socket's pending error is set to ETIMEDOUT and the socket is closed. If an ICMP error such as "host unreachable" is received instead, the peer has not necessarily crashed but cannot be reached, and the pending error is set to EHOSTUNREACH.
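
The 2-hour idle time described above is usually too long for application-level failure detection. The sketch below enables SO_KEEPALIVE and, where the platform provides them, shortens the timers. It assumes a POSIX system; TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux names and are guarded with #if, and the function name enable_keepalive is hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable keepalive probes on a TCP socket and, where supported, tune them:
 * idle_secs     - idle time before the first probe
 * interval_secs - time between unanswered probes
 * probe_count   - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on error. */
int enable_keepalive(int fd, int idle_secs, int interval_secs, int probe_count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;

#if defined(TCP_KEEPIDLE) && defined(TCP_KEEPINTVL) && defined(TCP_KEEPCNT)
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_secs, sizeof(interval_secs)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probe_count, sizeof(probe_count)) < 0)
        return -1;
#else
    /* Platform without per-socket tuning: system defaults stay in effect. */
    (void)idle_secs;
    (void)interval_secs;
    (void)probe_count;
#endif
    return 0;
}
```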

SO_DONTLINGER: if true, the SO_LINGER option is disabled (BOOL)
SO_LINGER: linger on close if unsent data is present (struct linger)
These two options control the behavior of closesocket():
Option          Interval      Type of close   Wait for close?
SO_DONTLINGER   Don't care    Graceful        No
SO_LINGER       Zero          Hard            No
SO_LINGER       Nonzero       Graceful        Yes
If SO_LINGER is set (that is, the l_onoff field of the linger structure is nonzero; see sections 2.4, 4.1.7, and 4.1.21) with a zero timeout interval, closesocket() does not block and executes immediately, whether or not queued data has been sent or acknowledged. This is called a "hard" or "abortive" close, because the socket's virtual circuit is reset immediately and any unsent data is lost. A recv() call on the remote side fails with WSAECONNRESET.
If SO_LINGER is set with a nonzero timeout interval, closesocket() blocks until the remaining data has been sent or the timeout expires. This is called a "graceful" close. Note that if the socket is non-blocking and SO_LINGER is set to a nonzero timeout, closesocket() returns with a WSAEWOULDBLOCK error.
If SO_DONTLINGER is set on a stream socket (that is, the l_onoff field of the linger structure is zero; see sections 2.4, 4.1.7, and 4.1.21), closesocket() returns immediately. However, queued data is sent, if possible, before the socket is closed. Note that in this case the Windows Sockets implementation retains the socket and other resources for an indeterminate period, which may affect applications that expect to use all available sockets.
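
The linger settings in the table above can be sketched as follows. The text describes Winsock (closesocket(), SO_DONTLINGER); this sketch assumes a POSIX system, where only SO_LINGER exists and clearing l_onoff plays the role of SO_DONTLINGER. The helper name set_linger is made up for the example.

```c
#include <sys/socket.h>

/* Configure close behavior:
 *   enabled = 0                   -> close returns at once, data sent in background
 *   enabled = 1, timeout_secs = 0 -> hard (abortive) close, unsent data discarded
 *   enabled = 1, timeout_secs > 0 -> close blocks up to timeout_secs for unsent data
 * Returns 0 on success, -1 on error. */
int set_linger(int fd, int enabled, int timeout_secs)
{
    struct linger lg;

    lg.l_onoff = enabled ? 1 : 0;
    lg.l_linger = timeout_secs;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```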

SO_OOBINLINE: leave out-of-band data inline, so that it is received in the normal data stream (int)

SO_RCVBUF: receive buffer size (int)
Sets the amount of buffer space reserved for receiving.
This is unrelated to SO_MAX_MSG_SIZE and to the TCP sliding window. Consider using this option if the data you typically receive is large and frequent.

SO_SNDBUF: send buffer size (int)
Sets the amount of buffer space reserved for sending.
This is unrelated to SO_MAX_MSG_SIZE and to the TCP sliding window. Consider using this option if the data you typically send is large and frequent.
Every socket has a send buffer and a receive buffer. The receive buffer is used by TCP and UDP to hold received data until the application reads it.
TCP: TCP advertises its window size to the other end. The TCP receive buffer cannot overflow, because the peer is not allowed to send more data than the advertised window. This is TCP flow control; if the peer ignores the window and sends more data than it allows, the receiving TCP discards the excess.
UDP: when a received datagram does not fit in the socket receive buffer, it is discarded. UDP has no flow control, so a fast sender can easily overwhelm a slow receiver and cause the receiver's UDP to drop datagrams.
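
Since the kernel may round, double, or cap a requested buffer size, it is worth reading the value back after setting it. The following is a small sketch for a POSIX system; the helper name set_recv_buffer is hypothetical, and the exact adjustment the kernel applies is platform-specific.

```c
#include <sys/socket.h>

/* Request a receive buffer of requested_bytes and report what the kernel
 * actually granted. Returns the effective size, or -1 on error. */
int set_recv_buffer(int fd, int requested_bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested_bytes, sizeof(requested_bytes)) < 0)
        return -1;
    /* The granted size can differ from the request, so always read it back. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        return -1;
    return actual;
}
```

For a TCP socket the receive buffer is best set before connect() or listen(), because the window scale is negotiated during connection setup.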

SO_RCVLOWAT: receive buffer low-water mark (int)
SO_SNDLOWAT: send buffer low-water mark (int)
Every socket has a receive low-water mark and a send low-water mark. They are used by the select function. The receive low-water mark is the amount of data that must be in the receive buffer for select to report the socket as readable; for a TCP or UDP socket it defaults to 1. The send low-water mark is the amount of free space that must be available in the socket send buffer for select to report the socket as writable; for a TCP socket it usually defaults to 2048. The send low-water mark also applies to UDP, but because the number of bytes of free space in a UDP send buffer never changes, a UDP socket is always writable as long as its send buffer size is larger than its low-water mark. UDP has no actual send buffer, only a send buffer size.
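
A receive low-water mark can be set as in the sketch below, which assumes a POSIX system where SO_RCVLOWAT is writable (some platforms, including Linux, reject changes to SO_SNDLOWAT). The helper name set_recv_low_water is made up for this illustration.

```c
#include <sys/socket.h>

/* Ask that the socket be reported readable only once at least min_bytes
 * are queued. Returns the value now in effect, or -1 on error. */
int set_recv_low_water(int fd, int min_bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &min_bytes, sizeof(min_bytes)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &actual, &len) < 0)
        return -1;
    return actual;
}
```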

SO_RCVTIMEO: receive timeout (struct timeval)
SO_SNDTIMEO: send timeout (struct timeval)
SO_REUSEADDR: allow reuse of local addresses and ports (int)
Allows binding to an address (or port number) that is already in use; see the bind man page.
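
On POSIX systems SO_RCVTIMEO and SO_SNDTIMEO take a struct timeval, as listed above (Winsock instead takes a millisecond count). A minimal sketch, with the hypothetical helper name set_recv_timeout_ms:

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Make blocking receive calls on fd give up after ms milliseconds.
 * Returns 0 on success, -1 on error. */
int set_recv_timeout_ms(int fd, long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;            /* whole seconds */
    tv.tv_usec = (ms % 1000) * 1000;  /* remainder in microseconds */
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}
```

Once the timeout is set, a recv() that waits longer than this fails with errno set to EAGAIN or EWOULDBLOCK.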

SO_EXCLUSIVEADDRUSE
Binds a port in exclusive mode: other programs are not allowed to share the port by using SO_REUSEADDR.
When several sockets are bound to the same port, a packet is delivered to the socket with the most specific binding, and privileges play no part. A program running as a low-privilege user can therefore rebind a port on which a high-privilege process, such as a service, is already listening. This is a serious security risk.
If you do not want your program's traffic to be intercepted in this way, use this option.

SO_TYPE: get the socket type (int)
SO_BSDCOMPAT: BSD compatibility mode (int)

==========================================================================
IPPROTO_IP
--------------------------------------------------------------------------
IP_HDRINCL: the application includes the IP header in the packet data (int)
This option is used with raw sockets to build custom IP headers, and it is often abused in attacks to hide the sender's real IP address.

IP_OPTIONS: IP header options (byte buffer)
IP_TOS: type of service (int)
IP_TTL: time to live (int)

The following IPv4 options are used for multicast:
IPv4 option          Data type        Description
IP_ADD_MEMBERSHIP    struct ip_mreq   Join a multicast group
IP_DROP_MEMBERSHIP   struct ip_mreq   Leave a multicast group
IP_MULTICAST_IF      struct in_addr   Specify the interface for outgoing multicast datagrams
IP_MULTICAST_TTL     u_char           Specify the TTL of outgoing multicast datagrams
IP_MULTICAST_LOOP    u_char           Enable or disable loopback of outgoing multicast datagrams
The ip_mreq structure is defined in a system header file (<netinet/in.h> on POSIX systems):
struct ip_mreq {
    struct in_addr imr_multiaddr; /* IP multicast address of group */
    struct in_addr imr_interface; /* local IP address of interface */
};
IP_ADD_MEMBERSHIP
To join a process to a multicast group, pass this option to setsockopt() on the socket. The option value is an ip_mreq structure: its first field, imr_multiaddr, specifies the address of the multicast group, and its second field, imr_interface, specifies the IPv4 address of the local interface.
IP_DROP_MEMBERSHIP
Leaves a multicast group. The ip_mreq structure is used in the same way as above.
IP_MULTICAST_IF
Changes the network interface used for outgoing multicast datagrams; the option value identifies the new interface by its local IPv4 address.
IP_MULTICAST_TTL
Sets the TTL (time to live) of outgoing multicast datagrams. The default value is 1, which means that datagrams are delivered only within the local subnet.
IP_MULTICAST_LOOP
A member of a multicast group also receives the datagrams that it sends to the group itself. This option enables or disables that loopback.

Matchless reply at: 2003-05-08 21:21:52

IPPROTO_TCP
--------------------------------------------------------------------------
TCP_MAXSEG: maximum TCP segment size (int)
Gets or sets the maximum segment size (MSS) of a TCP connection. The value returned is the maximum amount of data that our TCP will send to the other end in one segment. It is often the MSS that the peer announced in its SYN segment, unless our TCP chooses a value smaller than the one announced. If the value is fetched before the socket is connected, it is the default that is used when no MSS option is received from the other end. A value smaller than the one returned may actually be used on the connection because, for example, the timestamp option consumes 12 bytes of TCP option space in every segment. The maximum amount of data that our TCP sends per segment can also change during the life of the connection if TCP supports path MTU discovery: when the path to the peer changes, the value can go up or down.
TCP_NODELAY: do not use the Nagle algorithm (int)

TCP_KEEPALIVE (TCP_KEEPIDLE on Linux): keepalive idle time (int)
Specifies the connection idle time, in seconds, before TCP starts sending keepalive probes. The default value must be at least 7,200 seconds, that is, 2 hours. This option takes effect only if the SO_KEEPALIVE socket option is turned on.

TCP_NODELAY and TCP_CORK
Both of these options play an important role in the behavior of a network connection. Many UNIX systems implement TCP_NODELAY, but TCP_CORK is specific to Linux and relatively new; it first appeared in kernel version 2.4. Other UNIX versions have similar options; notably, the TCP_NOPUSH option on BSD-derived systems implements part of what TCP_CORK does.
TCP_NODELAY and TCP_CORK both control the "Nagling" of packets, that is, the use of the Nagle algorithm to assemble small packets into larger frames. The algorithm is named after its inventor, John Nagle, who first used it in 1984 to fix network congestion problems at Ford Aerospace (see IETF RFC 896 for details). The problem he solved is the so-called silly window syndrome: a typical terminal application sent one packet for every keystroke, each usually carrying 1 byte of payload and a 40-byte header, an overhead of 4,000% that could easily congest the network. Nagling became a standard and was quickly implemented across the Internet. It is now the default, but in some situations it is desirable to turn it off.
Now suppose an application issues a request to send a small chunk of data. We can either send the data immediately or wait for more data to accumulate and then send it all at once. Interactive and client/server applications benefit greatly from sending immediately. For example, when we send a short request and wait for a large response, the overhead is low relative to the total amount of data transferred, and the response arrives sooner if the request goes out at once. This is achieved by setting the TCP_NODELAY option on the socket, which disables the Nagle algorithm.
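
Disabling the Nagle algorithm as just described takes one call. The sketch below assumes a POSIX system; the helper name set_nodelay is hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* enabled = 1 disables the Nagle algorithm (small writes go out at once);
 * enabled = 0 restores the default coalescing behavior.
 * Returns 0 on success, -1 on error. */
int set_nodelay(int fd, int enabled)
{
    int value = enabled ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, sizeof(value));
}
```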
The other case calls for waiting until a maximum-size chunk of data can be sent over the network in one go. This mode of transfer benefits bulk data communication, and the typical application is a file server. The Nagle algorithm causes problems in this case. When you are sending bulk data, you can instead set the TCP_CORK option, which disables Nagling in a way that is exactly opposite to TCP_NODELAY (TCP_CORK and TCP_NODELAY are mutually exclusive). Let's examine how it works.
Suppose the application uses the sendfile() function to transfer bulk data. Application protocols usually require some information to be sent first to describe the data, in effect a header. Normally the header is small and TCP_NODELAY is set on the socket. The packet carrying the header is then transmitted immediately and, in some cases (depending on an internal packet counter), it even triggers a request for acknowledgment of receipt from the other end. The bulk data transfer is thereby delayed, and unnecessary network traffic is exchanged.
If, however, we set TCP_CORK on the socket (think of it as putting a cork in the pipe), the packet with the header is padded with the bulk data, and everything is sent automatically in full-sized packets. When the bulk transfer is finished, it is advisable to clear the TCP_CORK option to "uncork" the connection so that any remaining partial frame can go out. This is just as important as corking.
To sum up, if you are sure that you will be sending several pieces of data together (such as the header and body of an HTTP response) with no delay between them, we recommend setting the TCP_CORK option. It can greatly improve the performance of WWW, FTP, and file servers while simplifying your work. Sample code follows:

int fd, on = 1;
...
/* Socket creation and other setup omitted for brevity */
...
setsockopt(fd, SOL_TCP, TCP_CORK, &on, sizeof(on)); /* cork */
write(fd, ...);
dprintf(fd, ...); /* formatted output to a descriptor; fprintf() needs a FILE *, not an fd */
sendfile(fd, ...);
write(fd, ...);
sendfile(fd, ...);
...
on = 0;
setsockopt(fd, SOL_TCP, TCP_CORK, &on, sizeof(on)); /* uncork */

Unfortunately, many commonly used programs do not take these considerations into account. For example, Sendmail, written by Eric Allman, does not set any options on its sockets.

Apache HTTPD, the most popular Web server on the Internet, sets the TCP_NODELAY option on all of its sockets, and most users find its performance satisfactory. Why is that? The answer lies in differences between implementations. BSD-derived TCP/IP stacks (notably FreeBSD) behave differently in this situation. When a large number of small blocks of data are submitted for transmission in TCP_NODELAY mode, a large amount of information is sent with each write() call of a block of data. However, because the counter responsible for requesting acknowledgment of delivery is byte-oriented rather than packet-oriented (as it is on Linux), the probability of introducing delays is much lower. The result depends only on the total size of the data. Linux requests an acknowledgment after the first packet arrives, whereas FreeBSD waits for several hundred packets before doing so.

On Linux, the effect of TCP_NODELAY is very different from what developers accustomed to the BSD TCP/IP stack expect, and Apache performs worse on Linux as a result. The same is true of other applications that make heavy use of TCP_NODELAY on Linux.

TCP_DEFER_ACCEPT

The first option we'll consider is TCP_DEFER_ACCEPT (that is its name on Linux; some other operating systems offer the same option under a different name). To understand the idea behind it, it helps to outline a typical HTTP client/server interaction. Recall how TCP establishes a connection to the destination of a data transfer. On the network, information travels between separate units called IP packets (or IP datagrams). A packet always has a header that carries service information used for internal protocol handling, and it may also carry a data payload. A typical example of service information is the set of so-called flags, which give a packet a special meaning to the TCP/IP stack, such as acknowledgment of successful receipt of a packet. It is often perfectly possible to carry a payload in a "flagged" packet, but sometimes internal logic forces the TCP/IP stack to send packets that consist of a header only. Such packets introduce annoying delays, increase system load, and degrade overall network performance.
Now the server creates a socket and waits for connections. TCP/IP connection establishment is the so-called three-way handshake. First, the client sends a TCP packet with the SYN flag set and no data payload (a SYN packet). The server replies with a packet that has the SYN and ACK flags set (a SYN/ACK packet) to acknowledge the packet it just received. The client then sends an ACK packet to acknowledge receipt of the second packet, which completes the connection procedure. On receiving this ACK from the client, the server wakes up a receiver process, which then waits for data to arrive. Once the three-way handshake is complete, the client starts sending "useful" data to the server. An HTTP request is usually very small and fits entirely in one packet, yet in the scenario above at least 4 packets travel back and forth, which adds considerable delay. Note also that the receiver has already started waiting for information before the "useful" data is sent.
To mitigate these problems, Linux (and some other operating systems) includes a TCP_DEFER_ACCEPT option in its TCP implementation. It is set on the server side, on the listening socket, and tells the kernel not to wake the listening process on the final ACK packet but to wait until the first packet carrying real data arrives. After sending the SYN/ACK packet, the server waits for the client to send an IP packet with data. Only 3 packets now need to cross the network, and connection setup latency drops significantly, which is especially noticeable for HTTP.
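
On Linux the option described above might be set as in the following sketch. The option value is a timeout in seconds after which the kernel stops waiting for data; the helper name defer_accept and the non-Linux fallback are assumptions made for this example.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Ask the kernel to hand a connection to accept() only once data has
 * arrived, waiting at most timeout_secs for it.
 * Returns 0 on success, -1 on error or if the option is unavailable. */
int defer_accept(int listen_fd, int timeout_secs)
{
#ifdef TCP_DEFER_ACCEPT
    return setsockopt(listen_fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
                      &timeout_secs, sizeof(timeout_secs));
#else
    /* Not a Linux-style stack: report the option as unsupported. */
    (void)listen_fd;
    (void)timeout_secs;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```

The call is normally made on the listening socket between bind() and listen().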
This option has counterparts on several other operating systems. On FreeBSD, for example, the same behavior can be achieved with the following code:

/* Irrelevant code omitted for clarity */
struct accept_filter_arg af = { "dataready", "" };
setsockopt(s, SOL_SOCKET, SO_ACCEPTFILTER, &af, sizeof(af));
This feature is called an "accept filter" on FreeBSD and has many uses. In almost all cases, however, the effect is the same as that of TCP_DEFER_ACCEPT: the server does not wait for the final ACK packet but only for a packet carrying a data payload. To learn more about this option and its significance for high-performance Web servers, refer to the Apache documentation.
For HTTP client/server interaction, it may also be necessary to change the behavior of the client. Why does the client send this "useless" ACK packet at all? Because the TCP stack has no way of knowing the status of an ACK packet. If FTP were used instead of HTTP, the client would not send any data until it had received a packet with the FTP server's prompt. In that case, a delayed ACK would delay the client/server interaction. To decide whether the ACK is necessary, the client must know the application protocol and its current state. This is why the client's behavior has to be modified.
For Linux clients, we can use another option, which is also called TCP_DEFER_ACCEPT. Sockets come in two kinds, listening sockets and connected sockets, and each kind has its own set of TCP options, so the same name can be used for an option of each kind. When this option is set on a connected socket, the client no longer sends an ACK packet after receiving the SYN/ACK packet but waits for the user program's next request to send data, so the server is sent one packet fewer.

TCP_QUICKACK

Another way to prevent the delays caused by sending useless packets is the TCP_QUICKACK option. It differs from TCP_DEFER_ACCEPT in that it can be used not only to manage connection establishment but also during normal data transfer. In addition, it can be set on either side of the client/server connection. Delaying the ACK packet is useful if you know that user data is about to be sent, because it is better to set the ACK flag on the packet that carries the data and so minimize overhead. When the sender is sure that data will be sent immediately (multiple packets), the TCP_QUICKACK option can be set to 0. The default value of this option is 1 for sockets in the connected state, and the kernel resets the option to 1 immediately after its first use (it is a one-time option).
In other cases, it is useful to send ACK packets immediately. The ACK packet confirms receipt of a data block, so the next block is processed without delay. This mode of data transfer is typical of interactive processes, where the moment of user input is unpredictable. On Linux this is the default socket behavior.
In the situation described earlier, where a client sends HTTP requests to a server, it is known in advance that the request packet is short and should be sent immediately after the connection is established, which is typical of how HTTP works. Since there is no need to send a pure ACK packet, TCP_QUICKACK can be set to 0 to improve performance. On the server side, both options need to be set only once, on the listening socket. All sockets created indirectly by calls to accept() inherit all the options of the original socket.
With TCP_CORK, TCP_DEFER_ACCEPT, and TCP_QUICKACK combined, the number of packets involved in each HTTP interaction is reduced to the minimum acceptable level (given TCP protocol requirements and security considerations). The result is not only faster data transfer and request processing but also minimal round-trip latency between client and server.
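
Turning off immediate ACKs as discussed in this section might look like the sketch below. TCP_QUICKACK is Linux-specific, so the code guards on the macro; the helper name set_quickack is made up for the example. Because the kernel can switch the mode back by itself, callers that rely on it usually set it again before each exchange.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* enabled = 0 lets the kernel delay ACKs so they can ride on outgoing data;
 * enabled = 1 requests immediate ACKs.
 * Returns 0 on success, -1 on error or if the option is unavailable. */
int set_quickack(int fd, int enabled)
{
#ifdef TCP_QUICKACK
    int value = enabled ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &value, sizeof(value));
#else
    (void)fd;
    (void)enabled;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```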

Part 2: usage examples

1. To reuse a socket's address right after closesocket() (a closed socket normally does not go away immediately but passes through the TIME_WAIT state):
BOOL bReuseaddr = TRUE;
setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (const char*)&bReuseaddr, sizeof(BOOL));
2. To force a socket that is already connected to close immediately after closesocket(), without passing through the TIME_WAIT state:
BOOL bDontLinger = FALSE;
setsockopt(s, SOL_SOCKET, SO_DONTLINGER, (const char*)&bDontLinger, sizeof(BOOL));
3. send() and recv() can sometimes stall because of network conditions or for other reasons. To bound how long they wait, set send and receive timeouts:
int nNetTimeout = 1000; // 1 second
// Send timeout
setsockopt(s, SOL_SOCKET, SO_SNDTIMEO, (char *)&nNetTimeout, sizeof(int));
// Receive timeout
setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, (char *)&nNetTimeout, sizeof(int));
4. send() returns the number of bytes actually sent (synchronous) or the number of bytes copied into the socket buffer (asynchronous). By default the system send and receive buffers are 8688 bytes (about 8.5 KB). If the application sends and receives large amounts of data, enlarge the socket buffers to avoid looping over send() and recv() repeatedly:
// Receive buffer
int nRecvBuf = 32 * 1024; // set to 32 KB
setsockopt(s, SOL_SOCKET, SO_RCVBUF, (const char*)&nRecvBuf, sizeof(int));
// Send buffer
int nSendBuf = 32 * 1024; // set to 32 KB
setsockopt(s, SOL_SOCKET, SO_SNDBUF, (const char*)&nSendBuf, sizeof(int));
5. To avoid the copy from the system buffer to the socket buffer when sending data, and so improve performance, set the send buffer size to zero:
int nZero = 0;
setsockopt(s, SOL_SOCKET, SO_SNDBUF, (char *)&nZero, sizeof(nZero));
6. To do the same for recv() (by default, the contents of the socket buffer are copied to the system buffer):
int nZero = 0;
setsockopt(s, SOL_SOCKET, SO_RCVBUF, (char *)&nZero, sizeof(int));
7. To let a UDP socket send datagrams as broadcasts:
BOOL bBroadcast = TRUE;
setsockopt(s, SOL_SOCKET, SO_BROADCAST, (const char*)&bBroadcast, sizeof(BOOL));
8. While a client is connecting to a server, if the socket is in non-blocking mode and connect() is in progress, completion of the connection can be deferred until accept() is called (this setting matters mainly for non-blocking calls and has little effect on blocking ones):
BOOL bConditionalAccept = TRUE;
setsockopt(s, SOL_SOCKET, SO_CONDITIONAL_ACCEPT, (const char*)&bConditionalAccept, sizeof(BOOL));
9. If closesocket() is called while data is still being sent (send() has not completed and data remains unsent), the usual measure is a graceful shutdown(s, SD_BOTH), but the unsent data is still lost. To meet the application's needs (that is, to let the remaining data go out after the socket is closed), set SO_LINGER:
struct linger {
    u_short l_onoff;
    u_short l_linger;
};
linger m_sLinger;
m_sLinger.l_onoff = 1;  // linger on closesocket() if unsent data remains
                        // (if l_onoff = 0, the effect is the same as in item 2)
m_sLinger.l_linger = 5; // linger for up to 5 seconds
setsockopt(s, SOL_SOCKET, SO_LINGER, (const char*)&m_sLinger, sizeof(linger));

Http://blog.csdn.net/l_yangliu

http://blog.csdn.net/chary8088/article/details/2486377
