[tcp_tw_recycle and tcp_timestamps]
The official documentation (http://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt) describes the two options as follows:
tcp_tw_recycle: Enable fast recycling of TIME-WAIT sockets. Default value is 0.
tcp_timestamps: Enable timestamps as defined in RFC1323. Default value is 1.
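To see how a given machine is configured, the two values can be read from the sysctl files under /proc. The following is a minimal sketch, not part of the kernel or of any tool: the helper names read_sysctl_int and fast_recycle_configured are made up for illustration, and the paths assume a kernel that still provides both files.

```c
#include <stdio.h>

/* Read one integer from a /proc/sys file.
 * Returns the value, or -1 if the file is missing or unreadable
 * (for example on a kernel that does not provide tcp_tw_recycle). */
int read_sysctl_int(const char *path)
{
	FILE *fp = fopen(path, "r");
	int value = -1;

	if (fp == NULL)
		return -1;
	if (fscanf(fp, "%d", &value) != 1)
		value = -1;
	fclose(fp);
	return value;
}

/* Hypothetical helper: both switches must be on for fast recycling. */
int fast_recycle_configured(int tw_recycle, int timestamps)
{
	return tw_recycle > 0 && timestamps > 0;
}
```

A caller would pass "/proc/sys/net/ipv4/tcp_tw_recycle" and "/proc/sys/net/ipv4/tcp_timestamps" to read_sysctl_int and feed the two results to fast_recycle_configured. A result of -1 means the value could not be read, which the helper treats as "not enabled".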
Both are control options provided by the Linux kernel and are independent of any specific application. A great deal of related material can be found online, but it is incomplete. The main open questions are:
1) How fast is fast recycling?
2) Some documents say that enabling tcp_tw_recycle alone is enough, while others say that tcp_timestamps must be enabled as well. Which is correct?
3) Why does the option have no effect when a client connection is initiated from a virtual machine behind NAT?
Answering these questions requires reading the kernel source. The relevant code is shown below for reference:
===== Linux-2.6.37 net/ipv4/tcp_minisocks.c 269 ======
void tcp_time_wait(struct sock *sk, int state, int timeo)
{
	struct inet_timewait_sock *tw = NULL;
	const struct inet_connection_sock *icsk = inet_csk(sk);
	const struct tcp_sock *tp = tcp_sk(sk);
	int recycle_ok = 0;
	// Decide whether fast recycling is possible. It requires tcp_tw_recycle to be on and a
	// timestamp to have been recorded from the peer (ts_recent_stamp != 0), which in turn
	// requires tcp_timestamps, so both options must be enabled.
	// The further condition checked in remember_stamp() is analyzed later; it relates to the third question.
	if (tcp_death_row.sysctl_tw_recycle && tp->rx_opt.ts_recent_stamp)
		recycle_ok = icsk->icsk_af_ops->remember_stamp(sk);
	if (tcp_death_row.tw_count < tcp_death_row.sysctl_max_tw_buckets)
		tw = inet_twsk_alloc(sk, state);
	if (tw != NULL) {
		struct tcp_timewait_sock *tcptw = tcp_twsk((struct sock *)tw);
		// Compute the fast-recycling timeout: 4 * RTO - RTO / 2 = 3.5 * RTO.
		// The answer to the first question therefore depends on the typical RTO (retransmission timeout).
		const int rto = (icsk->icsk_rto << 2) - (icsk->icsk_rto >> 1);
		//...... A lot of code is omitted here ......
		if (recycle_ok) {
			// Use the fast-recycling timeout
			tw->tw_timeout = rto;
		} else {
			tw->tw_timeout = TCP_TIMEWAIT_LEN;
			if (state == TCP_TIME_WAIT)
				timeo = TCP_TIMEWAIT_LEN;
		}
		//...... A lot of code is omitted here ......
	}
	//...... The rest of the function is omitted ......
}
===========================================================
Three RFCs specify how the RTO is calculated: RFC-793, RFC-2988, and RFC-6298. The Linux implementation follows RFC-2988.
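For orientation, the estimator that RFC-2988 and RFC-6298 share can be sketched in a few lines. This is an illustration of the RFC formulas only, not the kernel's code (Linux uses its own integer arithmetic). The names rto_estimator and rto_update, the 1 ms clock granularity, and passing the bounds as parameters are all assumptions made for this sketch.

```c
/* All times are in milliseconds. */
struct rto_estimator {
	double srtt;     /* smoothed round-trip time */
	double rttvar;   /* round-trip time variation */
	int has_sample;  /* 0 until the first RTT measurement arrives */
};

#define CLOCK_GRANULARITY_MS 1.0  /* assumed value of G in the RFC formula */

/* Feed one RTT measurement and return the new RTO, clamped to [rto_min, rto_max]. */
double rto_update(struct rto_estimator *e, double rtt_ms,
		  double rto_min, double rto_max)
{
	double diff, var4, rto;

	if (!e->has_sample) {
		/* First measurement: SRTT = R, RTTVAR = R / 2 */
		e->srtt = rtt_ms;
		e->rttvar = rtt_ms / 2.0;
		e->has_sample = 1;
	} else {
		/* Later measurements: beta = 1/4, alpha = 1/8 */
		diff = e->srtt - rtt_ms;
		if (diff < 0)
			diff = -diff;
		e->rttvar = 0.75 * e->rttvar + 0.25 * diff;
		e->srtt = 0.875 * e->srtt + 0.125 * rtt_ms;
	}

	/* RTO = SRTT + max(G, 4 * RTTVAR) */
	var4 = 4.0 * e->rttvar;
	rto = e->srtt + (var4 > CLOCK_GRANULARITY_MS ? var4 : CLOCK_GRANULARITY_MS);

	if (rto < rto_min)
		rto = rto_min;
	if (rto > rto_max)
		rto = rto_max;
	return rto;
}
```

With a lower bound of 200 ms and an upper bound of 120 s, a 1 ms round trip yields a raw value of only 3 ms, so the result is pinned at the lower bound. This is why the lower bound is the number that matters on a LAN.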
If you are interested in these algorithms and the Linux implementation, you can study them in depth. In practice, it is enough to remember the following two Linux boundary values:
===== Linux-2.6.37 include/net/tcp.h 126 ========================
#define TCP_RTO_MAX	((unsigned)(120*HZ))
#define TCP_RTO_MIN	((unsigned)(HZ/5))
===========================================================
HZ is the number of timer ticks per second, so 120*HZ corresponds to 120 s and HZ/5 to 200 ms: the maximum RTO is 120 s and the minimum is 200 ms. For machines on a LAN the RTO usually sits at the minimum of 200 ms, so 3.5 * RTO is 700 ms.
In other words, with fast recycling the TIME_WAIT state lasts 700 ms instead of the normal 2MSL (1 minute on Linux; see the TCP_TIMEWAIT_LEN definition at line 109 of include/net/tcp.h).
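The arithmetic can be checked with a small stand-alone sketch. It reuses the shift expression from tcp_time_wait and the constants quoted in this article. The value HZ = 1000 is an assumption (the real tick rate depends on the kernel configuration), and the helper names tw_recycle_timeout and jiffies_to_ms are made up for this illustration.

```c
#include <stdint.h>

#define HZ 1000u                     /* assumed tick rate, ticks per second */
#define TCP_RTO_MAX (120u * HZ)      /* 120 s in ticks */
#define TCP_RTO_MIN (HZ / 5u)        /* 200 ms in ticks */
#define TCP_TIMEWAIT_LEN (60u * HZ)  /* normal TIME_WAIT length: 60 s in ticks */

/* Same expression as in tcp_time_wait: 4 * rto - rto / 2 = 3.5 * rto (in ticks). */
uint32_t tw_recycle_timeout(uint32_t rto)
{
	return (rto << 2) - (rto >> 1);
}

/* Convert ticks to milliseconds for the assumed HZ. */
uint32_t jiffies_to_ms(uint32_t ticks)
{
	return (uint32_t)(((uint64_t)ticks * 1000u) / HZ);
}
```

Under these assumptions jiffies_to_ms(tw_recycle_timeout(TCP_RTO_MIN)) is 700, compared with 60000 for TCP_TIMEWAIT_LEN. At the upper bound the same expression gives 420000 ms, so the 700 ms figure holds only while the RTO sits at its minimum.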
Testing confirms the 700 ms figure: when connections in the TIME_WAIT state are listed repeatedly, only an occasional one shows up.
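One way to repeat this kind of observation is to count TIME_WAIT entries in /proc/net/tcp, where the fourth column holds the socket state in hexadecimal and 06 is TIME_WAIT. The sketch below is only an illustration; the function names line_is_time_wait and count_time_wait are made up, and the parsing assumes the usual column layout of that file.

```c
#include <stdio.h>

#define TIME_WAIT_STATE 0x06u  /* numeric value of TCP_TIME_WAIT */

/* Returns 1 if one line of /proc/net/tcp describes a TIME_WAIT socket, else 0.
 * The header line and malformed lines do not match the format and yield 0. */
int line_is_time_wait(const char *line)
{
	unsigned int st;

	/* Columns: "sl: local_addr:port remote_addr:port st ..." */
	if (sscanf(line,
		   " %*d: %*[0-9A-Fa-f]:%*[0-9A-Fa-f] %*[0-9A-Fa-f]:%*[0-9A-Fa-f] %x",
		   &st) != 1)
		return 0;
	return st == TIME_WAIT_STATE;
}

/* Count TIME_WAIT sockets in an already opened /proc/net/tcp-style stream. */
int count_time_wait(FILE *fp)
{
	char line[512];
	int n = 0;

	while (fgets(line, sizeof(line), fp) != NULL)
		n += line_is_time_wait(line);
	return n;
}
```

Opening "/proc/net/tcp" with fopen and calling count_time_wait in a loop gives a rough view of how many sockets are in TIME_WAIT at each moment.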
The last question is why connections initiated from a virtual machine are not recycled quickly even when both tcp_tw_recycle and tcp_timestamps are set. Continue reading the code.
The line in tcp_time_wait that reads recycle_ok = icsk->icsk_af_ops->remember_stamp(sk); calls the following implementation:
===== Linux-2.6.37 net/ipv4/tcp_ipv4.c 1772 ====
int tcp_v4_remember_stamp(struct sock *sk)
{
	//...... A lot of code is omitted here ......
	// If a peer entry was obtained, fast recycling is allowed (return 1); otherwise it is not (return 0).
	if (peer) {
		if ((s32)(peer->tcp_ts - tp->rx_opt.ts_recent) <= 0 ||
		    ((u32)get_seconds() - peer->tcp_ts_stamp > TCP_PAWS_MSL &&
		     peer->tcp_ts_stamp <= (u32)tp->rx_opt.ts_recent_stamp)) {
			peer->tcp_ts_stamp = (u32)tp->rx_opt.ts_recent_stamp;
			peer->tcp_ts = tp->rx_opt.ts_recent;
		}
		if (release_it)
			inet_putpeer(peer);
		return 1;
	}
	return 0;
}
===========================================================
The code above is probably the reason sockets were not recycled in the virtual machine test: when the virtual machine sits behind NAT, the server cannot obtain peer information for the machine hidden behind the NAT.
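The update condition in tcp_v4_remember_stamp is easier to follow in a user-space model. The sketch below mirrors the comparison logic only. The struct peer_entry, the function names, and passing the current time as a parameter are assumptions made for illustration; TCP_PAWS_MSL is set to 60 seconds as in the kernel headers.

```c
#include <stddef.h>
#include <stdint.h>

#define TCP_PAWS_MSL 60  /* seconds */

/* Simplified stand-in for the per-address peer entry. */
struct peer_entry {
	uint32_t tcp_ts;        /* last timestamp value remembered for this address */
	uint32_t tcp_ts_stamp;  /* second at which that value was remembered */
};

/* Wraparound-safe test that timestamp a is not later than timestamp b. */
int ts_not_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) <= 0;
}

/* Returns 1 if a peer entry exists (fast recycling allowed), 0 otherwise.
 * The stored timestamp is refreshed when the socket's value is newer, or when
 * the stored one is older than TCP_PAWS_MSL and was not recorded more recently. */
int remember_stamp(struct peer_entry *peer, uint32_t ts_recent,
		   uint32_t ts_recent_stamp, uint32_t now)
{
	if (peer == NULL)
		return 0;

	if (ts_not_after(peer->tcp_ts, ts_recent) ||
	    (now - peer->tcp_ts_stamp > TCP_PAWS_MSL &&
	     peer->tcp_ts_stamp <= ts_recent_stamp)) {
		peer->tcp_ts_stamp = ts_recent_stamp;
		peer->tcp_ts = ts_recent;
	}
	return 1;
}
```

The signed cast in ts_not_after keeps the comparison correct when the 32-bit timestamp counter wraps around. Note that the return value depends only on whether a peer entry exists, not on whether the stored timestamp was updated.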
The option is also set in the production environment, yet the number of TIME_WAIT connections there exceeds 40,000, which may be related to virtual machines or the network topology.
Summary:
1) How fast is fast recycling?
In a LAN environment, sockets are recycled after about 700 ms.
2) Some documents say that enabling tcp_tw_recycle alone is enough, while others say that tcp_timestamps must be enabled as well. Which is correct?
Both must be enabled, but tcp_timestamps is enabled by default, which is why some people say that enabling tcp_tw_recycle is enough.
3) Why does the option have no effect when a client connection is initiated from a virtual machine?
It depends on the network topology: fast recycling does not happen if peer information cannot be obtained.
Based on the analysis above, this method is not very reliable. In practice it can be affected by virtual machines, network topology, firewalls, and so on, with the result that sockets are not recycled quickly.
Appendix:
1) For details about tcp_timestamps, see RFC 1323, which defines the option for round-trip time measurement and protection against wrapped sequence numbers (PAWS).
2) Enabling this option may cause connection failures; see: http://www.pagefault.info/?p=416