The previous note introduced the basics of TIME_WAIT. This article describes, based on practice, how to solve the problem of too many TIME_WAIT connections preventing a machine from creating new TCP connections.
1. View the system network configuration and current TCP state
When locating and handling network problems in an application, it helps to understand the system's default network configuration. Take an x86_64 Linux machine running kernel 2.6.9 as an example. The default IPv4 configuration can be viewed under /proc/sys/net/ipv4; the items related to the TCP protocol stack are named tcp_xxx. For their meanings, refer to the documentation in the references, or to the official documentation in the Linux source tree (Documentation/networking/ip-sysctl.txt). The following lists several key configuration items and their default values on my machine:
$ cat /proc/sys/net/ipv4/ip_local_port_range
32768   61000
$ cat /proc/sys/net/ipv4/tcp_max_syn_backlog
1024
$ cat /proc/sys/net/ipv4/tcp_syn_retries
5
$ cat /proc/sys/net/ipv4/tcp_max_tw_buckets
180000
$ cat /proc/sys/net/ipv4/tcp_tw_recycle
0
$ cat /proc/sys/net/ipv4/tcp_tw_reuse
0
The first three items are, respectively, the local port range (by default fewer than 30,000 ports are available), the maximum length of the half-open connection queue, and the maximum number of SYN retries during the three-way handshake; their meanings follow from their names. The last three items need explanation, because they are used below in locating and solving the problem.
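As a quick sanity check on the "fewer than 30,000 ports" claim, the number of usable local ports implied by ip_local_port_range can be computed from its two bounds. A minimal Python sketch; the sample string mirrors the default values shown above:

```python
def available_ports(port_range: str) -> int:
    """Compute how many local ports ip_local_port_range allows.

    `port_range` is the raw contents of /proc/sys/net/ipv4/ip_local_port_range,
    e.g. "32768\t61000"; both bounds are inclusive.
    """
    low, high = map(int, port_range.split())
    return high - low + 1

# With the defaults above, fewer than 30,000 local ports are available:
print(available_ports("32768\t61000"))  # 28233
```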
1) tcp_max_tw_buckets
The kernel documentation describes it as follows:
Maximal number of TIME_WAIT sockets held by system simultaneously. If this number is exceeded, the TIME_WAIT socket is immediately destroyed and a warning is printed. This limit exists only to prevent simple DoS attacks; you must not lower the limit artificially, but rather increase it (probably, after increasing installed memory), if network conditions require more than the default value (180000).
This configuration item exists to prevent simple DoS attacks. In some cases you can increase the value as appropriate, but it should never be lowered; otherwise, you do so at your own risk.
2) tcp_tw_recycle
Enable fast recycling of sockets in TIME_WAIT state. The default value is 0 (disabled). It should not be changed without advice/request of technical experts.
This configuration item quickly recycles sockets in the TIME_WAIT state so they can be reallocated. It is disabled by default and can be enabled if necessary, but enabling it has side effects, discussed in Section 3.1 below.
3) tcp_tw_reuse
Allow to reuse TIME_WAIT sockets for new connections when it is safe from protocol viewpoint. The default value is 0. It should not be changed without advice/request of technical experts.
After this option is enabled, the kernel will reuse sockets in the TIME_WAIT state, provided that, from the protocol's point of view, reuse is safe. As for when the protocol considers reuse safe, one of the articles in the references extracts the answer from the Linux kernel source code; interested readers can refer to it.
2. Approach to locating the network problem
Recall the online problem described at the beginning of the previous note. When an alarm reported that a machine could not create new connections, we troubleshot as follows.
Counting with netstat -at | grep TIME_WAIT showed tens of thousands of TCP connections in the TIME_WAIT state on the faulty machine; further analysis showed that more than 20,000 of them were caused by the alarming module. We redirected the netstat output to a file for further analysis; in general, the local ports were heavily occupied.
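The counting step above can be sketched programmatically. The following is a hypothetical helper that tallies connection states from netstat-style output; the sample lines are invented for illustration:

```python
from collections import Counter

def tcp_state_counts(netstat_output: str) -> Counter:
    """Tally TCP connection states from `netstat -at -n`-style output.

    Assumes the last column of each connection line is the state name
    (LISTEN, ESTABLISHED, TIME_WAIT, ...), as printed by net-tools netstat.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0] in ("tcp", "tcp6"):
            counts[fields[-1]] += 1
    return counts

# Made-up sample output for demonstration:
sample = """\
tcp        0      0 10.0.0.1:45678      10.0.0.2:80         TIME_WAIT
tcp        0      0 10.0.0.1:45679      10.0.0.2:80         TIME_WAIT
tcp        0      0 0.0.0.0:8080        0.0.0.0:*           LISTEN
"""
print(tcp_state_counts(sample)["TIME_WAIT"])  # 2
```

In practice you would feed it the saved netstat output file and then group the TIME_WAIT lines by remote address to see which peer (and hence which module) accounts for them.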
Given the system defaults listed earlier (tcp_max_tw_buckets is 180,000 and ip_local_port_range covers fewer than 30,000 ports), a large number of TIME_WAIT connections keeps local ports unavailable for the full TIME_WAIT duration. Having no available local port is the main reason new connections could not be created.
A reminder here: the conclusion above was only our preliminary judgment. The specific cause still has to be confirmed from the program's abnormal return values (socket API return values and errno) and from module logs. A failure to establish new connections may also mean you have been blacklisted by another module. I learned this lesson the hard way: a program's requests to a downstream module via the libcurl API were failing, and because the machine had many TIME_WAIT connections, we assumed TIME_WAIT was the cause without carefully analyzing curl's output log, wasting a lot of time. Half a day later we remembered that the downstream module has an attack-prevention mechanism, and the IP of the requesting machine was not in its access whitelist. During peak hours the upstream module sent requests through curl too frequently and was blacklisted; when a connection was attempted, the downstream TCP layer tore it down directly with an RST packet, so the curl API returned "Recv failure: Connection reset by peer". A painful lesson = _ =
In addition, Section 4.3 of UNIX Network Programming, Volume 1 describes when an RST packet is sent:
An RST is a type of TCP segment that is sent by TCP when something is wrong. Three conditions that generate an RST are:
1) when a SYN arrives for a port that has no listening server;
2) when TCP wants to abort an existing connection;
3) when TCP receives a segment for a connection that does not exist. (TCPv1 [pp. 246-250] contains additional information.)
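Condition 1) is easy to observe locally. The following Python sketch provokes an RST by connecting to a TCP port with no listener; the port-probing trick (bind to port 0, note the port, close) is only an illustration and is slightly racy in principle:

```python
import socket

# Find a local TCP port with no listener: bind to port 0 so the kernel
# picks a free port, then close the socket so nothing is listening there.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(("127.0.0.1", 0))
dead_port = probe.getsockname()[1]
probe.close()

# Connecting to that port hits condition 1): the SYN arrives at a port with
# no listening server, the peer answers with RST, and connect() fails with
# ECONNREFUSED ("Connection refused").
try:
    socket.create_connection(("127.0.0.1", dead_port), timeout=2)
    refused = False
except ConnectionRefusedError:
    refused = True
print(refused)  # True
```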
3. Solution
Two approaches can solve the problem of a machine with too many TIME_WAIT connections being unable to create new TCP connections.
3.1 Modify the system configuration
Specifically, this means modifying the three configuration items described earlier: tcp_max_tw_buckets, tcp_tw_recycle, and tcp_tw_reuse.
1) Increase tcp_max_tw_buckets. As shown in the first part of this article, its default is 180,000 (it may differ across kernels and machine configurations). According to the documentation, we can increase the value as appropriate; I am not sure what the upper limit is. Personally, I think this only alleviates the symptom: as access pressure keeps growing, the problem will recur sooner or later, and the root cause remains unsolved.
2) Enable tcp_tw_recycle: run "echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle" in a shell to turn this configuration on.
To be clear: whether TIME_WAIT sockets are quickly recycled is actually determined jointly by the two items tcp_tw_recycle and tcp_timestamps. However, because tcp_timestamps is enabled by default, most articles only mention setting tcp_tw_recycle to 1. For more details (an analysis of the kernel source code), see the article in the references.
Note the following: if tcp_tw_recycle is enabled and clients sit behind NAT, the server may drop the clients' SYN packets outright. For detailed cases and cause analysis, refer to the articles in the references; this article will not go into detail.
3) Enable tcp_tw_reuse: echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse. This option also works together with tcp_timestamps, and socket reuse is conditional; for details see the references. Unlike tcp_tw_recycle, after checking a lot of material I found no reports of network problems caused by tcp_tw_reuse in environments with NAT or firewalls.
3.2 Modify the application
Specifically, there are two methods:
1) Replace TCP short connections with persistent (long) connections. In general, if the peer you connect to is a server under your own control, it is best to use persistent connections for TCP communication between your own modules, avoiding the overhead of establishing and releasing large numbers of short connections. If the peer is a machine you do not control, you must first consider whether it supports persistent connections.
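The difference can be sketched with a toy echo server: one persistent connection carries several request/response exchanges, where a short-connection design would pay the handshake and TIME_WAIT cost once per request. A minimal Python sketch; the echo "protocol" is invented for illustration:

```python
import socket
import threading

def echo_server(listener: socket.socket) -> None:
    """Accept one client and echo every message it sends over the same connection."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:      # client closed: end of the persistent connection
                break
            conn.sendall(data)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
listener.listen(1)
threading.Thread(target=echo_server, args=(listener,), daemon=True).start()

# One long-lived connection carries many exchanges, so the client pays the
# connect/close (and TIME_WAIT) cost once instead of once per request.
client = socket.create_connection(listener.getsockname())
replies = []
for i in range(3):
    client.sendall(b"ping %d\n" % i)
    replies.append(client.recv(1024))
client.close()
print(replies)
```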
2) Use the getsockopt/setsockopt API to set the socket's SO_LINGER option. For how to set SO_LINGER, refer to Section 7.5 of UNP Volume 1, or to the references.
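As a sketch of the mechanism (not a recommendation), the following Python snippet sets SO_LINGER with l_onoff=1 and l_linger=0, which makes close() send an RST instead of a FIN so the socket never enters TIME_WAIT. Use with care: the peer sees "Connection reset by peer" and any data in flight is discarded; see UNP Section 7.5 for the caveats.

```python
import socket
import struct

# SO_LINGER takes a struct linger { int l_onoff; int l_linger; }.
# l_onoff=1 with l_linger=0 means: on close(), discard unsent data and
# send RST rather than FIN, skipping TIME_WAIT entirely.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))

# Read the option back to confirm the setting took effect.
onoff, linger = struct.unpack("ii", s.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 8))
print(onoff, linger)  # 1 0
s.close()
```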
4. A further point worth explaining
It was said above that too many TIME_WAIT connections may prevent new outbound connections. There is, in fact, an exceptional but common situation: a module S is deployed on a server as a web server and bound to a fixed local port; clients talk to S over short connections, and S actively closes the connection after each interaction. When client concurrency is high, the machine running S can accumulate a large number of TCP connections in the TIME_WAIT state. However, because S is bound to a fixed port, this does not cause the "too many TIME_WAIT connections prevent new connections" problem. In other words, the situation discussed in this article usually appears only on machines where the program lets the operating system assign a random local port for each connection, eventually leaving no port available.
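Related to this, a server that binds a fixed port typically also sets SO_REUSEADDR, so that it can restart and re-bind its port even while old connections on that port are still in TIME_WAIT (without the option, bind() fails with EADDRINUSE). A minimal Python sketch:

```python
import socket

# SO_REUSEADDR must be set before bind(). It allows a new listener to bind
# a port that still has connections in TIME_WAIT, which is why servers that
# bind well-known ports routinely set it.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
reuse_on = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0
s.bind(("127.0.0.1", 0))  # port 0 here only so the example always binds
s.close()
print(reuse_on)  # True
```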
[References]
1. The "TCP variables" chapter of the ipsysctl tutorial
2. proc_sys_net_ipv4
3. Linux v3.2.8, Documentation/networking/ip-sysctl.txt
4. Article series: solutions to the TCP short-connection TIME_WAIT problem (parts 1-5)
5. A problem caused by tcp_tw_recycle
6. Dropping of connections with tcp_tw_recycle = 1
7. SYN/ACK problems caused by tcp_tw_recycle and NAT
======================== EOF ==================================