How long can FIN_WAIT1 last? Do you know?

Source: Internet
Author: User

2016-01-12, Operation and Maintenance Help

Original: http://blogread.cn/it/article/7215?f=wb&luicode=10000359

Huoding Notes

A few days ago, in a chat group of the tcpcopy community (https://github.com/session-replay-tools/tcpcopy), someone asked: how long can FIN_WAIT1 last? The question sparked a discussion from which I learned a lot, thanks to @wangbin579 and many other friends.

Let's warm up by recalling how TCP closes a connection, using a classic diagram:

[Figure: TCP Close]

As the figure shows, the side that closes actively sends a FIN and enters FIN_WAIT1. The passively closing side responds with an ACK, which moves the active side to FIN_WAIT2. The passive side then sends its own FIN; the active side acknowledges it and moves to TIME_WAIT.
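The transitions described above can be written down as a small lookup table. This is only a minimal sketch of the active-close path in the diagram; the event names and the walk() helper are made up for illustration, and less common paths such as simultaneous close are left out.

```python
# Active-close transitions only, as described in the text above.
# Event names ("send FIN", "recv ACK", "recv FIN") are illustrative labels.
ACTIVE_CLOSE = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT1",
    ("FIN_WAIT1", "recv ACK"): "FIN_WAIT2",
    ("FIN_WAIT2", "recv FIN"): "TIME_WAIT",
}


def walk(state, events):
    """Apply events in order and return the resulting state."""
    for event in events:
        key = (state, event)
        if key not in ACTIVE_CLOSE:
            raise ValueError(f"no transition for {event!r} in state {state}")
        state = ACTIVE_CLOSE[key]
    return state


if __name__ == "__main__":
    # If the peer's ACK never arrives, the walk stops at FIN_WAIT1.
    print(walk("ESTABLISHED", ["send FIN"]))
    print(walk("ESTABLISHED", ["send FIN", "recv ACK", "recv FIN"]))
```

The point of interest for this article is the first row: the connection stays in FIN_WAIT1 until the ACK for the FIN arrives.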

Back to the original question: how long can FIN_WAIT1 last? Normally ACKs between servers arrive so quickly that FIN_WAIT1 is too short-lived to observe by eye, but many reports on the Internet show that in some situations FIN_WAIT1 persists for a long time and causes problems.

The most common misconception is that tcp_fin_timeout controls how long FIN_WAIT1 lasts. The name suggests as much, but it actually controls the timeout of FIN_WAIT2, as the official documentation (https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt) says:

The length of time an orphaned (no longer referenced by any application) connection will remain in the FIN_WAIT_2 state before it is aborted at the local end. While a perfectly valid "receive only" state for an un-orphaned connection, an orphaned connection in FIN_WAIT_2 state could otherwise wait forever for the remote to close its end of the connection.
Cf. tcp_max_orphans
Default: 60 seconds

Let's use an experiment to illustrate the problem (server: 10.16.15.107; client: 10.16.15.109):

    1. Listen on port 1234 on the server: "nc -l 1234"

    2. Connect to the server from the client: "nc 10.16.15.107 1234"
      At this point the client connection enters the ESTABLISHED state

    3. Block responses on the server: "iptables -A OUTPUT -d 10.16.15.109 -j DROP"

    4. Start a packet capture on the client: "tcpdump -nn -i any port 1234"

    5. Disconnect on the client with "Ctrl+C"
      At this point the client connection enters the FIN_WAIT1 state

Along the way you can watch the state with "netstat -ant | grep :1234". The final capture looks like this:
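As an alternative to netstat, the same state can be read from /proc/net/tcp on Linux, where the fourth column holds the state as a hex code (04 is FIN_WAIT1). The following is a rough sketch; the count_states() helper and the SAMPLE text are made up for illustration and are not output from the experiment.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

# Illustrative sample only; addresses and counters are invented.
SAMPLE = (
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode\n"
    "   0: 6D0F100A:C350 6B0F100A:04D2 04 00000001:00000000 01:00000014 00000002     0        0 0\n"
    "   1: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345\n"
)


def count_states(proc_net_tcp_text, port):
    """Count connection states for sockets whose local or remote port matches."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        if port in (local_port, remote_port):
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


if __name__ == "__main__":
    print(count_states(SAMPLE, 1234))
```

On a live machine the text would come from reading /proc/net/tcp instead of SAMPLE, for example count_states(open("/proc/net/tcp").read(), 1234).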

[Figure: TCP Fin]

The first FIN was sent when we disconnected with Ctrl+C. Because iptables on the server was dropping everything sent to the client, the corresponding ACK never arrived, so the client retransmitted the FIN several times.

Looking at the timestamps, we can also see that the first retry comes after about 200 ms, the second after about 400 ms, the third after about 800 ms, and so on, doubling each time.
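The doubling pattern can be sketched as a small calculation. This is a simplified model, not kernel code: the 200 ms starting value comes from the capture described above, the 120-second cap is an assumption mirroring the kernel's usual maximum retransmission timeout, and the function name is made up for illustration.

```python
def backoff_intervals_ms(initial_rto_ms=200, retries=8, rto_max_ms=120_000):
    """Return the wait before each retransmission, plus the final wait
    before giving up, with the timeout doubling each time (capped)."""
    intervals = []
    rto = initial_rto_ms
    for _ in range(retries + 1):
        intervals.append(min(rto, rto_max_ms))
        rto *= 2
    return intervals


if __name__ == "__main__":
    waits = backoff_intervals_ms()
    print(waits)       # 200, 400, 800, ... doubling each step
    print(sum(waits))  # 102200 ms in total for 8 retries
```

Under these assumptions, 8 retries add up to roughly 102 seconds before the connection is given up on, which is a rough upper bound on how long FIN_WAIT1 lingers in the experiment.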

In fact, the key parameter controlling this behavior is tcp_orphan_retries. The official documentation (https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt) says:

This value influences the timeout of a locally closed TCP connection, when RTO retransmissions remain unacknowledged. See tcp_retries2 for more details.
The default value is 8. If your machine is a loaded WEB server, you should think about lowering this value, such sockets may consume significant resources. Cf. tcp_max_orphans.

If sysctl reports tcp_orphan_retries as 0, that is effectively equivalent to 8, as the kernel code shows.

So we can conclude: if your system is heavily loaded and has many connections in FIN_WAIT1, consider lowering tcp_orphan_retries to relieve the problem; how far to lower it depends on your network conditions.
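The "0 means 8" rule can be modeled in a few lines. This Python sketch is based on my reading of the kernel's tcp_orphan_retries() helper in net/ipv4/tcp_timer.c and is not the kernel code itself; the parameter names are illustrative, so check the source of your own kernel version before relying on it.

```python
def effective_orphan_retries(sysctl_value, alive, soft_error=False):
    """Model of how the configured tcp_orphan_retries value is interpreted.

    sysctl_value: the configured tcp_orphan_retries (may be 0)
    alive:        whether the socket has sent something recently
    soft_error:   whether an ICMP error has already been seen (assumed flag)
    """
    retries = sysctl_value
    # An ICMP error on a connection that is not alive: give up right away.
    if soft_error and not alive:
        retries = 0
    # A value of 0 on a live connection falls back to a safe default of 8.
    if retries == 0 and alive:
        retries = 8
    return retries


if __name__ == "__main__":
    print(effective_orphan_retries(0, alive=True))
    print(effective_orphan_retries(3, alive=True))
```

In this model a configured value of 0 only becomes 8 while the connection is still considered alive; any non-zero setting is used as is.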

The analysis could have ended neatly here, but a flaw in TCP means FIN_WAIT1 can be exploited for a DoS attack, so let's dig a little further to see what is going on:

Suppose the server hosts a large file. An attacker connects and requests it but never reads the data. The client's receive queue fills up, so the server keeps sending "zero window probes" to check whether the client has free space. In this situation tcp_orphan_retries is no help either: the server is stuck, the FIN can never actually be sent, and the connection stays in FIN_WAIT1 forever.

Note: a file of about 100K is usually large enough; the exact size needed depends on tcp_rmem/tcp_wmem.
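The precondition for this situation, a receiver that never reads, can be observed safely on the loopback interface. The following is a rough sketch and not the demo from the original post: it only measures how many bytes a sender can queue before the buffers on both sides are full, and it does not orphan the socket or reproduce the stuck FIN_WAIT1 state. The function name and sizes are made up for illustration.

```python
import socket


def bytes_accepted_before_stall(chunk=64 * 1024, limit=256 * 1024 * 1024):
    """Send to a local peer that never reads; return bytes queued before
    the kernel refuses more (send and receive buffers are both full)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free port on loopback
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    server.setblocking(False)
    payload = b"x" * chunk
    sent = 0
    try:
        while sent < limit:
            try:
                sent += server.send(payload)
            except BlockingIOError:
                break  # buffers full: the reader's window is closed
    finally:
        server.close()
        client.close()
        listener.close()
    return sent


if __name__ == "__main__":
    print(bytes_accepted_before_stall())
```

The number printed depends on the tcp_rmem/tcp_wmem settings of the machine, which is why the file size needed in the scenario above varies.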

What to do? In desperation you might try anything, such as restarting the service. Unfortunately that is useless, because the FIN_WAIT1 connections are no longer owned by the service, so restarting it changes nothing. If you must restart something, it has to be the whole server!

Fortunately, the kernel has already considered this problem: it provides the tcp_max_orphans parameter to cap the number of orphaned connections. Note that, as with tcp_max_tw_buckets, which caps the number of TIME_WAIT sockets, it is best not to lower it unless you are under a DoS attack.
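To check the current values of the parameters mentioned in this article, they can be read from /proc/sys/net/ipv4 on Linux. The helper below is a minimal sketch with a made-up name; the base directory is a parameter so the function can also be pointed at a test directory, and a parameter that cannot be read is reported as None.

```python
from pathlib import Path

# The sysctl names discussed in this article.
NAMES = (
    "tcp_fin_timeout",
    "tcp_orphan_retries",
    "tcp_max_orphans",
    "tcp_max_tw_buckets",
)


def read_tcp_sysctls(base="/proc/sys/net/ipv4", names=NAMES):
    """Return {name: int value or None} for each requested sysctl file."""
    values = {}
    for name in names:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None  # missing, unreadable, or not an integer
    return values


if __name__ == "__main__":
    for name, value in read_tcp_sysctls().items():
        print(name, value)
```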

Trivia: I have tried to find tools that can kill FIN_WAIT1 connections. To kill a TCP connection you need to know its ACK and SEQ numbers, and then you can RESET it. To obtain them, some tools work passively, listening for matching packets and reading the values from those; tcpkill is the representative example. Others work actively, obtaining the values by sending forged requests; killcx is the representative example. If you are interested, give them a try.

Finally, thanks again to the tcpcopy community! If you learned something from this article, the credit belongs to the tcpcopy community; if you find errors in it, they are entirely due to my own shortcomings, and I welcome corrections.
