Excessive TCP CLOSE_WAIT Solution


First, "Most of the reasons for the procedure"? This is still for the program ape.

Second, the solution for too many CLOSE_WAIT connections under Linux

Scenario: the system reports a large number of "Too many open files" errors.

Cause analysis: During communication between the server and its clients, server-side sockets were left open without being closed, producing connections stuck in CLOSE_WAIT. The number of handles open on the listening port reached 1024, all of them in the CLOSE_WAIT state, which eventually exhausted the available file handles; the system then reported "Too many open files" and no further communication was possible.

The CLOSE_WAIT state occurs because the passive-close side (here, the server) never closes its end of the socket.

Solution: Two measures are feasible

First, solve (fix it in the program):

The blocking is caused by calls to the accept() method of the ServerSocket class and to the read() method on the socket's input stream, so a timeout should be set with the setSoTimeout() method (the default is 0, which means the call never times out). The timeout is judged cumulatively: once set, the blocking time of each call is deducted from that value, until the timeout is set again or a timeout exception is thrown.

For example, if one service requires three calls to read() and the timeout is set to one minute, an exception is thrown when the total time of the three read() calls for that service exceeds one minute; if services are performed repeatedly on the same socket, the timeout has to be set again before each service.
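A minimal sketch of this approach in Java (the class name, port number, and the 60-second timeout below are illustrative assumptions, not taken from the original article):

import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class TimeoutServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(8080)) {
            // accept() now throws SocketTimeoutException instead of blocking forever
            server.setSoTimeout(60_000);
            while (true) {
                try (Socket client = server.accept()) {
                    // read() on this socket's input stream also times out after 60 seconds
                    client.setSoTimeout(60_000);
                    InputStream in = client.getInputStream();
                    byte[] buf = new byte[1024];
                    while (in.read(buf) != -1) {
                        // handle the received data here
                    }
                } catch (SocketTimeoutException e) {
                    // timed out in accept() or read(); any accepted socket is closed by
                    // try-with-resources, so it does not linger in CLOSE_WAIT
                }
            }
        }
    }
}

With the timeouts in place, a client that disappears without closing its end eventually triggers the exception instead of leaving the server-side handle open indefinitely, and the try-with-resources blocks make sure the socket is actually closed.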

Second, avoid (work around it by tuning the system):

Adjust system parameters, including handle parameters and TCP/IP parameters;

Attention:

/proc/sys/fs/file-max is the limit of the number of files that can be opened by the whole system, controlled by sysctl.conf;

ulimit modifies the limit on the number of files that the current shell and its child processes can open, which is controlled by limits.conf;

lsof lists the resources used by the system, but these resources do not necessarily occupy open file descriptors; for example, shared memory, semaphores, message queues and memory mappings appear in lsof even though they do not consume open file descriptors;

Therefore, what needs to be adjusted is the open-file limit for the current user's processes, i.e. the configuration in limits.conf;

If cat /proc/sys/fs/file-max shows 65536 or greater, there is no need to modify that value;

If the "open files" value shown by ulimit -a is less than 4096 (the default is 1024), raise it to 8192 as follows:

1. Log in as root and modify the file /etc/security/limits.conf:

vim /etc/security/limits.conf

Add:

xxx - nofile 8192

Here xxx is the user name; to make the setting apply to all users, change it to *. Choose the value according to the hardware; do not set it too large.

#<domain> <type> <item> <value>* Soft nofile 8192 * Hard Nofi Le 8192

# Every user's processes may each use 8192 file descriptors.

2. Make these limits take effect.

Make sure the files /etc/pam.d/login and /etc/pam.d/sshd contain the following line:

session required pam_limits.so

The user must then log in again for the change to take effect.

3. Run ulimit -a in bash to check whether the limit has been modified.

Next, adjust the TCP/IP keepalive parameters. First, the temporary method (takes effect immediately, but reverts to the default values after the server is restarted):

sysctl -w net.ipv4.tcp_keepalive_time=600
sysctl -w net.ipv4.tcp_keepalive_probes=2
sysctl -w net.ipv4.tcp_keepalive_intvl=2

Note: whether a Linux kernel parameter adjustment is reasonable has to be verified by observation; watch its effect during business peaks.

Second, if the changes above work as expected, make the following changes so that they take effect permanently.

vi /etc/sysctl.conf

If the following information does not exist in the configuration file, add:

net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15

After editing /etc/sysctl.conf, restart the network for the changes to take effect:

/etc/rc.d/init.d/network restart

Then execute the sysctl command (typically sysctl -p) to make the changes take effect; at that point the configuration is essentially complete.

------------------------------------------------------------

Reason for modification:

When the client, for whatever reason, sends the FIN first, the server becomes the passive closer. If the server then does not actively close its socket and send its own FIN back to the client, the server's socket remains in the CLOSE_WAIT state (rather than moving on to LAST_ACK). A CLOSE_WAIT connection is normally held for at least 2 hours (the system's default timeout is 7200 seconds, i.e. 2 hours). If a server-side program piles up CLOSE_WAIT connections for some reason, they tie up resources that are not released in time, and the system usually breaks down before they are cleaned up. Therefore the problem can also be mitigated by adjusting the TCP/IP parameters to shorten this time, i.e. by modifying the tcp_keepalive_* family of parameters:

tcp_keepalive_time:

/proc/sys/net/ipv4/tcp_keepalive_time

INTEGER, the default value is 7200 (2 hours)

How often TCP sends keepalive messages when keepalive is enabled, i.e. how long a connection may stay idle before the first keepalive probe is sent. The recommended value is 1800 seconds.

tcp_keepalive_probes: INTEGER

/proc/sys/net/ipv4/tcp_keepalive_probes

INTEGER, the default value is 9

The number of keepalive probes TCP sends before deciding that the connection has been broken. (Note: probes are only sent if the SO_KEEPALIVE socket option is enabled on the socket.) The default usually does not need to be changed, although it can be lowered somewhat depending on the situation; 5 is a more appropriate value.
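Because the kernel only sends these probes when SO_KEEPALIVE is enabled, the server program has to opt in on each socket for the tcp_keepalive_* parameters to matter; a minimal Java sketch (the helper class and method name are illustrative, not from the original article):

import java.io.IOException;
import java.net.Socket;

public class KeepAliveUtil {
    // Enable SO_KEEPALIVE on an accepted connection so that the kernel's
    // tcp_keepalive_* parameters apply to it; without this option the
    // kernel never sends keepalive probes on the socket.
    public static void enableKeepAlive(Socket socket) throws IOException {
        socket.setKeepAlive(true);
    }
}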

tcp_keepalive_intvl: INTEGER

/proc/sys/net/ipv4/tcp_keepalive_intvl

INTEGER, the default value is 75

How often a probe is resent when the previous probe gets no acknowledgement, i.e. the interval between keepalive probe packets sent before the connection is declared dead. Multiplied by tcp_keepalive_probes, this gives the time needed to kill a connection that has stopped responding once probing starts. The default is 75 seconds, which means an unresponsive connection is dropped after roughly 11 minutes of probing. (For normal applications this value is somewhat large and can be reduced as needed; web-type servers in particular need it smaller, and 15 is a more appropriate value.)
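As a rough check on these numbers: with the default settings, a peer that has stopped responding is declared dead about 9 × 75 = 675 seconds (roughly 11 minutes) after probing begins, or 7200 + 675 = 7875 seconds after the connection last carried data; with the values suggested in this article (tcp_keepalive_time = 1800, tcp_keepalive_probes = 3, tcp_keepalive_intvl = 15), detection takes about 3 × 15 = 45 seconds after probing begins, or 1800 + 45 = 1845 seconds in total.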

Results after the modification:

1. The system no longer reports the "Too many open files" error.

2. Sockets no longer stay in the TIME_WAIT state for long.

On Linux, the following command can be used to view the server's TCP connection states (count of connections per state):

netstat -n | awk '/^tcp/ {++s[$NF]} END {for (a in s) print a, s[a]}'


This article is from the "Smurf Linux ops" blog; please keep this source: http://jin771998569.blog.51cto.com/2147853/1688253
