TIME_WAIT causing the "Cannot assign requested address" error


1. Description of the problem

Sometimes, when a Redis client (PHP or Java) connects to a Redis server, it fails with the error "Cannot assign requested address".

The reason is that the client connects to the server very frequently, and each connection is closed after a very short time, leaving a large number of sockets in the TIME_WAIT state. Eventually a new connection can no longer bind a local port, hence "Cannot assign requested address".

We can check the state of the connections to 127.0.0.1:6380 with netstat -nat | grep 127.0.0.1:6380. You will find a large number of connections in TIME_WAIT.

Many people's first thought is to modify kernel parameters to solve this:

Run the following commands to change two kernel parameters:

sysctl -w net.ipv4.tcp_timestamps=1    # enable TCP timestamps; if this is 0, the next setting has no effect
sysctl -w net.ipv4.tcp_tw_recycle=1    # enable fast recycling of TIME_WAIT sockets

In fact, this does not get at the nature of the problem. Let's first understand how Redis handles client connections and how TCP's TIME_WAIT state works.

2. How Redis handles client connections (reference: http://redis.io/topics/clients)

1) Establishing a connection (TCP connection)

Redis accepts connections from clients on a TCP port or a Unix socket. When a connection is established, Redis does the following: first, the client socket is set to non-blocking, because Redis uses a non-blocking multiplexing model for network events; then the TCP_NODELAY option is set on the socket to disable Nagle's algorithm; finally, a readable file event is created to listen for data arriving on this client socket.

When the client connection is initialized, Redis checks the current number of connections against the configured maxclients value. If the current number of connections has already reached maxclients, the connection cannot be accepted: Redis returns a connection error directly to the client and immediately closes the connection.
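As a rough client-side sketch of this behavior (assuming the phpredis extension and a server on 127.0.0.1:6379), once maxclients is reached the server answers a new connection with an error and closes it, which a client typically sees as a failed command or a thrown RedisException:

<?php
// Hypothetical illustration: keep opening connections until the server refuses one.
$clients = array();
try {
    for ($i = 0; $i < 20000; $i++) {
        $r = new Redis();
        $r->connect('127.0.0.1', 6379);
        $r->ping();            // fails once the server has reached maxclients
        $clients[] = $r;       // keep the object so the connection stays open
    }
} catch (RedisException $e) {
    echo "connection #$i rejected: " . $e->getMessage() . "\n";
}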

2) Request processing order

If multiple clients are connected to Redis and all of them send commands, which client's request does the Redis server process first? The answer is not deterministic; it mainly depends on two factors: the numeric value of the file descriptor associated with each client socket, and the order in which the kernel reports events for those clients.

Redis reads data from a client as follows: it calls read() once on the socket that triggered the event, and reads only once (rather than draining everything on the socket) to prevent a client that keeps sending commands from monopolizing the server for a long time. Of course, whatever that single read() returns is then executed in order, no matter how many commands it contains. This guarantees fair treatment of each client's commands.
3) The maximum number of connections (maxclients)

In Redis 2.4 the maximum number of connections was hard-coded; since version 2.6 it is configurable. The default value of maxclients is 10000, and you can also change it in redis.conf.

Of course, this value is only what Redis would like; Redis also respects the operating system's limit on the number of file descriptors the process may use. At startup, Redis checks the system's soft limit on open file descriptors. If that limit is smaller than the desired maximum number of connections plus 32, the maxclients setting cannot take effect and Redis lowers it to what the system allows (the extra 32 is because Redis itself uses up to 32 file descriptors, so connections can use all available descriptors minus 32).

When this happens (the maxclients setting cannot take effect), the Redis startup log records it. For example, the following command asks for a maximum of 100000 clients, so Redis needs 100000 + 32 file descriptors; with the system's file descriptor limit set to 10144, Redis can only set maxclients to 10144 - 32 = 10112.

$ ./redis-server --maxclients 100000
[41422] 11:28:33.179 # Unable to set the max number of files limit to 100032 (Invalid argument), setting the max clients configuration to 10112.

So when you want to raise the maxclients value, it is best to adjust your system limits as well; getting into the habit of reading the startup log will also reveal this problem.

How you set the limits depends on your needs: you can change them for the current session only, or change the system defaults via sysctl. For example:

ulimit -Sn 100000   # this only works if the hard limit is big enough
sysctl -w fs.file-max=100000
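To see what limit is actually in effect at run time, you can ask the server itself; a minimal sketch, assuming the phpredis extension (config() and info() map to the CONFIG GET and INFO commands):

<?php
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// maxclients as Redis finally set it, after reconciling it with the descriptor limit
$max  = $redis->config('GET', 'maxclients');
// current number of client connections
$info = $redis->info('clients');

printf("maxclients=%s connected_clients=%s\n",
       $max['maxclients'], $info['connected_clients']);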

4) Output buffer size limits

The size of Redis output (that is, command return values) is often not controllable: a simple command may produce a large amount of return data, or so many commands may be executed that return data is generated faster than it can be sent to the client. Either way, messages accumulate, the output buffer keeps growing, too much memory is consumed, and in the worst case the system crashes.

So Redis has protection mechanisms to avoid this. They apply different output buffer limits to different types of clients, and there are two kinds of limit: a hard size limit, where the client connection is closed as soon as its buffer exceeds a certain size; and a soft limit, where the connection is closed if the client's buffer keeps occupying too much space for a certain period of time.

The policy differs per client type. For normal clients, the limit is 0, i.e. unlimited, because normal clients usually use a blocking request/response pattern (send a request, wait for the reply, send the next request), which does not cause the output buffer to accumulate. For pub/sub clients, the hard limit is 32 MB: when the output buffer exceeds 32 MB the connection is closed; the soft limit is 8 MB sustained for 60 seconds: when the buffer stays above 8 MB for 60 seconds the connection is closed. For slave clients, the hard limit is 256 MB, and the soft limit is 64 MB sustained for 60 seconds.

All three of the above rules are configurable, either with the CONFIG SET command or by editing the redis.conf file.
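For example, to adjust the pub/sub limits at run time, something like the following works; a sketch assuming the phpredis extension, with sizes given in plain bytes (the equivalent redis.conf directive is client-output-buffer-limit):

<?php
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// pub/sub clients: hard limit 32 MB, soft limit 8 MB sustained for 60 seconds
$redis->config('SET', 'client-output-buffer-limit', 'pubsub 33554432 8388608 60');

// read back the setting for all client classes
print_r($redis->config('GET', 'client-output-buffer-limit'));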

5) Input buffer size limit

Redis's limit on the input buffer is more brutal: when a client sends a request larger than 1 GB, the server closes the connection directly. This effectively prevents bugs on the client or server side from blowing up the input buffer.

6) Client timeout

For current Redis versions, the server by default never closes idle clients. But you can change this default and configure the timeout you want, so that, for example, a client that has not interacted with the server for longer than the timeout is closed. As before, this can be configured with the CONFIG SET command or in redis.conf.

It is worth noting that the timeout only applies to normal clients; for pub/sub clients a long idle state is normal.

In addition, the actual timeout may not be as precise as configured, because Redis does not use timers or a full scan of all clients to detect timeouts; instead it checks clients incrementally, a portion at a time. As a result, with a timeout configured as 10 s, the client may in practice only be closed after, say, 12 s.
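For example, to close normal clients after 5 minutes of inactivity; a sketch assuming the phpredis extension (the corresponding redis.conf directive is timeout, and 0, the default, means idle clients are never closed):

<?php
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

print_r($redis->config('GET', 'timeout'));   // 0 by default: never close idle clients

$redis->config('SET', 'timeout', '300');     // close normal clients idle for over 300 s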


3. The TCP TIME_WAIT state

The end that actively closes the socket enters the TIME_WAIT state and stays there for 2 MSL. MSL is the Maximum Segment Lifetime, the longest time an IP packet can survive on the network before it disappears; on Windows it defaults to 240 seconds. RFC 1122 recommends an MSL of 2 minutes, while TCP implementations derived from Berkeley traditionally use 30 seconds, so the TIME_WAIT state generally lasts 1 to 4 minutes.


Reasons for the existence of the TIME_WAIT state:

1) Reliable termination of the TCP full-duplex connection (waiting 2 MSL in TIME_WAIT is a best effort to make sure the four-way close completes normally).

In TCP, an established connection is torn down with a four-way handshake. If any step goes missing, the connection is left half-dead, and the resources it occupies are never released.

In the four-way close, the final ACK is sent by the side that closes actively. If that final ACK is lost, the passive side retransmits its final FIN, so the active side must keep enough state to be able to resend the final ACK. If it kept no such state, it would respond with an RST segment instead, which the other end would interpret as an error rather than a normal close. So, to terminate a TCP full-duplex connection cleanly, the loss of any of the four closing segments must be handled, and the actively closing side must keep state information, i.e. enter the TIME_WAIT state.

Consider the four-way handshake in which the client closes actively and the server closes passively:

1) The client sends a FIN segment and enters FIN_WAIT_1.

2) The server receives the FIN segment, replies with an ACK, and enters CLOSE_WAIT.

3) The client receives the ACK of its FIN and enters FIN_WAIT_2.

4) The server sends its own FIN segment and enters LAST_ACK.

5) The client receives the FIN segment, sends an ACK for it, enters TIME_WAIT, and starts the TIME_WAIT timer with a timeout of 2 MSL.

6) The server receives the ACK of its FIN and enters CLOSED.

7) If the client receives nothing within the 2 MSL, the TIME_WAIT timer expires and the client enters CLOSED.

If segments were never delayed or lost and acknowledgements were never delayed or lost, TIME_WAIT would not need to exist. On a real network, however, delays and losses do happen, so consider the following special case:

The client sends the last ACK of the four-way close and enters TIME_WAIT; suppose that instead of staying there for 2 MSL it immediately closes the connection and destroys all connection resources. In the following situation the four-way close can then no longer complete normally:

The client's ACK is lost in the network, so the server never receives the final ACK; its retransmission timer fires and it retransmits the FIN to the client. Since the client has already released all state for this connection, it knows nothing about this FIN when it arrives, so it replies with an RST segment; the server receives the RST and concludes that the connection ended abnormally rather than being closed normally.

Therefore, waiting 2 MSL in the TIME_WAIT state is what makes it possible to handle the loss of that final ACK (whose maximum lifetime is MSL): the peer's retransmitted FIN (also with a maximum lifetime of MSL) can still be received, and the ACK can be retransmitted.

Does it follow that, as long as the actively closing side stays in TIME_WAIT for 2 MSL, the four-way close always completes normally?

The answer is no. Consider the following case:

The ACK sent from TIME_WAIT is lost; the peer in LAST_ACK times out on its retransmission timer and retransmits the FIN; unfortunately this FIN is lost as well. The actively closing side waits out 2 MSL in TIME_WAIT without receiving any segment and enters CLOSED, while the passively closing side never receives the final ACK. So even if the actively closing side stays in TIME_WAIT for 2 MSL, the four-way close is not guaranteed to complete normally.
2) Ensuring that old segments disappear from the network and cannot affect a newly established connection

Consider this case: the ACK sent by the actively closing side in TIME_WAIT is delayed in the network and does not reach the other end in time (but has not yet exceeded MSL), so the passively closing side retransmits its FIN; after that retransmission the delayed ACK finally arrives and the passive side enters CLOSED. If the actively closing side had entered CLOSED immediately after sending its ACK (that is, without waiting 2 MSL), the old connection would no longer exist at this point:

Now suppose that after this TCP connection from the client (192.168.0.1:23) to the server (192.168.1.1:6380) is closed, we immediately establish a new TCP connection between the same IP addresses and ports, again from the client (192.168.0.1:23) to the server (192.168.1.1:6380). When the retransmitted FIN of the old connection reaches the active closer, it is accepted by the new connection and causes the new connection to be reset, which is clearly not what we want.

A new connection can only be established after both the actively closing side and the passively closing side have entered CLOSED. Therefore, the situations in which an old segment is most likely to affect a new connection are these:

Segments that the actively closing side sent before entering TIME_WAIT may be delayed in the network, but since TIME_WAIT lasts 2 MSL, they are guaranteed to disappear from the network in that time (their maximum lifetime is MSL). On the passively closing side, the segment most likely to affect a new connection is the retransmitted FIN discussed above, sent because the delayed ACK had not yet arrived. That FIN is sent before the TIME_WAIT timer expires, and since the timer runs for more than MSL, within the remaining MSL the FIN either dies of old age in the network or reaches the actively closing side and is handled there, so it cannot affect a newly established connection.


The newer SCTP protocol avoids the TIME_WAIT state altogether by adding a verification tag to its packet header.

3) Kernel-level keepalive and TIME_WAIT tuning

Edit /etc/sysctl.conf (vi /etc/sysctl.conf) and add:

net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_fin_timeout = 30
net.core.netdev_max_backlog = 8096

After editing, run sysctl -p for the changes to take effect.
Notes on the parameters above:

/proc/sys/net/ipv4/tcp_tw_reuse
Whether sockets in the TIME_WAIT state may be reused for new TCP connections.

/proc/sys/net/ipv4/tcp_tw_recycle
Enables faster recycling of sockets in the TIME_WAIT state.

Enabling tcp_tw_reuse and tcp_tw_recycle may occasionally produce warnings such as "Warning, got duplicate tcp line" or "Warning, got bogus tcp line". These warnings refer to two identical TCP connections existing at the same time, which can happen when a connection is torn down and re-established very quickly using the same address and port. In practice this rarely happens, although the settings above do increase the chance of it occurring. The warning is harmless: it does not degrade system performance, and the connections keep working.

/proc/sys/net/ipv4/tcp_keepalive_time
How often TCP sends keepalive messages when keepalive is enabled. The default is 2 hours.

/proc/sys/net/ipv4/tcp_fin_timeout
The best value is 30, the same as BSD. FIN_WAIT_1 is the state the initiator enters when it actively closes the TCP connection: after sending its FIN it waits for the peer's ACK. For a connection whose peer has disappeared, TCP stays in FIN_WAIT_2 for this amount of time; the peer may never finish closing the connection, or its process may have died unexpectedly.

/proc/sys/net/core/netdev_max_backlog
The maximum number of packets allowed to queue when an interface receives packets faster than the kernel can process them.

4) Optimizing TIME_WAIT handling


TCP on Linux is connection-oriented, and in practice it is often necessary to detect whether a connection is still usable. An unusable connection falls into two cases: a) the peer closed the connection normally; b) the peer went away abnormally: the device lost power, the program crashed, or the network was cut. In the second case the peer cannot notify us, so the connection lingers forever and wastes resources. The TCP stack has a keepalive option that can proactively probe whether a socket is still alive, but its default interval is very long. The global settings can be changed in /etc/sysctl.conf by adding:

net.ipv4.tcp_keepalive_intvl = 20
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_time = 60

Per socket, the options can be set in the program like this:

int keepalive = 1;     /* enable the keepalive option */
int keepidle = 60;     /* probe if the connection has carried no data for 60 seconds */
int keepinterval = 5;  /* 5 seconds between probes */
int keepcount = 3;     /* number of probes; if the first probe is answered, no more are sent */

setsockopt(rs, SOL_SOCKET, SO_KEEPALIVE, (void *)&keepalive, sizeof(keepalive));
setsockopt(rs, SOL_TCP, TCP_KEEPIDLE, (void *)&keepidle, sizeof(keepidle));
setsockopt(rs, SOL_TCP, TCP_KEEPINTVL, (void *)&keepinterval, sizeof(keepinterval));
setsockopt(rs, SOL_TCP, TCP_KEEPCNT, (void *)&keepcount, sizeof(keepcount));


4. Solving the problem

Now that we understand how Redis handles client connections and how TCP TIME_WAIT works, we can reproduce the problem above by quickly opening 2000 connections:


<?php
$num = 2000;
for ($i = 0; $i < $num; $i++) {
    $redis = new Redis();
    $redis->connect('127.0.0.1', 6379);
    // each new Redis() replaces the previous object; its connection is closed
    // by the client and left behind in TIME_WAIT
}
sleep(10);   // keep the process alive while you inspect the sockets
Then check the state with netstat -nat | grep 127.0.0.1:6379; you will find many connections in TIME_WAIT.

If you increase $num to 40000 or so, you get the error: Cannot assign requested address.

So if this problem occurs when your client (PHP) connects to Redis, you have a bug in your program: you are instantiating Redis inside a loop (new Redis() each time), creating a new connection on every iteration.

The solution is not to modify kernel parameters, but to wrap the Redis connection in a singleton, ensuring there is only one Redis connection instance per process.

class Class_redis {

    private $_redis;
    private static $_instance = null;

    private function __construct() {
        $this->_redis = new Redis();
        $this->_redis->connect('127.0.0.1', 6379);
    }

    public static function getInstance() {
        if (self::$_instance === null) {
            self::$_instance = new self();
        }
        return self::$_instance;
    }

    public function getRedis() {
        return $this->_redis;
    }
}
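Usage then looks something like this (a sketch): every caller within the process shares the same underlying connection.

<?php
$redis = Class_redis::getInstance()->getRedis();
$redis->set('foo', 'bar');
echo $redis->get('foo');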



5. Understanding Redis performance indicators

The data output by the info command falls into 10 sections:

server: information about the Redis server
clients: client connection information
memory: memory usage information
persistence: RDB and AOF persistence information
stats: general statistics
replication: master/slave replication information
cpu: CPU usage statistics
commandstats: Redis command statistics
cluster: Redis Cluster information
keyspace: per-database key statistics

The info command can take a section parameter to return only the data for that section. For example, info memory returns only memory-related data.
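The same per-section query can be issued from PHP; a sketch assuming the phpredis extension, whose info() method accepts the section as an optional argument:

<?php
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$mem = $redis->info('memory');   // only the memory-related fields
echo $mem['used_memory'], "\n";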
