Linux TCP System Parameters

Performance tuning is only worthwhile when measurements show it is needed: compare the data you collect against benchmark data first, and do not adjust these parameters blindly.

1. TCP Keepalive Settings

echo 1800 >/proc/sys/net/ipv4/tcp_keepalive_time
echo >/proc/sys/net/ipv4/tcp_keepalive_intvl
echo 5 >/proc/sys/net/ipv4/tcp_keepalive_probes

Keepalive is a TCP liveness timer. Once a TCP connection has been established and has stayed idle (no traffic in either direction) for tcp_keepalive_time seconds, the server kernel sends a probe packet to the client to check the state of the connection (the client may have crashed, the application may have been killed, the host may be unreachable, and so on). If no answer (an ACK packet) arrives, the kernel sends another probe after tcp_keepalive_intvl seconds, and keeps doing so until it receives an ACK from the peer or has sent tcp_keepalive_probes probes. With a 15-second interval, the probes fall at 15s, 30s, 45s, 60s, and 75s. If tcp_keepalive_probes probes go unanswered, the TCP connection is dropped.
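As a rough illustration of the timing described above, the sketch below adds up how long an idle, dead peer can go unnoticed. The function name is hypothetical, and the 15-second interval is an assumption inferred from the 15s to 75s probe schedule rather than a value given in the commands.

```python
def keepalive_detection_seconds(keepalive_time, keepalive_intvl, keepalive_probes):
    """Approximate seconds from the last traffic until a dead peer is dropped.

    The connection idles for keepalive_time, then up to keepalive_probes
    probes are tried keepalive_intvl seconds apart.
    """
    return keepalive_time + keepalive_intvl * keepalive_probes


# 1800 s idle time and 5 probes come from the commands above;
# the 15 s interval is an assumed value.
print(keepalive_detection_seconds(1800, 15, 5))  # 1875
```

With these figures a vanished client is detected a little over 31 minutes after its last packet, and almost all of that time is the idle timer, so tcp_keepalive_time is the value that dominates.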

2. SYN Cookie Settings

echo 0 >/proc/sys/net/ipv4/tcp_syncookies

In CentOS 5.3 this option defaults to 1, which enables SYN cookies. We recommend leaving it off and turning it on only once you are sure the server is under a SYN flood attack, at which point SYN cookies are an effective defense. SYN flood attacks can also be rejected with iptables rules.

3. TCP Connection Establishment Settings

echo 8192 >/proc/sys/net/ipv4/tcp_max_syn_backlog
echo 2 >/proc/sys/net/ipv4/tcp_syn_retries
echo 2 >/proc/sys/net/ipv4/tcp_synack_retries

tcp_max_syn_backlog: the length of the SYN queue, often called the incomplete-connection queue. The kernel keeps in this queue the TCP connections in the SYN_RECV state (half-open connections), that is, connection requests for which the client's acknowledgment (ACK) has not yet arrived. Increase this value to hold more connections that are waiting to complete.
tcp_syn_retries: when opening a new TCP connection the kernel sends a SYN packet; this value sets how many times the kernel retransmits that SYN before it gives up on the connection. The default is 5. For a heavily loaded physical network with good connectivity, it can be lowered to 2.
tcp_synack_retries: for a SYN received from a remote host, the kernel replies with a SYN+ACK to confirm receipt and then waits for the remote acknowledgment (ACK). This value sets how many times the kernel retransmits the SYN+ACK. The default is 5, and it can be lowered to 2.
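To get a feel for what lowering the retry count buys, the sketch below estimates how long a connection attempt lingers before the kernel gives up, assuming the retransmission timeout starts at an initial value and doubles after every attempt. The function name and the 1-second initial timeout are assumptions for illustration (older kernels start at 3 seconds), and the cap the kernel places on the timeout is ignored.

```python
def seconds_until_give_up(retries, initial_rto=1.0):
    """Estimate the time spent on a handshake packet that is never answered.

    The first transmission plus `retries` retransmissions each wait one
    timeout, and the timeout doubles after every attempt.
    """
    return sum(initial_rto * 2 ** attempt for attempt in range(retries + 1))


for retries in (5, 2):
    # With the assumed 1 s initial timeout: 5 retries -> 63.0, 2 retries -> 7.0
    print(retries, seconds_until_give_up(retries))
```

Under these assumptions, dropping from 5 retries to 2 frees a half-open slot after about 7 seconds instead of about a minute.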

4. TCP Connection Teardown Settings

echo >/proc/sys/net/ipv4/tcp_fin_timeout
echo 15000 >/proc/sys/net/ipv4/tcp_max_tw_buckets
echo 1 >/proc/sys/net/ipv4/tcp_tw_reuse
echo 1 >/proc/sys/net/ipv4/tcp_tw_recycle

tcp_fin_timeout: when the local side closes a TCP connection first, it sends a FIN; once the remote ACK has arrived but the remote FIN has not, the connection sits in the FIN_WAIT_2 state. If the remote application has been closed, the network is unreachable (for example, the cable was unplugged), or the remote program has hung, the local side would keep the connection in FIN_WAIT_2. tcp_fin_timeout specifies how long a FIN_WAIT_2 connection is kept. Each FIN_WAIT_2 connection takes up to 1.5 KB of memory. The default is 60 seconds, and you can lower it to 30 seconds or even 10 seconds.
tcp_max_tw_buckets: the maximum number of TIME_WAIT sockets the system handles at once. If the number of TIME_WAIT connections exceeds this value, the system destroys the excess sockets immediately and prints a warning. The limit exists mainly to prevent simple DoS attacks that would otherwise consume memory; with too many TIME_WAIT sockets the system can run out of memory. The default is 180000, and you can set it to between 5000 and 30000.
tcp_tw_reuse: whether a TIME_WAIT connection may be reused for a new TCP connection.
tcp_tw_recycle: enables fast recycling of TIME_WAIT connections.
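Before changing any of these limits, it helps to measure how many sockets actually sit in each state, in line with the advice to compare against collected data. The sketch below counts states from text in the /proc/net/tcp format; the helper name and the sample rows are made up for illustration, and only the state column is interpreted.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from /proc/net/tcp-formatted text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3:
            counts[TCP_STATES.get(fields[3].upper(), fields[3])] += 1
    return counts


# Hypothetical, shortened sample rows (real rows have more columns).
SAMPLE = """\
  sl  local_address rem_address   st
   0: 0100007F:1F90 00000000:0000 0A
   1: 0100007F:1F90 0100007F:C350 06
   2: 0100007F:1F90 0100007F:C351 06
   3: 0100007F:1F90 0100007F:C352 05
"""

print(count_tcp_states(SAMPLE))
```

On a live Linux host the same function can be fed the contents of /proc/net/tcp, and the TIME_WAIT and FIN_WAIT2 counts can then be compared with tcp_max_tw_buckets and the memory estimate above.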

5. TCP Memory Usage Settings

echo 16777216 >/proc/sys/net/core/rmem_max
echo 16777216 >/proc/sys/net/core/wmem_max
cat /proc/sys/net/ipv4/tcp_mem
echo "4096 65536 16777216" >/proc/sys/net/ipv4/tcp_rmem
echo "4096 87380 16777216" >/proc/sys/net/ipv4/tcp_wmem

rmem_max: the maximum receive buffer size in bytes, which bounds the receive window; adjust it according to the bandwidth-delay product (BDP).
wmem_max: the maximum send buffer size in bytes, which bounds the send window; adjust it according to the BDP.
tcp_mem [low, pressure, high]: TCP uses these three values to track its memory usage and limit resource consumption. The kernel normally computes them at boot from the total amount of available memory. If "Out of socket memory" errors appear, try adjusting this parameter.
1) low: while TCP uses fewer memory pages than this value, it does not try to free memory.
2) pressure: once TCP uses more memory pages than this value, it enters pressure mode and tries to stabilize its memory usage; it leaves pressure mode when usage falls back to the low value.
3) high: the maximum number of memory pages that all TCP sockets together may use to queue buffered data.
tcp_rmem [min, default, max]
1) min: the amount of receive-buffer memory reserved for each TCP socket; a socket keeps at least this much receive buffer even when memory is tight.
2) default: the default amount of receive-buffer memory for a TCP socket. This value overrides the rmem_default value used by other protocols.
3) max: the maximum amount of receive-buffer memory for each TCP socket. This value does not override rmem_max, and it does not apply to sockets that set their buffer size explicitly with SO_RCVBUF.
tcp_wmem [min, default, max]: the same as above (tcp_rmem), but for the send buffer.
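Sizing rmem_max and wmem_max from the BDP comes down to one multiplication: bandwidth times round-trip time. The sketch below works one example; the function name and the 1 Gbit/s, 100 ms link are hypothetical figures chosen for illustration, not measurements from the original.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes: data in flight on a full pipe."""
    return bandwidth_bits_per_s * rtt_ms // (8 * 1000)


# Hypothetical link: 1 Gbit/s with a 100 ms round-trip time.
needed = bdp_bytes(1_000_000_000, 100)
print(needed)              # 12500000
print(needed <= 16777216)  # True: the 16 MB maximum above covers this link
```

A buffer smaller than the BDP leaves the link partly idle, while a much larger one only costs memory, which is why the maximum is worth deriving from measured bandwidth and latency rather than copied.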


-----------------------------------a case----------------------------------------

We recently ran into connect failures. After analysis and testing, we confirmed that they are related to the proc parameters tcp_tw_recycle and tcp_timestamps.

1. Symptoms
First symptom: module A reached service S through the NAT gateway without problems, while module B's connect calls through the same NAT gateway failed repeatedly. Packet captures showed that service S received the SYN packets but never replied with a SYN+ACK. Module A had TCP timestamps disabled; module B had them enabled.
Second symptom: module C ran on different hosts (timestamps enabled) that reached the same service S through the NAT gateway (one egress IP). Connects from host C1 succeeded, while connects from host C2 failed.

2. Analysis
The symptoms above clearly point at TCP timestamps. Reading the Linux 2.6.32 kernel source shows that when tcp_tw_recycle and tcp_timestamps are both enabled, the timestamps in connect requests (SYNs) coming from the same source IP must keep increasing within a 60-second window.
Source function: tcp_v4_conn_request(), the server-side function in the TCP layer that handles the SYN packet of the three-way handshake.
Source snippet:
if (tmp_opt.saw_tstamp &&
    tcp_death_row.sysctl_tw_recycle &&
    (dst = inet_csk_route_req(sk, req)) != NULL &&
    (peer = rt_get_peer((struct rtable *)dst)) != NULL &&
    peer->v4daddr == saddr) {
        if (get_seconds() < peer->tcp_ts_stamp + TCP_PAWS_MSL &&
            (s32)(peer->tcp_ts - req->ts_recent) >
                                        TCP_PAWS_WINDOW) {
                NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
                goto drop_and_release;
        }
}
tmp_opt.saw_tstamp: the incoming SYN carries the TCP timestamp option (the peer has tcp_timestamps enabled)
sysctl_tw_recycle: the tcp_tw_recycle option is enabled on the local system
TCP_PAWS_MSL: 60s; this condition checks that the last TCP communication from that source IP took place within the past 60 seconds
TCP_PAWS_WINDOW: 1; this condition checks that the timestamp from that source IP's last TCP communication is greater than the timestamp in the current SYN

Analysis: hosts client1 and client2 reach the server through a NAT gateway (one IP address). Because a TCP timestamp counts time since system boot, the timestamps of client1 and client2 differ. Per the SYN-handling code above, with tcp_tw_recycle and tcp_timestamps both enabled, the host with the larger timestamp connects successfully, while the host with the smaller timestamp fails.
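The check is easy to replay outside the kernel. The sketch below is a simplified Python model of the condition in the snippet above, not kernel code: the class, the method names, and the sample timestamps are invented for illustration, the per-IP record is updated on every accepted SYN rather than when a connection enters TIME-WAIT, and 32-bit wraparound of the timestamp difference is ignored.

```python
TCP_PAWS_MSL = 60     # seconds, as in the kernel snippet
TCP_PAWS_WINDOW = 1   # timestamp ticks, as in the kernel snippet


class PeerTable:
    """Remembers the last timestamp seen per source IP (simplified model)."""

    def __init__(self):
        self.peers = {}  # source IP -> (last_ts, wall-clock second it was seen)

    def accept_syn(self, src_ip, syn_ts, now):
        """Return True if the SYN passes the tw_recycle timestamp check."""
        if src_ip in self.peers:
            last_ts, seen_at = self.peers[src_ip]
            recent = now < seen_at + TCP_PAWS_MSL
            went_backwards = last_ts - syn_ts > TCP_PAWS_WINDOW
            if recent and went_backwards:
                return False  # dropped silently: no SYN+ACK is sent
        self.peers[src_ip] = (syn_ts, now)
        return True


table = PeerTable()
nat_ip = "203.0.113.7"  # hypothetical shared egress IP of the NAT gateway

# Host C1 booted long ago (large timestamp); host C2 booted recently.
print(table.accept_syn(nat_ip, syn_ts=900_000, now=0))   # True
print(table.accept_syn(nat_ip, syn_ts=5_000, now=10))    # False
print(table.accept_syn(nat_ip, syn_ts=5_000, now=100))   # True
```

The second SYN is rejected only because it shares a source IP with the first and arrives within 60 seconds; once that window has passed, the same host gets through, which matches the intermittent failures seen behind the NAT gateway.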

Parameters:
/proc/sys/net/ipv4/tcp_timestamps: turns the TCP timestamp option on or off
/proc/sys/net/ipv4/tcp_tw_recycle: shortens the timeout after which TIME_WAIT sockets are released

3. Workaround
echo 0 >/proc/sys/net/ipv4/tcp_tw_recycle
tcp_tw_recycle is off by default, but many servers have it turned on to improve performance.
To resolve the problem, we recommend turning off tcp_tw_recycle rather than tcp_timestamps: tcp_tw_recycle has no effect once TCP timestamps are disabled, whereas TCP timestamps can stay enabled and work on their own.
Source function: tcp_time_wait()
Source snippet:
if (tcp_death_row.sysctl_tw_recycle && tp->rx_opt.ts_recent_stamp)
        recycle_ok = icsk->icsk_af_ops->remember_stamp(sk);
......

if (timeo < rto)
        timeo = rto;

if (recycle_ok) {
        tw->tw_timeout = rto;
} else {
        tw->tw_timeout = TCP_TIMEWAIT_LEN;
        if (state == TCP_TIME_WAIT)
                timeo = TCP_TIMEWAIT_LEN;
}

inet_twsk_schedule(tw, &tcp_death_row, timeo,
                   TCP_TIMEWAIT_LEN);

With tcp_timestamps and tcp_tw_recycle both enabled, the release timeout of a TIME_WAIT socket depends on the RTO; otherwise the timeout is TCP_TIMEWAIT_LEN, that is, 60s.
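The branch structure of that snippet can be restated in a few lines. The sketch below is a Python paraphrase of the logic shown above, with an invented function name and made-up inputs in seconds; it models only the choice of timeout, not the kernel's timer machinery.

```python
TCP_TIMEWAIT_LEN = 60  # seconds


def timewait_timeouts(recycle_ok, state_is_time_wait, timeo, rto):
    """Return (tw_timeout, timeo) as chosen by the branches in the snippet."""
    if timeo < rto:
        timeo = rto
    if recycle_ok:
        tw_timeout = rto
    else:
        tw_timeout = TCP_TIMEWAIT_LEN
        if state_is_time_wait:
            timeo = TCP_TIMEWAIT_LEN
    return tw_timeout, timeo


# Hypothetical inputs: a 0.7 s RTO-derived value and no caller-supplied timeout.
print(timewait_timeouts(True, True, 0, 0.7))   # (0.7, 0.7)
print(timewait_timeouts(False, True, 0, 0.7))  # (60, 60)
```

The gap between a sub-second timeout and the full 60 seconds is what makes tcp_tw_recycle attractive on busy servers.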

The kernel documentation describes this parameter as follows:
tcp_tw_recycle - BOOLEAN
Enable fast recycling TIME-WAIT sockets. Default value is 0.
It should not be changed without advice/request of technical
experts.

Source: http://blog.sina.com.cn/s/blog_781b0c850100znjd.html

On some highly concurrent web servers, net.ipv4.tcp_tw_recycle is turned on so that ports can be reclaimed quickly. With net.ipv4.tcp_tw_recycle off, the kernel does not check the timestamps of packets from the peer. With tcp_tw_recycle on, it does check them, and unfortunately the timestamps in packets sent through CMWAP jump around, so the server treats a packet whose timestamp has gone "backwards" as "retransmitted data from a recycled TIME_WAIT connection, not a new request". It drops the packet without replying, which results in heavy packet loss.
