Linux kernel TCP/IP, socket parameter tuning

Source: Internet
Author: User

Doc1:
/proc/sys/net Directory
All TCP/IP parameters are located in the /proc/sys/net directory. Note that changes made under /proc/sys/net are temporary and are lost when the system restarts.
  

Http://www.360doc.com/content/16/0715/13/25686888_575696702.shtml

/etc/sysctl.conf file

/etc/sysctl.conf is a configuration interface for changing the parameters of a running Linux system. It contains advanced options for the TCP/IP stack and the virtual memory system and can be used to control the Linux network configuration. Because changes made under /proc/sys/net are temporary, it is recommended to add TCP/IP parameter changes to /etc/sysctl.conf, save the file, and run "/sbin/sysctl -p" to apply them immediately. The specific settings below follow the article linked above:

net.core.rmem_default = 256960
net.core.rmem_max = 513920
net.core.wmem_default = 256960
net.core.wmem_max = 513920
net.core.netdev_max_backlog = 2000
net.core.somaxconn = 2048
net.core.optmem_max = 81920
net.ipv4.tcp_mem = 131072 262144 524288
net.ipv4.tcp_rmem = 8760 256960 4088000
net.ipv4.tcp_wmem = 8760 256960 4088000
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_sack = 1
net.ipv4.tcp_fack = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_max_syn_backlog = 2048
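
As a rough illustration of how the keys in this file relate to the files under /proc/sys, the following Python sketch parses sysctl.conf-style text and maps each key to a path. The helper names parse_sysctl_conf and proc_path are hypothetical, and the simple dot-to-slash mapping is an assumption that holds for the keys listed here but not for keys that embed an interface name containing a dot.

```python
def parse_sysctl_conf(text):
    """Parse sysctl.conf-style text into a {key: value} dict."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a "key = value" line
        settings[key.strip()] = value.strip()
    return settings


def proc_path(key):
    """Map a sysctl key such as net.core.somaxconn to its /proc/sys file."""
    return "/proc/sys/" + key.replace(".", "/")


# Hypothetical excerpt of /etc/sysctl.conf, using values from the list above.
sample = """
# socket tuning
net.core.somaxconn = 2048
net.ipv4.tcp_rmem = 8760 256960 4088000
"""

for key, value in parse_sysctl_conf(sample).items():
    print(proc_path(key), "<-", value)
```

Note that sysctl keys are case-sensitive, so the keys must be written in lowercase exactly as they appear under /proc/sys.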

Doc2:

There are two main interfaces to the tunable kernel variables: the sysctl command and the /proc file system. (Process-independent information in procfs is being migrated to sysfs.) The sysctl parameters of the IPv4 protocol stack live mainly under net.core and net.ipv4; the corresponding /proc locations are /proc/sys/net/core and /proc/sys/net/ipv4. A parameter appears only if the kernel was compiled with the corresponding feature.

Kernel parameters should be adjusted carefully, because they usually affect the overall performance of the system. At startup the kernel initializes these variables based on the system's available resources, and the defaults typically satisfy ordinary performance requirements.

An application communicates with a remote host through socket system calls, and each socket has a read buffer and a write buffer. The read buffer holds data sent by the remote host; if it is full, incoming data is discarded. The write buffer holds data waiting to be sent to the remote host; if it is full, the application blocks when writing. Each buffer therefore has a finite size.

Default size of socket buffer:
/proc/sys/net/core/rmem_default corresponds to net.core.rmem_default
/proc/sys/net/core/wmem_default corresponds to net.core.wmem_default
These are the default read and write buffer sizes for every type of socket. A particular socket type can set its own values to override these defaults; for example, TCP sockets use /proc/sys/net/ipv4/tcp_rmem and tcp_wmem.

Maximum size of socket buffer:
/proc/sys/net/core/rmem_max corresponds to net.core.rmem_max
/proc/sys/net/core/wmem_max corresponds to net.core.wmem_max
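
To see the default and the maximum in action, an application can read and set its own buffer size with getsockopt and setsockopt. The sketch below is a minimal Python example; the function name set_receive_buffer is hypothetical. On Linux the kernel caps the request at rmem_max and reports back twice the granted value to account for bookkeeping overhead, so the number returned is usually not the number requested.

```python
import socket


def set_receive_buffer(sock, requested_bytes):
    """Request a receive buffer size and return what the kernel reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
    # Before any setsockopt call, the socket reports the system default.
    default_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    granted = set_receive_buffer(sock, 1 << 20)  # ask for 1 MiB
    print("default:", default_size, "reported after request:", granted)
```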

/proc/sys/net/core/netdev_max_backlog corresponds to net.core.netdev_max_backlog
Defines the maximum number of packets held in a device's input queue when the interface receives packets faster than the kernel can process them.

/proc/sys/net/core/somaxconn corresponds to net.core.somaxconn
The upper limit on the accept-queue backlog that can be requested with the listen system call. Connection requests that arrive when the queue is full are discarded.

/proc/sys/net/core/optmem_max corresponds to net.core.optmem_max
The maximum ancillary buffer size allowed per socket.

TCP/IPv4 Kernel Parameters:
The socket protocol and address type are specified when the socket is created. The buffer size of a TCP socket is controlled by TCP's own parameters rather than by the core defaults.
/proc/sys/net/ipv4/tcp_rmem corresponds to net.ipv4.tcp_rmem
/proc/sys/net/ipv4/tcp_wmem corresponds to net.ipv4.tcp_wmem
These are the TCP socket read and write buffer settings. Each has three values: the first is the minimum buffer size, the middle one is the default, and the last is the maximum. The TCP default overrides the core default, but a size requested explicitly through setsockopt is still limited by the core maximum.
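
A small Python sketch of how the three fields might be read and how the core maximum constrains an explicit request. The names TcpBufferLimits, parse_tcp_buffer_limits and explicit_request_cap are hypothetical, and the cap is modeled as a plain min(), which ignores the kernel's internal overhead accounting.

```python
from typing import NamedTuple


class TcpBufferLimits(NamedTuple):
    minimum: int
    default: int
    maximum: int


def parse_tcp_buffer_limits(text):
    """Parse the contents of tcp_rmem or tcp_wmem ("min default max")."""
    low, default, high = (int(field) for field in text.split())
    return TcpBufferLimits(low, default, high)


def explicit_request_cap(requested, core_max):
    """A size requested through setsockopt is capped by the core maximum."""
    return min(requested, core_max)


# Values taken from the sample configuration earlier in this article.
limits = parse_tcp_buffer_limits("8760 256960 4088000")
print(limits)
print(explicit_request_cap(limits.maximum, 513920))
```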

/proc/sys/net/ipv4/tcp_mem
This parameter also takes three values, which define the bounds of TCP memory management. Below the first value, TCP does not consider itself under memory pressure. The second value is the number of pages at which TCP enters the memory-pressure state. The third value is the maximum number of pages that all TCP sockets together are allowed to use; beyond it, subsequent packets are discarded. The values count memory allocated globally for sockets in the system, in units of pages.
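
Because the thresholds are expressed in pages rather than bytes, it helps to convert them before comparing them with the per-socket buffer sizes. The following Python sketch does the conversion; the function name tcp_mem_in_bytes and the labels low, pressure and high are illustrative choices, and the 4096-byte page size used in the example is an assumption that should be checked on the target machine.

```python
import os


def tcp_mem_in_bytes(tcp_mem_text, page_size=None):
    """Convert the three tcp_mem page counts into byte counts."""
    if page_size is None:
        # Ask the running system for its page size.
        page_size = os.sysconf("SC_PAGE_SIZE")
    names = ("low", "pressure", "high")
    pages = [int(field) for field in tcp_mem_text.split()]
    return {name: count * page_size for name, count in zip(names, pages)}


# tcp_mem values from the sample configuration, assuming 4096-byte pages.
thresholds = tcp_mem_in_bytes("131072 262144 524288", page_size=4096)
for name, size in thresholds.items():
    print(f"{name}: {size // 2**20} MiB")
```

With 4096-byte pages the sample values work out to 512 MiB, 1024 MiB and 2048 MiB.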

/proc/sys/net/ipv4/tcp_window_scaling corresponds to net.ipv4.tcp_window_scaling
Controls TCP window scaling. The receive window field in the TCP header is 16 bits long, so without scaling the window cannot be larger than 64 KB; window scaling must be enabled to advertise a larger window.
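
The arithmetic behind window scaling can be sketched as follows, assuming the 16-bit window field of the TCP header and the maximum shift of 14 allowed by the window scale option. The function window_scale_shift is a hypothetical helper that only illustrates the relationship; the kernel chooses its scale factor with its own logic.

```python
MAX_UNSCALED_WINDOW = 0xFFFF  # largest value of the 16-bit window field
MAX_SHIFT = 14                # largest shift allowed by the window scale option


def window_scale_shift(buffer_bytes):
    """Smallest shift that lets the 16-bit field cover buffer_bytes."""
    shift = 0
    while shift < MAX_SHIFT and (MAX_UNSCALED_WINDOW << shift) < buffer_bytes:
        shift += 1
    return shift


# Buffer sizes from the sample configuration earlier in this article.
for size in (65535, 513920, 4088000):
    print(size, "bytes need a shift of", window_scale_shift(size))
```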

/proc/sys/net/ipv4/tcp_sack corresponds to net.ipv4.tcp_sack
Controls TCP selective acknowledgment (SACK), which lets the receiver tell the sender which ranges of the byte stream have arrived, so the sender can identify the missing ones. This reduces the number of segments that must be retransmitted after a loss; SACK is most useful when segments are lost frequently.

/proc/sys/net/ipv4/tcp_dsack corresponds to net.ipv4.tcp_dsack
An extension to SACK (duplicate SACK) that detects unnecessary retransmissions.

/proc/sys/net/ipv4/tcp_fack corresponds to net.ipv4.tcp_fack
Enables forward acknowledgment (FACK), which refines SACK and improves TCP congestion control. On recent kernels this is a legacy option that no longer has any effect.

Connection Management for TCP:
/proc/sys/net/ipv4/tcp_max_syn_backlog corresponds to net.ipv4.tcp_max_syn_backlog
Each connection request (SYN segment) must be queued until the local server accepts it. This variable controls the length of the TCP SYN queue for each port; connection requests that arrive when the queue is full are discarded.

/proc/sys/net/ipv4/tcp_syn_retries corresponds to net.ipv4.tcp_syn_retries
Controls how many times the kernel retransmits the initial SYN of an outgoing connection attempt (retransmission of the SYN/ACK sent in reply to an incoming SYN is governed by the separate tcp_synack_retries). A low value detects an unreachable remote host sooner. Can be changed to 3.

/proc/sys/net/ipv4/tcp_retries1 corresponds to net.ipv4.tcp_retries1
Sets how many retransmissions are attempted before TCP concludes that something is wrong and reports the suspicion to the network layer.

/proc/sys/net/ipv4/tcp_retries2 corresponds to net.ipv4.tcp_retries2
Controls how many times the kernel retransmits data to a remote host on an established connection. A low value detects a dead connection earlier, so the server can release it more quickly. Can be changed to 5.
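
To get a feel for what a given tcp_retries2 value means in wall-clock time, the sketch below sums the retransmission timeouts under exponential backoff. It assumes the retransmission timeout starts at 200 ms and is capped at 120 s (the values commonly used by the Linux kernel) and that the connection has a very low round-trip time; the result is therefore a hypothetical lower bound, and the function name retries2_timeout_ms is illustrative.

```python
RTO_MIN_MS = 200       # assumed initial retransmission timeout
RTO_MAX_MS = 120_000   # assumed upper bound on the retransmission timeout


def retries2_timeout_ms(retries):
    """Total wait covering `retries` retransmissions plus the final timeout."""
    total = 0
    rto = RTO_MIN_MS
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, RTO_MAX_MS)  # exponential backoff, capped
    return total


for retries in (5, 15):
    print(retries, "retries:", retries2_timeout_ms(retries) / 1000, "seconds")
```

Under these assumptions a value of 5 gives up after roughly 12.6 seconds, while 15 corresponds to about 924.6 seconds.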

Keeping TCP connections alive:
/proc/sys/net/ipv4/tcp_keepalive_time corresponds to net.ipv4.tcp_keepalive_time
If a connection stays idle for the number of seconds specified by this parameter, the kernel begins sending keepalive probes to the remote host.

/proc/sys/net/ipv4/tcp_keepalive_intvl corresponds to net.ipv4.tcp_keepalive_intvl
Specifies the interval, in seconds, between keepalive probes sent to the remote host.

/proc/sys/net/ipv4/tcp_keepalive_probes corresponds to net.ipv4.tcp_keepalive_probes
Specifies how many keepalive probes the kernel sends to check whether the remote host is still alive. If all probes have been sent and the remote host still has not responded, the kernel concludes that it is unreachable, closes the connection, and releases the associated resources.
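
Taken together, the three keepalive parameters determine how long a silent peer goes unnoticed. The Python sketch below computes that time and also shows how a single socket could override the system-wide values on Linux. Both function names are hypothetical; the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT constants are exposed by Python's socket module on Linux and may be missing on other platforms.

```python
import socket


def keepalive_detection_seconds(idle_time, interval, probes):
    """Seconds from the last traffic until an unresponsive peer is dropped."""
    return idle_time + interval * probes


def enable_keepalive(sock, idle_time, interval, probes):
    """Override the system-wide keepalive settings for one socket (Linux)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_time)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


# Values from the sample configuration: 1800 s idle, 30 s interval, 3 probes.
print(keepalive_detection_seconds(1800, 30, 3), "seconds")
```

With the sample values, a dead peer is detected 1890 seconds after the last traffic on the connection.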

/proc/sys/net/ipv4/ip_local_port_range corresponds to net.ipv4.ip_local_port_range
Specifies the range of local ports available to TCP and UDP.

Recycling of TCP connections:
/proc/sys/net/ipv4/tcp_max_tw_buckets corresponds to net.ipv4.tcp_max_tw_buckets
Sets the maximum number of TIME_WAIT sockets the system keeps at once; sockets beyond this limit are destroyed immediately.

/proc/sys/net/ipv4/tcp_tw_reuse corresponds to net.ipv4.tcp_tw_reuse
Enables TIME_WAIT reuse, which allows a socket in TIME_WAIT to be used for a new TCP connection.

/proc/sys/net/ipv4/tcp_tw_recycle corresponds to net.ipv4.tcp_tw_recycle
Enables fast recycling of TIME_WAIT sockets. This option is known to break connections from clients behind NAT and was removed in Linux 4.12.

/proc/sys/net/ipv4/tcp_fin_timeout corresponds to net.ipv4.tcp_fin_timeout
Sets how long a socket waits in FIN_WAIT_2 before it moves to CLOSED.

/proc/sys/net/ipv4/route/max_size
The maximum number of routes allowed by the kernel.

/proc/sys/net/ipv4/ip_forward
Enables forwarding of packets between interfaces.

/proc/sys/net/ipv4/ip_default_ttl
The default TTL, that is, the maximum number of hops a packet can pass through.

Virtual Memory Parameters:
/proc/sys/vm/

Before Linux kernel 2.6.25, the maximum number of open file handles per process set with ulimit -n (setrlimit(RLIMIT_NOFILE)) could not exceed NR_OPEN (1024*1024), that is, a little over one million, unless the kernel was recompiled. Since 2.6.25 the kernel exposes this maximum as /proc/sys/fs/nr_open so that it can be modified. The limit cannot be raised directly from the shell, because PAM has already applied the upper limit from limits.conf at login; the ulimit command can only adjust the value within that upper limit.
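
The same limits can be inspected from a program. The following Python sketch reads the soft and hard RLIMIT_NOFILE values of the current process and, where available, the kernel ceiling in /proc/sys/fs/nr_open. The helper names describe_limit and read_nr_open are hypothetical, and the example only reads values; it does not change them.

```python
import resource

NR_OPEN_PATH = "/proc/sys/fs/nr_open"


def describe_limit(value):
    """Render an rlimit value, treating RLIM_INFINITY as unlimited."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def read_nr_open(path=NR_OPEN_PATH):
    """Return the kernel ceiling for RLIMIT_NOFILE, or None if unavailable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", describe_limit(soft))  # what ulimit -n reports
print("hard limit:", describe_limit(hard))  # upper limit applied at login
print("nr_open:", read_nr_open())
```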

View the socket status in Linux:
cat /proc/net/sockstat # (this is for IPv4)

sockets: used 137
TCP: inuse orphan 0 tw 3272 alloc mem 46
UDP: inuse 1 mem 0
RAW: inuse 0
FRAG: inuse 0 memory 0
Description:
sockets: used: Total number of sockets in use across all protocols
TCP: inuse: Number of TCP sockets in use (listening). Its value ≤ netstat -lnt | grep ^tcp | wc -l
TCP: orphan: Number of orphaned TCP connections (not attached to any process; useless sockets waiting to be destroyed)
TCP: tw: Number of TCP connections waiting to be closed (TIME_WAIT). Its value equals netstat -ant | grep TIME_WAIT | wc -l
TCP: alloc (allocated): Number of TCP sockets that have been allocated (established, with an sk_buff requested). Its value equals netstat -ant | grep ^tcp | wc -l
TCP: mem: Socket buffer usage (unit not determined by the author). In a test with scp transferring at 4803.9 kB/s, its value was 11, while netstat -ant showed Recv-Q = 0 and Send-Q ≈ 400 for the connection on port 22.
UDP: inuse: Number of UDP sockets in use
RAW: inuse: Number of raw sockets in use
FRAG: inuse: Number of IP fragments in use
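
For monitoring, the counters can be parsed into a nested dictionary. The Python sketch below assumes each line has the form "PROTO: name value name value ..."; the function name parse_sockstat is hypothetical, and the sample text reuses the complete lines from the output shown above.

```python
def parse_sockstat(text):
    """Parse /proc/net/sockstat text into {protocol: {counter: value}}."""
    stats = {}
    for line in text.splitlines():
        protocol, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "PROTO:" prefix
        fields = rest.split()
        # Fields alternate between a counter name and its integer value.
        stats[protocol.strip()] = {
            name: int(value) for name, value in zip(fields[0::2], fields[1::2])
        }
    return stats


sample = """\
sockets: used 137
UDP: inuse 1 mem 0
RAW: inuse 0
FRAG: inuse 0 memory 0
"""

print(parse_sockstat(sample))
```

On a Linux host the same function can be fed the live data with parse_sockstat(open("/proc/net/sockstat").read()).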

Reference: http://www.mjmwired.net/kernel/Documentation/sysctl/
