The C100K Battle: Linux Kernel Tuning (Reprint)

Source: Internet
Author: User
Tags: unix domain socket

Original address: http://joyexpr.com/2013/11/22/c100k-4-kernel-tuning/

On early systems, resources such as CPU and memory were very limited, so to maintain fairness the system limits each process's resource usage by default. Because Linux's default kernel configuration does not meet the requirements of C100K (100,000 concurrent connections), it needs appropriate tuning.

We can use ulimit to look at a typical machine's default limits:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 204800
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 204800
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Take open files, for example: by default a single process may open 1024 file handles. For programs that need a large number of file handles, such as web servers and database servers, 1024 is often not enough; once the handles run out, the system starts reporting EMFILE errors frequently.
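To make the failure mode concrete (an illustration of mine, not from the original): a process that keeps creating sockets without raising its limit eventually sees socket(2) fail with errno set to EMFILE. A minimal C sketch:

#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>

int main(void) {
    int count = 0;
    for (;;) {
        /* Each socket consumes one file descriptor. */
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) {
            if (errno == EMFILE)
                printf("EMFILE after %d sockets (per-process limit hit)\n", count);
            else
                perror("socket");
            return 0;
        }
        count++; /* deliberately never closed, to exhaust the limit */
    }
}

With the default ulimit -n of 1024, this prints EMFILE after roughly 1021 sockets, since stdin, stdout, and stderr already occupy three descriptors.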

As the saying goes, one hand alone cannot clap: reaching the C100K goal requires close cooperation between the server and the client, so the tuning of both sides is introduced below.

Client

1: The limit on the number of file handles

On Linux, whether you are writing a client program or a server program, each TCP connection handled under high concurrency creates a socket handle, and each socket handle is also a file handle. So the maximum concurrency is limited both by the number of files a single process may open and by the number of files the whole system may open simultaneously.

1.1: The limit on file handles for a single process

We can use the ulimit command to see the limit on the number of file handles the current user's processes may open:

[root@localhost ~]# ulimit -n
1024

This means that each process of the current user may open at most 1024 files at a time. Excluding the standard input, standard output, standard error, server listening socket, and UNIX domain sockets for interprocess communication that every process must open, the number of files left for client socket connections is only about 1024 - 10 = 1014. That is, by default, a Linux-based communication program allows at most about 1014 concurrent TCP connections.

For a communication program that needs to support more concurrent TCP connections, you must modify the soft limit and the hard limit on the number of files Linux allows the current user's processes to open simultaneously, where:

    • The soft limit is a further restriction, within what the current system can bear, on the number of files a user may open simultaneously.
    • The hard limit is the maximum number of files that can be opened simultaneously as calculated from the system's hardware resources (mainly system memory).

The soft limit is usually less than or equal to the hard limit; both can be viewed with the ulimit command:

[root@localhost ~]# ulimit -Sn
1024
[root@localhost ~]# ulimit -Hn
4096

There are 2 ways to modify the number of file handles that a single process can open at the same time:

1. Use the ulimit command directly, for example:

[root@localhost ~]# ulimit -n 1048576

After it executes successfully, the values reported by ulimit -n, -Sn, and -Hn all become 1048576. However, a value set this way only applies to the current terminal session, and it cannot be raised above the value configured by method 2.
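For completeness (a sketch of mine, not from the original article): a process can query and raise its own limit at runtime through getrlimit(2)/setrlimit(2), which is the interface ulimit uses underneath. An unprivileged process may raise its soft limit only up to the hard limit; raising the hard limit requires root (or CAP_SYS_RESOURCE):

#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    struct rlimit rl;

    /* Read the current soft/hard limits on open files. */
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("soft=%llu hard=%llu\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);

    /* Raise the soft limit as far as the hard limit allows. */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }
    return 0;
}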

2. Add or modify the following lines in the /etc/security/limits.conf file:

* soft nofile 1048576
* hard nofile 1048576

where:

    • * means the rule applies to all users; replace the asterisk with a user name to target a single user.
    • soft denotes the soft limit, which is essentially a warning threshold.
    • hard denotes the hard limit, the real threshold; exceeding it produces an error.
    • nofile indicates the item being limited: the maximum number of open files.
    • 1048576 = 1024 * 1024. Why this particular value? Because:

Before Linux kernel 2.6.25, the per-process maximum number of open file handles set by ulimit -n (i.e. setrlimit(RLIMIT_NOFILE)) could not exceed NR_OPEN (1024*1024), that is, roughly one million, unless the kernel was recompiled. Since 2.6.25, the kernel exports a sysctl interface (/proc/sys/fs/nr_open) for modifying this maximum. See the changelog at https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=9cfe015aa424b3c003baba3841a60dd9b5ad319b

Note: after saving the file, you need to log out and back in, or restart the system, for it to take effect.

1.2: The limit on file handles for the entire system

After the per-process file handle limit is resolved, the file handle limit of the entire system still applies. We can view the system-level maximum number of open files with the following command:

[root@localhost ~]# cat /proc/sys/fs/file-max
98957

file-max represents the maximum number of file handles all processes in the system may hold open at the same time; it is the Linux system-level hard limit. Normally this limit is a reasonable maximum the kernel calculates at boot from the system's hardware resources, and it should not be modified without a special need.

To modify it, add a line to the /etc/sysctl.conf file:

fs.file-max = 1048576

After saving the file, execute the following command for it to take effect:

[root@localhost ~]# sysctl -p

2: The limit on the number of ports

With the file handle limit resolved, the next problem is the limit on port numbers. Generally speaking, the server side does not have a port number problem when serving requests; it simply listens on one port. It is the client that wants to simulate a large number of users initiating TCP requests to the server, and each request needs a local port, so for one client machine to impersonate as many users as possible, it needs as many usable ports as possible.

A port number is a 16-bit field, so there are at most 2^16 = 65536 ports (0-65535). On Linux, ports below 1024 can only be used by the superuser (such as root); normal users can only use ports greater than or equal to 1024.

We can view the default port ranges provided by the system with the following command:

[root@localhost ~]# cat /proc/sys/net/ipv4/ip_local_port_range
32768    61000

That is, only 61000 - 32768 = 28,232 ports are available, meaning a single IP can initiate at most 28,232 concurrent TCP connections to a given remote endpoint.

There are 2 ways to modify it:

1. Execute the following command:

"1024 65535"> /proc/sys/net/ipv4/ip_local_port_range

This method takes effect immediately, but is invalidated after a reboot.

2. Modify the /etc/sysctl.conf file by adding a line:

net.ipv4.ip_local_port_range = 1024 65535

After saving the file, execute the following command for it to take effect:

[root@localhost ~]# sysctl -p

After the modification succeeds, the available ports increase to 65535 - 1024 = 64,511, which still means a single client machine can impersonate at most 64,511 users at a time. To break through this limit, the only way is to add more IP addresses to the client machine, which multiplies the number of available ip:port pairs accordingly (see yongboy's article for details, and the sketch below for how a client picks its source IP).
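As an illustration of that multi-IP technique (my sketch; the addresses 192.168.1.101 and 192.168.1.1:8080 are hypothetical placeholders): a client chooses which local IP a connection uses by calling bind(2) on the socket before connect(2), with port 0 so the kernel still picks an ephemeral port from ip_local_port_range:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    /* Bind to one of the client's configured addresses;
       port 0 lets the kernel pick an ephemeral port. */
    struct sockaddr_in local = {0};
    local.sin_family = AF_INET;
    local.sin_port = htons(0);
    inet_pton(AF_INET, "192.168.1.101", &local.sin_addr);
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        perror("bind");
        return 1;
    }

    /* Connect as usual; the connection's source IP is now fixed. */
    struct sockaddr_in server = {0};
    server.sin_family = AF_INET;
    server.sin_port = htons(8080);
    inet_pton(AF_INET, "192.168.1.1", &server.sin_addr);
    if (connect(fd, (struct sockaddr *)&server, sizeof(server)) < 0) {
        perror("connect");
        return 1;
    }
    close(fd);
    return 0;
}

Since a TCP connection is identified by its full 4-tuple, each additional source address contributes another full set of ephemeral ports.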

Server

1: The limit on the number of file descriptors

This is the same problem as item 1 on the client side.

2: TCP parameter tuning

To improve server-side performance and reach our high-concurrency goal, we need to make appropriate modifications to the system's TCP parameters.

The method is the same: modify the /etc/sysctl.conf file and add the following entries:

net.ipv4.tcp_tw_reuse = 1 

When the server has to cycle through a large number of short-lived TCP connections, it accumulates many connections in the TIME_WAIT state. TIME_WAIT means the connection itself is closed but its resources have not yet been released. Setting net.ipv4.tcp_tw_reuse to 1 lets the kernel reuse such connections when it is safe to do so, which is much cheaper than establishing a brand-new connection.
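(A caveat not in the original: tcp_tw_reuse only applies to new outgoing connections, and the kernel judges "safe" using TCP timestamps, so it has no effect unless net.ipv4.tcp_timestamps = 1, which is the default.)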

net.ipv4.tcp_fin_timeout = 15

This is the minimum time a connection must stay in the TIME_WAIT state before it can be recycled; lowering it speeds up recycling.

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

Increase the maximum buffer size for TCP, where:

net.core.rmem_max: Represents the maximum value, in bytes, of the receive socket buffer size.

net.core.wmem_max: Represents the maximum value, in bytes, of the size of the send socket buffer.

net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

Improves the ability of the Linux kernel to automatically optimize the socket buffer, where:

net.ipv4.tcp_rmem: Configures the read buffer size; the 1st value is the minimum, the 2nd the default, and the 3rd the maximum.

net.ipv4.tcp_wmem: Configures the write buffer size; the 1st value is the minimum, the 2nd the default, and the 3rd the maximum.
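As an aside (my sketch, not the original's): an application can also size a socket's buffers explicitly with setsockopt(2). Explicit values are capped by net.core.rmem_max/wmem_max, and setting them disables the kernel's autotuning for that socket, which is why raising the maximums above and letting autotuning work is usually preferable:

#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    /* Request a 4 MB receive buffer; the kernel caps the request
       at net.core.rmem_max and doubles it internally to leave
       room for bookkeeping overhead. */
    int size = 4 * 1024 * 1024;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size)) < 0)
        perror("setsockopt(SO_RCVBUF)");

    /* Read back what was actually granted. */
    socklen_t len = sizeof(size);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) == 0)
        printf("effective SO_RCVBUF: %d bytes\n", size);

    close(fd);
    return 0;
}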

net.core.netdev_max_backlog = 4096

The maximum number of packets allowed to queue when a network interface receives packets faster than the kernel can process them. The default is 1000.

net.core.somaxconn = 4096

Represents the backlog limit for socket listening (listen). What is a backlog? The backlog is the socket's listen queue: a request that has not yet been accepted or fully established sits in the backlog. The socket server processes requests from the backlog; a processed request leaves the listen queue. When the server processes requests so slowly that the queue fills up, new requests are rejected. The default is 128.
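For context (my sketch, not from the original): the backlog is the second argument a server passes to listen(2), and the kernel silently truncates it to net.core.somaxconn, so raising the sysctl only helps if the application also requests a large backlog:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080); /* arbitrary example port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }

    /* Ask for a deep accept queue; values above
       net.core.somaxconn are silently truncated. */
    if (listen(fd, 4096) < 0) {
        perror("listen");
        return 1;
    }
    /* ... accept(2) loop would follow here ... */
    return 0;
}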

net.ipv4.tcp_max_syn_backlog = 20480

Represents the length of the SYN queue. The default is 1024; enlarging the queue (for example to 8192, or 20480 as set here) lets it hold more connections waiting to be established.

net.ipv4.tcp_syncookies = 1

Enables SYN cookies: when the SYN wait queue overflows, cookies are used to fend off modest SYN flood attacks. The default is 0, meaning disabled.

net.ipv4.tcp_max_tw_buckets = 360000

The maximum number of TIME_WAIT sockets the system keeps at any one time; beyond this number, TIME_WAIT sockets are immediately cleared and a warning is printed. The default is 180000.

net.ipv4.tcp_no_metrics_save = 1

By default, when a TCP connection closes, parameters such as the slow-start threshold (snd_ssthresh), the congestion window (snd_cwnd), and srtt are saved in the route cache entry (dst_entry); as long as that dst_entry remains valid, the next connection to the same destination is initialized from the saved values. Setting this parameter to 1 stops the kernel from saving these metrics.

net.ipv4.tcp_syn_retries = 2

Indicates the number of times the kernel retransmits the SYN before giving up on establishing a connection; the default is 5 on older kernels (6 on newer ones).

net.ipv4.tcp_synack_retries = 2

Indicates the number of times the kernel retransmits the SYN+ACK before giving up on the connection; the default is 5.
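To see what these retry counts mean in wall-clock time (a back-of-the-envelope illustration, assuming the conventional 1-second initial retransmission timeout and exponential backoff): with tcp_syn_retries = 2, the kernel sends the SYN at t = 0, retransmits at roughly t = 1s and t = 3s, and abandons the attempt around t = 7s (1 + 2 + 4), rather than hanging for a minute or more with the default retry count.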

The full TCP parameter tuning configuration is as follows:

net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 4096
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 20480
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_tw_buckets = 360000
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_synack_retries = 2

Some other parameters

vm.min_free_kbytes = 65536

Determines the threshold at which the system starts reclaiming memory, controlling how much free memory is kept. The higher the value, the earlier the kernel starts reclaiming and the more free memory is maintained.

vm.swappiness = 0

Controls how aggressively the kernel moves process memory out of physical RAM into swap space. The parameter ranges from 0 to 100: at 0, the kernel avoids swapping processes out of physical memory whenever possible; at 100, the kernel aggressively moves data out of physical memory into the swap cache.
