Configuring Linux for developing applications that support highly concurrent TCP connections

Source: Internet
Author: User
Tags: epoll, unix domain socket

1. Modify the limit on the number of files a user process can open

On Linux, whether you are writing a client or a server program, the maximum concurrency achievable when handling highly concurrent TCP connections is limited by the number of files a single user process is allowed to open simultaneously (the system creates a socket handle for each TCP connection, and every socket handle is also a file handle). You can use the ulimit command to view the limit on the number of files the current user's processes are allowed to open:
$ ulimit -n
1024
This means that each process of the current user is allowed to open at most 1024 files simultaneously. From these 1024 files we must subtract standard input, standard output, standard error, the server's listening socket, the Unix domain sockets used for inter-process communication, and so on, which leaves roughly 1024 - 10 = 1014 files available for client socket connections. In other words, by default a Linux-based communication program can have at most about 1014 concurrent TCP connections.
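The same limit can also be queried from inside a program. Below is a minimal C sketch, my own illustration rather than part of the original article, that reads the soft and hard RLIMIT_NOFILE values with getrlimit():

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    /* RLIMIT_NOFILE is the per-process limit on open file descriptors,
     * i.e. the same limit that "ulimit -n" reports. */
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("soft limit: %llu\n", (unsigned long long)rl.rlim_cur);
    printf("hard limit: %llu\n", (unsigned long long)rl.rlim_max);
    return 0;
}

The soft value is what ulimit -n reports; the hard value is the ceiling up to which an unprivileged process may raise its own soft limit.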

For a communication program that needs to support more concurrent TCP connections, you must modify the soft limit and the hard limit that Linux places on the number of files the current user's processes may open simultaneously. The soft limit is an additional restriction, within what the current system can tolerate, on the number of files a user may open simultaneously; the hard limit is the maximum number of files that can be opened simultaneously as calculated from the system's hardware resources (mainly system memory). The soft limit is usually less than or equal to the hard limit.

The simplest way to change these limits is the ulimit command:
$ ulimit -n <file_num>
In this command, <file_num> specifies the maximum number of files a single process is to be allowed to open. If the system responds with something like "Operation not permitted", the change failed; in effect, the value given in <file_num> exceeds the soft or hard limit the Linux system places on the user's open file count. You therefore need to change the system's soft and hard limits on the number of files the user may open.

The first step is to modify the /etc/security/limits.conf file and add the following lines to it:
speng soft nofile 10240
speng hard nofile 10240
Here speng specifies which user's open-file limit to change; '*' can be used instead to change the limit for all users. soft or hard specifies whether to change the soft or the hard limit, and 10240 is the new limit value, i.e. the maximum number of open files (note that the soft limit value must be less than or equal to the hard limit). Save the file when you are done.

The second step is to modify the /etc/pam.d/login file and add the following line to it:
session required /lib/security/pam_limits.so
This tells Linux that, after a user has logged in, the pam_limits.so module should be invoked to apply the system's limits on the resources that user may use (including the maximum number of files the user may open), and pam_limits.so reads those limit values from the /etc/security/limits.conf file. Save this file when you are done.

The third step is to check the Linux system-level limit on the maximum number of open files, using the following command:
$ cat /proc/sys/fs/file-max
12158
This shows that this Linux system allows at most 12,158 files to be open simultaneously (that is, across all users); this is the system-level hard limit, and no user-level open-file limit should exceed it. Normally this system-level hard limit is the optimal value Linux computes at boot time from the system's hardware resources, and it should not be changed unless you want to set a user-level open-file limit above it. To change this hard limit, modify the /etc/rc.local script and add the following line to it:
echo 22158 > /proc/sys/fs/file-max
This forces Linux to set the system-level limit on open files to 22158 after booting. Save this file when you are done.

After completing the above steps and rebooting the system, you can generally set the maximum number of files the Linux system allows a single process of the specified user to open simultaneously to the chosen value. If, after rebooting, ulimit -n still reports a limit lower than the value set in the steps above, it may be because a ulimit -n command in the user login script /etc/profile is limiting the number of files the user can open simultaneously. Since ulimit -n can only lower the limit relative to the previous ulimit -n setting, never raise it, this command cannot be used to increase the limit value. So if this problem occurs, open the /etc/profile script, check whether it uses ulimit -n to limit the maximum number of files the user may open simultaneously, and if it does, delete that line or set it to an appropriate value; then save the file, and have the user log out and log back in.
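In addition to the system-wide configuration, a process can raise its own soft limit at startup, up to the hard limit it has been granted. Below is a minimal C sketch of this approach; it is my own illustration, not part of the original article:

#include <stdio.h>
#include <sys/resource.h>

/* Raise this process's soft RLIMIT_NOFILE to its hard limit.
 * Raising the soft limit up to the hard limit needs no special privilege;
 * raising the hard limit itself requires CAP_SYS_RESOURCE (or root). */
static int raise_nofile_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;              /* soft limit := hard limit */
    return setrlimit(RLIMIT_NOFILE, &rl);
}

int main(void)
{
    if (raise_nofile_limit() != 0)
        perror("raise_nofile_limit");
    return 0;
}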
With the above steps, the system's limits on the number of open files are lifted for communication programs that handle highly concurrent TCP connections.

2. Modify the kernel's network-level restrictions on TCP connections

When writing client communication programs that support highly concurrent TCP connections on Linux, you may find that, even though the system's limit on simultaneously open files has been lifted, new TCP connections still cannot be established once the number of concurrent TCP connections reaches a certain level. There are several possible reasons for this.

The first possible reason is that the Linux network kernel limits the range of local port numbers. In this case, further analysis of why the TCP connection cannot be established shows that the connect() call fails and the system error message is "Can't assign requested address". If you monitor the network with the tcpdump tool at this point, you will also find that there is no network traffic at all for the SYN packets the client should be sending for the new connection. These symptoms show that the limitation lies in the local Linux kernel. The root cause is that the Linux kernel's TCP/IP implementation restricts the range of local port numbers available to all client TCP connections in the system (for example, the kernel may restrict the local port range to 1024~32768). Because each client TCP connection consumes a unique local port number from this range, when there are too many client TCP connections in the system at once the existing connections can use up all of the local port numbers; a new client connection can then no longer be assigned a local port, so connect() fails and the error message is set to "Can't assign requested address". You can find this control logic in the Linux kernel source; taking the Linux 2.6 kernel as an example, see the following function in the tcp_ipv4.c file:
static int tcp_v4_hash_connect(struct sock *sk)
Note how the variable sysctl_local_port_range is used in this function. The initialization of sysctl_local_port_range is done in the following function in the tcp.c file:
void __init tcp_init(void)
The local port number range set by default at kernel compile time may be too small, so you need to modify this local port range limit.
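In client code this condition can be recognized by checking errno after a failed connect(): port exhaustion shows up as EADDRNOTAVAIL ("Cannot assign requested address"). Below is a minimal C sketch of such a check; the destination address 127.0.0.1 and port 8080 are illustrative assumptions of mine:

#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8080);                     /* illustrative port */
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr); /* illustrative server */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        if (errno == EADDRNOTAVAIL)
            fprintf(stderr, "local port range exhausted: %s\n", strerror(errno));
        else
            perror("connect");
    }
    close(fd);
    return 0;
}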
The first step is to modify the /etc/sysctl.conf file and add the following line to it:
net.ipv4.ip_local_port_range = 1024 65000
This sets the system's local port range to 1024~65000. Note that the minimum value of the local port range must be greater than or equal to 1024, and the maximum value must be less than or equal to 65535. Save this file when you are done.
The second step is to execute the sysctl command:
$ sysctl -p
If the system reports no error, the new local port range has been set successfully. With the port range configured as above, a single process can theoretically establish up to about 60,000 outgoing TCP client connections at the same time.
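The effective range can also be read back from /proc at runtime; the arithmetic below shows where the figure above comes from (65000 - 1024 + 1 = 63977 usable ports, quoted conservatively as about 60,000). The following minimal C sketch is my own illustration:

#include <stdio.h>

int main(void)
{
    /* The same values that "net.ipv4.ip_local_port_range" configures. */
    FILE *f = fopen("/proc/sys/net/ipv4/ip_local_port_range", "r");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }

    int lo = 0, hi = 0;
    if (fscanf(f, "%d %d", &lo, &hi) == 2)
        /* Each outgoing client connection consumes one local port, so the
         * usable range bounds the number of simultaneous client connections. */
        printf("local port range %d-%d, about %d usable ports\n",
               lo, hi, hi - lo + 1);
    fclose(f);
    return 0;
}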

The second reason a TCP connection may fail to be established is that the connection-tracking (conntrack) facility used by the iptables firewall in the Linux network stack limits the number of TCP connections it can track. In this case the program appears to hang in the connect() call, as if it had crashed, and monitoring the network with tcpdump again shows no SYN packets being sent for the new TCP connection. Because connection tracking records the state of every TCP connection in the kernel, and the tracking entries are kept in an in-kernel conntrack database of limited size, when there are too many TCP connections in the system the database fills up, conntrack cannot create a tracking entry for the new TCP connection, and the program appears blocked in connect(). You must then raise the kernel's limit on the maximum number of tracked TCP connections, in a way similar to modifying the local port range limit:
The first step is to modify the /etc/sysctl.conf file and add the following line to it:
net.ipv4.ip_conntrack_max = 10240
This sets the system's limit on the maximum number of tracked TCP connections to 10240. Note that this value should be kept as small as possible to conserve kernel memory.
The second step is to execute the sysctl command:
$ sysctl -p
If the system reports no error, the new limit on the maximum number of tracked TCP connections has been applied. With the parameter set as above, a single process can theoretically establish somewhat more than 10,000 TCP client connections at the same time.

3. Use programming techniques that support highly concurrent network I/O

When writing applications that handle highly concurrent TCP connections on Linux, you must use an appropriate network I/O technique and an appropriate I/O event dispatch mechanism.

The available I/O techniques are synchronous (blocking) I/O, non-blocking synchronous I/O (also called reactive I/O), and asynchronous I/O. Under high TCP concurrency, synchronous blocking I/O seriously stalls the program unless a thread is created to handle I/O for each TCP connection, and too many threads in turn impose heavy scheduling overhead on the system. Synchronous blocking I/O is therefore unsuitable at high TCP concurrency; instead, consider non-blocking synchronous I/O or asynchronous I/O. Non-blocking synchronous I/O techniques include select(), poll(), epoll, and so on; asynchronous I/O means using AIO.

Regarding the I/O event dispatch mechanism: select() is inappropriate because it supports only a limited number of concurrent connections (typically no more than 1024). If performance matters, poll() is also unsuitable: although it supports a higher number of concurrent TCP connections, its "polling" mechanism makes it very inefficient at high concurrency, and I/O events may be dispatched unevenly, "starving" the I/O on some TCP connections. epoll and AIO do not have these problems (note, however, that in earlier Linux kernels AIO was implemented by creating a kernel thread for each I/O request, which itself performed poorly under highly concurrent TCP connections; the AIO implementation has been improved in more recent kernels).
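To illustrate the epoll-based approach recommended here, below is a minimal C sketch of a non-blocking echo server event loop. It is my own illustration, not code from the original article; the port 8080 and the 4096-byte buffer are arbitrary choices, and most error handling is omitted for brevity:

#include <fcntl.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_EVENTS 1024

static void set_nonblocking(int fd)
{
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
}

int main(void)
{
    /* Listening socket on an illustrative port. */
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, SOMAXCONN);
    set_nonblocking(listen_fd);

    /* One epoll instance dispatches events for all connections. */
    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {
                /* Accept new connections and register them with epoll. */
                int conn = accept(listen_fd, NULL, NULL);
                if (conn >= 0) {
                    set_nonblocking(conn);
                    struct epoll_event cev = { .events = EPOLLIN, .data.fd = conn };
                    epoll_ctl(epfd, EPOLL_CTL_ADD, conn, &cev);
                }
            } else {
                /* Echo back whatever the client sent; close on EOF or error. */
                char buf[4096];
                ssize_t len = read(fd, buf, sizeof(buf));
                if (len <= 0)
                    close(fd);          /* close() also removes fd from epoll */
                else
                    write(fd, buf, (size_t)len);
            }
        }
    }
}

The sketch uses level-triggered notification (the default); edge-triggered mode (EPOLLET) is also common in high-concurrency servers but requires reading each socket until it would block.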

In summary, when developing Linux applications that support highly concurrent TCP connections, you should use epoll or AIO wherever possible to perform I/O on the concurrent TCP connections; this provides effective I/O support for improving the program's ability to handle highly concurrent TCP connections.

