Improve Linux system performance and accelerate network applications

Source: Internet
Author: User

When you develop a socket application, the first task is to ensure reliability and meet the specific requirements. With the four tips in this article, you can design and develop socket programs for the best performance from the beginning. This article covers the use of the Sockets API, two socket options, and GNU/Linux tuning.

Follow these tips to develop applications with superior performance:

Minimize packet transmission latency.

Minimize the load of system calls.

Adjust the TCP window for the bandwidth-delay product.

Dynamically tune the GNU/Linux TCP/IP stack.

Tip 1. Minimize packet transmission latency

When you communicate through a TCP socket, the data is split into blocks so that each block fits within the TCP payload of a packet on the given connection. The size of the TCP payload depends on several factors (such as the maximum packet size along the path), but these factors are known when the connection is initiated. To achieve the best performance, the goal is to fill each packet with as much of the available data as possible. When there is not enough data to fill a payload (also known as the maximum segment size, or MSS), TCP uses the Nagle algorithm to automatically concatenate small buffers into a single segment. This improves application efficiency by minimizing the number of packets sent, and it reduces overall network congestion.

Although the Nagle algorithm minimizes the number of packets sent by concatenating data into larger packets, sometimes you want to send small packets right away. A simple example is the telnet program, which lets a user interact with a remote system. If the user had to fill a segment with typed characters before the packet was sent, the experience would be unacceptable.

Another example is the HTTP protocol. Typically, the client browser issues a small request (an HTTP request message), and the web server returns a much larger response (the web page).

Solution

The first thing to consider is that the Nagle algorithm fulfills a need. Because the algorithm coalesces data to try to fill a complete TCP segment, it introduces some latency. In exchange, it minimizes the number of packets sent on the wire and thus reduces network congestion.

When you need to minimize transmission latency instead, the Sockets API provides a solution. To disable the Nagle algorithm, set the TCP_NODELAY socket option, as shown in Listing 1.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>

    int main(void)
    {
        int sock, flag, ret;

        /* Create a new stream socket */
        sock = socket(AF_INET, SOCK_STREAM, 0);

        /* Disable the Nagle (TCP no-delay) algorithm */
        flag = 1;
        ret = setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
                         (char *)&flag, sizeof(flag));
        if (ret == -1) {
            printf("Couldn't setsockopt(TCP_NODELAY)\n");
            exit(-1);
        }
        return 0;
    }

Listing 1. Disabling the Nagle algorithm for a TCP socket

Tip: Experiments with Samba show that disabling the Nagle algorithm almost doubles read performance when reading from a Samba drive on a Microsoft Windows server.
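Because TCP_NODELAY is an ordinary per-socket option, it can be switched back on or off at run time and read back with getsockopt. The following is a minimal sketch of that idea, not part of the original listings; the helper name set_nodelay is hypothetical.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical helper: turn TCP_NODELAY on (enable != 0) or off
 * (enable == 0) for sock, then read the option back.
 * Returns 1 if the Nagle algorithm is now disabled, 0 if it is enabled,
 * or -1 if either socket call failed. */
int set_nodelay(int sock, int enable)
{
    int flag = enable ? 1 : 0;
    int current = 0;
    socklen_t len = sizeof(current);

    if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) == -1)
        return -1;
    if (getsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &current, &len) == -1)
        return -1;
    return current ? 1 : 0;
}
```

An interactive program could call set_nodelay(sock, 1) while the user is typing and set_nodelay(sock, 0) before a bulk transfer, although whether that helps depends on the traffic pattern and should be measured.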

Tip 2. Minimize the load of system calls

Whenever you read or write data through a socket, you are using a system call. The call (for example, read or write) crosses the boundary between the user-space application and the kernel. Before it reaches the kernel, your call passes through the C library into a common function in the kernel (system_call()). From system_call(), the call enters the file system layer, where the kernel determines what type of device it is dealing with. Finally, the call reaches the sockets layer, where data is read or queued for transmission through the socket (which involves copying the data).

This process shows that a system call does not simply cross between the application and the kernel; it also passes through several layers on each side. The process is expensive, so the more calls you make, the more time is spent working through this call chain and the lower your application's performance.

Since these system calls cannot be avoided, the only option is to minimize how often you make them. Fortunately, you have control over this.

Solution

When writing data to a socket, write all the data in a single call rather than performing multiple writes. When reading, pass in the largest buffer you can support, because the kernel tries to fill the entire buffer if enough data is available (which also helps keep TCP's advertised window open). This minimizes the number of calls and gives better overall performance.
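When the data to send lives in separate buffers (say, a header and a body), one way to avoid two write calls is the POSIX writev call, which gathers several buffers into a single system call. The article does not mention writev; it is used here only as one possible illustration, and the helper names are hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Hypothetical helper: send a header and a body with one system call
 * instead of two separate writes. Returns the number of bytes written,
 * or -1 on error. A short write is possible; a production caller would
 * loop until everything has been sent. */
ssize_t send_header_and_body(int fd, const char *header, const char *body)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)header;
    iov[0].iov_len  = strlen(header);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);

    return writev(fd, iov, 2);
}

/* Hypothetical helper: read with the largest buffer the caller can offer,
 * so that one call drains as much queued data as possible. */
ssize_t recv_available(int fd, char *buf, size_t cap)
{
    return recv(fd, buf, cap, 0);
}
```

Copying both pieces into one buffer and calling write once achieves the same reduction in system calls; writev simply avoids the extra copy.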

Tip 3. Adjust the TCP window for the bandwidth-delay product

TCP performance depends on several factors. The two most important are the link bandwidth (the rate at which packets can be transmitted on the network) and the round-trip time, or RTT (the delay between sending a message and receiving a response from the other end). The product of these two values is called the bandwidth-delay product (BDP).

Given the link bandwidth and the RTT, you can calculate the BDP, but what does it tell you? The BDP gives a simple way to calculate the theoretically optimal size of the TCP socket buffers (which hold the data queued for transmission and the data waiting to be received by the application). If the buffer is too small, the TCP window cannot open fully, which limits performance. If it is too large, valuable memory is wasted. If the buffer is sized just right, you can make full use of the available bandwidth. Here is an example:

BDP = link_bandwidth * RTT

If the application communicates over a 100 Mbps link with an RTT of 50 ms, the BDP is:

100 Mbps * 0.050 sec / 8 = 0.625 MB = 625 KB

Note: Dividing by 8 converts bits to bytes.

Therefore, you would set the TCP window to the BDP, or 625 KB. However, the default TCP window in Linux 2.6 is 110 KB, which limits the connection's throughput to 2.2 MB/s. The calculation is as follows:

throughput = window_size / RTT

110 KB / 0.050 sec = 2.2 MB/s

With the window size calculated above, the throughput rises to 12.5 MB/s (the full 100 Mbps). The calculation is as follows:

625 KB / 0.050 sec = 12.5 MB/s
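The two formulas above can be written as a pair of small functions. This is a sketch for checking the arithmetic with the figures from the example (a 100 Mbps link, a 50 ms RTT, and a 110 KB window); the function names are hypothetical, and all sizes are in bytes.

```c
#include <assert.h>

/* Bandwidth-delay product in bytes:
 * link bandwidth (bits per second) * RTT (seconds) / 8. */
long bdp_bytes(double link_bps, double rtt_sec)
{
    assert(link_bps >= 0.0 && rtt_sec >= 0.0);
    return (long)(link_bps * rtt_sec / 8.0 + 0.5);  /* round to nearest byte */
}

/* Upper bound on throughput, in bytes per second, that a given window
 * allows: window_size / RTT. */
double max_throughput_bytes_per_sec(long window_bytes, double rtt_sec)
{
    assert(rtt_sec > 0.0);
    return (double)window_bytes / rtt_sec;
}
```

With these definitions, bdp_bytes(100e6, 0.050) gives 625000 bytes, and a 110000-byte window over the same RTT allows about 2.2 million bytes per second.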

The difference is substantial, and a larger window gives the socket much greater throughput. So now you know how to calculate the optimal buffer size for your socket. But how do you change it?

Solution

The Sockets API provides several socket options, two of which change the sizes of the socket send and receive buffers. Listing 2 shows how to adjust them with the SO_SNDBUF and SO_RCVBUF options.

Note: Although the socket buffer size determines the size of the advertised TCP window, TCP also maintains a congestion window within the advertised window. Because of this congestion window, a given socket may never use the largest advertised window.

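    #include <sys/socket.h>
    #include <netinet/in.h>

    /* Bandwidth-delay product in bytes, from the example above */
    #define BDP 625000

    int main(void)
    {
        int ret, sock, sock_buf_size;

        sock = socket(AF_INET, SOCK_STREAM, 0);
        sock_buf_size = BDP;

        /* Set the send buffer, then the receive buffer */
        ret = setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                         (char *)&sock_buf_size, sizeof(sock_buf_size));
        ret = setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                         (char *)&sock_buf_size, sizeof(sock_buf_size));
        return ret;
    }

Listing 2. Manually setting the send and receive socket buffer sizes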

In the Linux 2.6 kernel, the kernel doubles the value you request for each buffer (to leave room for bookkeeping overhead), so the size in effect is larger than the one you passed to setsockopt. You can call getsockopt to verify the size of each buffer.
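A minimal sketch of that verification step follows. The helper name is hypothetical; it sets one of the two buffer options and returns whatever size getsockopt reports afterward, which may differ from the request because the kernel adjusts and caps the value.

```c
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: request a buffer size on sock for optname
 * (SO_SNDBUF or SO_RCVBUF) and return the size the kernel reports
 * afterward, or -1 if either call failed. */
int set_and_read_buf(int sock, int optname, int requested)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(sock, SOL_SOCKET, optname,
                   &requested, sizeof(requested)) == -1)
        return -1;
    if (getsockopt(sock, SOL_SOCKET, optname, &actual, &len) == -1)
        return -1;
    return actual;
}
```

Comparing the return value with the requested size shows whether a system-wide limit kept the kernel from granting the full BDP.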

As for window scaling, TCP originally supported windows of at most 64 KB (a 16-bit value defines the window size). With the window scaling extension (RFC 1323), the 16-bit window field is multiplied by a negotiated scale factor, which allows windows of up to about 1 GB. The TCP/IP stack provided with GNU/Linux supports this option (and other options).

Tip: The Linux kernel can also auto-tune these socket buffers (see tcp_rmem and tcp_wmem in Table 1 below), but those options affect the entire stack. If you need to adjust the window for only one connection or one type of connection, that mechanism may not meet your needs.
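To see what the stack-wide auto-tuning limits currently are, a program can read them directly. On Linux, /proc/sys/net/ipv4/tcp_rmem and /proc/sys/net/ipv4/tcp_wmem each hold three values in bytes: the minimum, default, and maximum buffer size. The sketch below assumes that layout; the helper name is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read the three values (minimum, default, maximum,
 * in bytes) from a tunable such as /proc/sys/net/ipv4/tcp_rmem.
 * Returns 0 on success, or -1 if the file cannot be opened or does not
 * contain three numbers. */
int read_tcp_mem_triple(const char *path,
                        long *min_bytes, long *def_bytes, long *max_bytes)
{
    FILE *fp = fopen(path, "r");
    int fields;

    if (fp == NULL)
        return -1;
    fields = fscanf(fp, "%ld %ld %ld", min_bytes, def_bytes, max_bytes);
    fclose(fp);
    return fields == 3 ? 0 : -1;
}
```

Comparing the maximum value with the BDP for a connection gives a quick indication of whether auto-tuning alone could ever open the window far enough.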

Tip 4. Dynamically tune the GNU/Linux TCP/IP stack

A standard GNU/Linux distribution tries to optimize for a wide range of deployments. This means that the standard distribution might not be optimal for your environment.

Solution

GNU/Linux provides many adjustable kernel parameters that you can use to dynamically configure the operating system for your own purposes. Next, let's take a look at some of the more important options that affect socket performance.

Tunable kernel parameters live in the /proc virtual file system. Each file in this file system represents one or more parameters, which can be read with the cat utility or modified with the echo command. Listing 3 shows how to query and enable a tunable parameter (in this case, enabling IP forwarding in the TCP/IP stack).

    [root@camus]# cat /proc/sys/net/ipv4/ip_forward
    0
    [root@camus]# echo "1" > /proc/sys/net/ipv4/ip_forward
    [root@camus]# cat /proc/sys/net/ipv4/ip_forward
    1
    [root@camus]#

Listing 3. Tuning: enabling IP forwarding in the TCP/IP stack

As with any tuning effort, the best approach is to experiment continuously. Your application's behavior, the processor speed, and the amount of available memory all influence how these parameters affect performance. In some cases, a change you expect to help can hurt (and vice versa). Therefore, try the options one at a time and check the result after each change. In other words, trust your experience, but verify every change.

Tip: A word about persistent configuration. If you reboot a GNU/Linux system, any tunable kernel parameters you changed revert to their defaults. To make your values persistent, use /etc/sysctl.conf to set these parameters to your values at boot time.

GNU/Linux tools

GNU/Linux is very attractive to me because of the many tools available for it. Although most of them are command-line tools, they are both very useful and intuitive. GNU/Linux offers several tools (some shipped with GNU/Linux itself, others available as open source software) for debugging network applications, measuring bandwidth and throughput, and checking link usage.

Ping is the most common tool for checking host availability, but it can also be used to determine the RTT for bandwidth-delay product calculations.

Traceroute prints the path (route) through a series of routers and gateways to a network host, and shows the latency at each hop.

Netstat reports statistics about the network subsystem, protocols, and connections.

Tcpdump shows protocol-level packet traces for one or more connections. It also includes timing information, which you can use to study the packet timing of different protocol services.

Netlog provides some network performance information for applications.

Nettimer generates a metric for the bandwidth of the bottleneck link. It can be used for automatic protocol tuning.

Ethereal provides the features of tcpdump (packet tracing) in an easy-to-use graphical interface.

Iperf measures network performance for TCP and UDP, measures maximum bandwidth, and reports delay jitter and datagram loss.

Conclusion

Try the tips and techniques described in this article to improve the performance of your socket applications: disable the Nagle algorithm to reduce transmission latency, set the buffer sizes to improve socket bandwidth utilization, minimize the number of system calls to reduce system call load, and tune the Linux TCP/IP stack with tunable kernel parameters.

Always consider the nature of your application when tuning. For example, does your application communicate over a LAN or over the Internet? If it operates only within a LAN, increasing the socket buffer sizes may not bring much improvement, but enabling jumbo frames certainly will!

Finally, always check the results of your tuning with tcpdump or Ethereal. The changes you see at the packet level help demonstrate that tuning with these techniques succeeded.
