Analysis of UDP Packet Loss on Linux

Source: Internet
Author: User

https://www.tuicool.com/articles/7ni2yyr

Recently at work I ran into UDP packet loss in a server application. While troubleshooting I went through a lot of material, and I have summarized it in this article for others' reference.

Before we begin, here is how a Linux system receives a network packet. First, the packet arrives over the physical network at the NIC. The NIC driver copies it into the ring buffer; this step uses DMA (Direct Memory Access) and does not involve the CPU. The kernel then reads the packet from the ring buffer, runs the IP and TCP/UDP layer logic, and finally places the packet in the application's socket buffer. The application reads the packet from the socket buffer and processes it.

While a UDP packet is being received, any stage of this path may discard it, actively or passively. Packet loss can therefore occur in the NIC and its driver, in the operating system, or in the application.

The send path is not analyzed here because it mirrors the receive path in the opposite direction, and packets are less likely to be lost when sending than when receiving: loss on send only occurs when the application sends faster than the kernel and NIC can process.

This article assumes the machine has a single network interface named eth0. If there is more than one interface, or the interface is not named eth0, adapt the commands to your situation.

Note: In this article, RX (receive) refers to received packets and TX (transmit) refers to sent packets.

Confirm That UDP Packet Loss Is Occurring

To see whether the NIC is dropping packets, run ethtool -S eth0 and look for fields in the output whose names contain bad or drop. Under normal circumstances these counters should be 0. If a counter keeps growing, the NIC is dropping packets.

Another command that shows packet loss data is ifconfig. Its output includes RX (received packets) and TX (transmitted packets) statistics:

~# ifconfig eth0
...
        RX packets 3553389376  bytes 2599862532475 (2.3 TiB)
        RX errors 0  dropped 1353  overruns 0  frame 0
        TX packets 3479495131  bytes 3205366800850 (2.9 TiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
...
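For monitoring from a program rather than by eye, the same per-interface counters are exposed by Linux as small text files under /sys/class/net/<interface>/statistics/. The following is a minimal sketch of reading one of them; the helper names read_counter and stat_path are illustrative and not from the original article.

```c
#include <stdio.h>

/* Read one unsigned counter from a text file such as
 * /sys/class/net/eth0/statistics/rx_dropped.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_counter(const char *path, unsigned long long *value)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    int matched = fscanf(fp, "%llu", value);
    fclose(fp);
    return matched == 1 ? 0 : -1;
}

/* Build the sysfs path of a counter for one interface (hypothetical helper).
 * Returns 0 on success, -1 if the buffer is too small. */
int stat_path(char *buf, size_t len, const char *iface, const char *counter)
{
    int n = snprintf(buf, len, "/sys/class/net/%s/statistics/%s", iface, counter);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

Sampling rx_dropped twice, a few seconds apart, and comparing the two values shows whether the counter is still growing.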

In addition, Linux keeps packet loss statistics for each network protocol. They can be viewed with netstat -s; adding --udp restricts the output to UDP-related statistics:

[root@holodesk02 god]# netstat -s -u
IcmpMsg:
    InType0: 3
    InType3: 1719356
    InType8: 13
    OutType0: 13
    OutType3: 1737641
    OutType8: 10
    OutType11: 263
Udp:
    517488890 packets received
    2487375 packets to unknown port received.
    47533568 packet receive errors
    147264581 packets sent
    12851135 receive buffer errors
    0 send buffer errors
UdpLite:
IpExt:
    OutMcastPkts: 696
    InBcastPkts: 2373968
    InOctets: 4954097451540
    OutOctets: 5538322535160
    OutMcastOctets: 79632
    InBcastOctets: 934783053
    InNoECTPkts: 5584838675

In the output above, the following fields are the ones to watch for UDP packet loss:

  • packet receive errors: if this is non-zero and keeps increasing, the system is dropping UDP packets.
  • packets to unknown port received: the system received UDP packets whose destination port has no application listening on it. This usually means a service has not been started and does not cause serious problems.
  • receive buffer errors: the number of packets dropped because the UDP receive buffer was too small.
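netstat -s takes these UDP numbers from /proc/net/snmp, where each protocol has a header line of field names followed by a line of values. In that file the three fields discussed above appear as InErrors, NoPorts, and RcvbufErrors. The sketch below looks up one field by name from such a pair of lines; snmp_field is a hypothetical helper, and reading the two lines out of the file is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Look up one named field in a header/value line pair as found in
 * /proc/net/snmp, for example:
 *   "Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors"
 *   "Udp: 517488890 2487375 47533568 147264581 12851135 0"
 * Returns 0 and stores the value on success, -1 if the field is absent. */
int snmp_field(const char *header, const char *values,
               const char *name, unsigned long long *out)
{
    char key[64], val[64];
    int hn = 0, vn = 0;

    /* Walk both lines token by token: token i of the header names token i
     * of the values. The first token of each line is the "Udp:" prefix. */
    while (sscanf(header, "%63s%n", key, &hn) == 1 &&
           sscanf(values, "%63s%n", val, &vn) == 1) {
        if (strcmp(key, name) == 0)
            return sscanf(val, "%llu", out) == 1 ? 0 : -1;
        header += hn;
        values += vn;
    }
    return -1;
}
```

Matching by name rather than by column position keeps the lookup working on kernels that append extra fields to the Udp line.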

Note: A non-zero loss count does not by itself mean there is a problem. For UDP, a small amount of packet loss may well be expected behavior, for example a loss rate (packets lost / total packets) of one in 10,000 or lower.

NIC or Driver Packet Loss

As mentioned earlier, if ethtool -S eth0 shows non-zero rx_***_errors counters, the NIC itself is very likely faulty and is causing the system to lose packets. Contact the server or NIC vendor to have it dealt with.

# ethtool -S eth0 | grep rx_ | grep errors
     rx_crc_errors: 0
     rx_missed_errors: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     rx_errors: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_frame_errors: 0
     rx_fifo_errors: 0

netstat -i also reports packet and drop counts for each interface. In normal output the error and drop counts should be 0.

If neither the hardware nor the driver is at fault, NIC packet loss is usually caused by a ring buffer that is too small. The ethtool command can be used to view and set the NIC's ring buffer.

ethtool -g shows the ring buffer of a NIC, as in the following example:

# ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX:		4096
RX Mini:	0
RX Jumbo:	0
TX:		4096
Current hardware settings:
RX:		256
RX Mini:	0
RX Jumbo:	0
TX:		256

Pre-set maximums shows the largest ring buffer the NIC supports, and Current hardware settings shows the size currently in use. The size can be changed with ethtool -G, up to the pre-set maximum; for the NIC above, ethtool -G eth0 rx 4096 raises the RX ring to its maximum.

Linux System Packet Loss

The Linux system itself can drop packets for many reasons. Common ones are UDP packet errors, firewalls, insufficient UDP buffer size, and excessive system load. Each is analyzed below.

UDP Packet Errors

If a UDP packet is modified in transit, it will have a checksum error or a length error. Linux checks for these when it receives a UDP packet and discards the packet if it finds an error.

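If you want packets to reach the application even when their checksum is wrong, the UDP checksum can be disabled with the SO_NO_CHECK socket option. Note that on Linux this option is set on the sending socket: it stops the kernel from generating a checksum for outgoing IPv4 datagrams, and the receiver skips verification for datagrams that carry no checksum.

int disable = 1;
setsockopt(sock_fd, SOL_SOCKET, SO_NO_CHECK, (void *)&disable, sizeof(disable));

Firewalls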

If the system firewall is dropping packets, the usual symptom is that no UDP packets are received at all, although a firewall that drops only some of the packets cannot be ruled out.

If the loss rate is very high, check the firewall rules to make sure the firewall is not actively dropping UDP packets.

Insufficient UDP Buffer Size

After Linux receives a packet, it stores the packet in a buffer. Because the buffer has a limited size, packets that are too large (exceeding the buffer size or the MTU) or that arrive too quickly can fill the buffer, and Linux then drops packets outright.

At the system level, Linux limits how large a receive buffer can be configured. The limits can be viewed in the files below; Linux typically sets the initial values at boot based on memory size.

  • /proc/sys/net/core/rmem_max: maximum allowed receive buffer size
  • /proc/sys/net/core/rmem_default: default receive buffer size
  • /proc/sys/net/core/wmem_max: maximum allowed send buffer size
  • /proc/sys/net/core/wmem_default: default send buffer size

However, these initial values are not meant for heavy UDP traffic. If the application sends and receives a large volume of UDP packets, the values need to be increased. The sysctl command makes a change take effect immediately:

sysctl -w net.core.rmem_max=26214400 # set to 25 MB

You can also set the corresponding parameters in /etc/sysctl.conf so that they remain in effect after the next boot.

If packets are too large, the sender can split the data so that each packet fits within the MTU.
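As a rough illustration of sender-side splitting, the sketch below computes how much payload fits in one datagram and how a larger message divides into chunks. It assumes IPv4 with a 20-byte header (no options) plus the 8-byte UDP header; the function names are hypothetical, and the application still needs its own scheme for ordering and reassembling chunks at the receiver.

```c
#include <stddef.h>

/* Largest UDP payload that fits in one IPv4 packet without fragmentation,
 * assuming a 20-byte IPv4 header (no options) and an 8-byte UDP header.
 * Returns 0 if the MTU is too small to carry any payload. */
size_t max_udp_payload(size_t mtu)
{
    const size_t ipv4_header = 20, udp_header = 8;
    return mtu > ipv4_header + udp_header ? mtu - ipv4_header - udp_header : 0;
}

/* Number of datagrams needed to send len bytes in chunks of at most
 * chunk bytes. chunk must be non-zero. */
size_t chunk_count(size_t len, size_t chunk)
{
    return (len + chunk - 1) / chunk;
}

/* Length of chunk number index (0-based), or 0 if index is past the end. */
size_t chunk_len(size_t len, size_t chunk, size_t index)
{
    size_t offset = index * chunk;
    if (offset >= len)
        return 0;
    return len - offset < chunk ? len - offset : chunk;
}
```

With the common Ethernet MTU of 1500, max_udp_payload returns 1472, so a 4000-byte message would go out as three datagrams of 1472, 1472, and 1056 bytes.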

Another tunable parameter is netdev_max_backlog, the number of packets the Linux kernel can queue after reading them from the NIC driver. The default is 1000 and can be raised, for example to 2000:

sudo sysctl -w net.core.netdev_max_backlog=2000
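One way to judge whether this backlog is actually overflowing is /proc/net/softnet_stat, which holds one line of hexadecimal counters per CPU. The first column counts packets processed and the second counts packets dropped because the backlog queue was full. The sketch below parses those two columns from a single line; parse_softnet_line is a hypothetical helper, and the meaning of the later columns varies between kernel versions, so they are ignored here.

```c
#include <stdio.h>

/* Parse the first two hexadecimal columns of one /proc/net/softnet_stat
 * line: packets processed and packets dropped because the backlog queue
 * was full. Returns 0 on success, -1 if the line cannot be parsed. */
int parse_softnet_line(const char *line,
                       unsigned int *processed, unsigned int *dropped)
{
    return sscanf(line, "%x %x", processed, dropped) == 2 ? 0 : -1;
}
```

A dropped column that keeps growing on any CPU suggests that netdev_max_backlog is too small for the traffic.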

High System Load

High CPU, memory, or I/O load can each lead to network packet loss. If CPU load is too high, the system has no time to compute checksums, copy memory, and do the other work each packet requires, so packets are dropped at the NIC or in the socket buffer. If memory load is too high, the application processes packets too slowly and cannot keep up. If I/O load is too high, the CPU spends its time waiting on I/O and has none left to process the buffered UDP packets.

A Linux system is an interconnected whole, and a problem in any one component can affect the normal operation of the others. Excessive load means either that the application has a problem or that the system lacks resources. The former needs to be found, debugged, and fixed promptly; the latter needs to be detected promptly and addressed by adding capacity.

Application Packet Loss

The section on UDP buffer size explained that the sysctl parameters only set the maximum the system allows. Each application must set its own socket buffer size when it creates a socket.

Linux places received packets in the socket's buffer, and the application continuously reads packets from that buffer. Two application-side factors therefore affect whether packets are dropped: the size of the socket buffer and the speed at which the application reads packets.

For the first factor, the application can set the size of the socket receive buffer when it initializes the socket. For example, the following code sets the socket buffer to 20 MB:

int receive_buf_size = 20 * 1024 * 1024;  /* 20 MB; SO_RCVBUF takes an int */
setsockopt(socket_fd, SOL_SOCKET, SO_RCVBUF, &receive_buf_size, sizeof(receive_buf_size));
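A successful setsockopt call does not guarantee the requested size was applied: Linux silently caps the request at net.core.rmem_max and, as documented in socket(7), doubles the value it stores to leave room for bookkeeping overhead. A minimal sketch of reading the effective size back with getsockopt follows; set_rcvbuf is a hypothetical helper name.

```c
#include <sys/socket.h>

/* Request a receive buffer of bytes and return the size the kernel
 * actually applied, as reported by getsockopt, or -1 on error. */
int set_rcvbuf(int fd, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) != 0)
        return -1;
    return actual;
}
```

If the returned value is much smaller than twice the requested size, the request was probably capped, and rmem_max needs to be raised first.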

If the program is not one you write and maintain yourself, modifying its code is undesirable and may not even be possible. Many applications provide a configuration parameter for this value; refer to the official documentation. If no such parameter exists, the only option is to raise an issue with the program's developers.

Obviously, increasing the application's receive buffer reduces the likelihood of losing packets, but at the same time causes the application to use more memory, so it needs to be used with caution.

The other factor is the speed at which the application reads packets from the buffer. Applications should process packets asynchronously, so that slow processing does not hold up reading from the socket.

Where Packets Are Dropped

To find out which function the Linux kernel is executing when it drops a packet, you can use the dropwatch tool. It listens for packet drop events and prints the address of the function where each packet was dropped:

# dropwatch -l kas
Initalizing kallsyms db
dropwatch> start
Enabling ...
Kernel monitoring activated.
Issue Ctrl-C