LibPcap packet loss

Author: Yu Zhu

I recently investigated a high packet loss rate with LibPcap. Many people have reported this on the internet, but I always suspected that my problem was different from theirs.

Environment Description: Snapgear-3.5.0/kernel: linux-2.6.x/uClibc/Module: XSCALE/Intel IXP400/LibPcap-0.9.2/Snort-2.6.1.1

Test process: I first set the board to transparent bridge mode and ran Snort in logging mode (snort -A none -N). I then ran the Chariot TCP/High_Performance script from eth1 (PC1) to eth2 (PC2). The average throughput was about 93 Mbps. When Snort was interrupted after the whole script had run, it displayed Dropped: ≈ 86%. A loss rate that bad left me no choice but to investigate.

I went into snapgear/user/snort/src, opened the C sources, and traced the Dropped figure to DropStats(). Both "Snort received" and "Dropped" turned out to come from pcap_stats(), so I had a bad feeling about where this was heading.
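To make the 86% figure concrete, here is a minimal sketch of turning two such counters into a drop percentage. The struct capture_counters and both function names are hypothetical; the real counters live in struct pcap_stat from <pcap.h>. The sketch assumes the received count already includes the dropped packets, and it is not a copy of Snort's own calculation.

```c
#include <stdio.h>

/* Hypothetical mirror of the two counters of interest; the real type is
 * struct pcap_stat (fields ps_recv and ps_drop) from <pcap.h>. */
struct capture_counters {
    unsigned int recv; /* packets handed to the capture socket */
    unsigned int drop; /* packets dropped for lack of buffer space */
};

/* Drop rate in percent.
 * Assumption: recv already includes the dropped packets.
 * Returns 0.0 when nothing has been received. */
double drop_percent(const struct capture_counters *c)
{
    if (c->recv == 0)
        return 0.0;
    return 100.0 * (double)c->drop / (double)c->recv;
}

/* Print a one-line summary; the output format is illustrative only. */
void print_drop_summary(const struct capture_counters *c)
{
    printf("received: %u, dropped: %u (%.2f%%)\n",
           c->recv, c->drop, drop_percent(c));
}
```

With recv = 1000 and drop = 860, drop_percent() gives 86%, the order of magnitude seen in the test.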

For more information about LibPcap packet loss, see "Improving Passive Packet Capture: Beyond Device Polling" (available at http://luca.ntop.org). But what exactly did the pioneers find? I had to check for myself.

I then commented out pcap_setfilter() in OpenPcap() in snapgear/user/snort/src/snort.c and tested again: the result was the same. Next I made PcapProcessPacket() in the same file return immediately, and the result still did not change. That was disappointing. Did I really have to go into LibPcap? There was no other way, so I took a look.

I searched through snapgear/lib/libpcap/ and eventually found that pcap_stats() ends up in pcap_stats_linux() in pcap-linux.c. I read the comments there and confirmed them by debugging. Good grief, did I really have to read the kernel? As the saying goes, "put yourself in a desperate position and you will survive"; I was already on this path.

Without thinking too much, I followed the comments and searched the whole source tree for "tp_drops", which led me to packet_rcv() in snapgear/linux-2.6.x/net/packet/af_packet.c. The suspect was:

    if (atomic_read(&sk->sk_rmem_alloc) + skb->truesize >=
        (unsigned)sk->sk_rcvbuf)
            goto drop_n_acct;
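The check above can be modelled in user space to see how little room a small receive buffer leaves. This is only a sketch: would_drop() and packets_that_fit() are hypothetical helper names, plain size_t values stand in for the kernel's atomic counter, and the packet size used in the example is an assumed value.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space model of the admission check; this is not kernel code.
 * rmem_alloc: bytes currently charged to the socket (sk_rmem_alloc)
 * truesize:   memory footprint of the incoming packet (skb->truesize)
 * rcvbuf:     receive buffer limit (sk_rcvbuf) */
bool would_drop(size_t rmem_alloc, size_t truesize, size_t rcvbuf)
{
    return rmem_alloc + truesize >= rcvbuf;
}

/* Number of equally sized packets that pass the check before the first
 * drop, assuming the reader never drains the queue. */
size_t packets_that_fit(size_t truesize, size_t rcvbuf)
{
    size_t used = 0;
    size_t n = 0;

    if (truesize == 0)
        return 0; /* avoid an endless loop on a degenerate input */
    while (!would_drop(used, truesize, rcvbuf)) {
        used += truesize;
        n++;
    }
    return n;
}
```

For an assumed truesize of 2048 bytes, a 61440-byte limit admits 29 packets before the first drop, while a 10485760-byte limit admits 5119.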

Debugging confirmed the suspicion, and also showed that sk_rmem_alloc suddenly drops to zero from time to time. So why does the space accounted for by sk_rmem_alloc run out? To answer that I had to figure out how sk_rmem_alloc is released under normal circumstances. I cursed atomic_read() and its atomic operations, but I have to thank it: while looking it up I found its sibling atomic_sub(), and through that sock_rfree(). Debugging confirmed that sk_rmem_alloc is indeed released by this function. But when does it get called? I really did not know much about Linux!

Precisely because I knew so little, surprises were easy to come by: so many inline functions are defined in header files! sock_rfree() is attached to skb->destructor by static inline void skb_set_owner_r(struct sk_buff *skb, struct sock *sk) in snapgear/linux-2.6.x/include/net/sock.h. Searching for callers of the destructor in the clumsiest possible way, I arrived at __kfree_skb() and then at its nearest caller, kfree_skb(). As it turned out, a fool trying to be clever usually pays for it: the lovely kfree_skb() is called everywhere. What now? I even regretted diving so deep. I calmed down and looked for a new angle.
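The charge-and-release mechanism described above can be sketched as a small user-space model. All type and function names below (skb_model, sock_model, model_*) are hypothetical stand-ins for struct sk_buff, struct sock, sock_rfree(), skb_set_owner_r() and kfree_skb(); the real kernel code uses atomic operations and does much more.

```c
#include <stddef.h>

struct sock_model;

/* Simplified stand-in for struct sk_buff. */
struct skb_model {
    size_t truesize;                           /* bytes charged for this packet */
    struct sock_model *sk;                     /* owning socket, if any */
    void (*destructor)(struct skb_model *skb); /* called when the packet is freed */
};

/* Simplified stand-in for struct sock. */
struct sock_model {
    size_t rmem_alloc; /* plays the role of sk_rmem_alloc */
};

/* Counterpart of sock_rfree(): give the bytes back to the socket. */
void model_sock_rfree(struct skb_model *skb)
{
    skb->sk->rmem_alloc -= skb->truesize;
}

/* Counterpart of skb_set_owner_r(): charge the socket and arm the destructor. */
void model_set_owner_r(struct skb_model *skb, struct sock_model *sk)
{
    skb->sk = sk;
    skb->destructor = model_sock_rfree;
    sk->rmem_alloc += skb->truesize;
}

/* Counterpart of kfree_skb(): run the destructor once, if one is set. */
void model_kfree_skb(struct skb_model *skb)
{
    if (skb->destructor)
        skb->destructor(skb);
    skb->destructor = NULL;
    skb->sk = NULL;
}
```

In this model the counter only goes down when a packet is freed, which is why finding the callers of kfree_skb() matters for understanding when buffer space comes back.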

I started over from pcap_open_live() to see how the handle is obtained and how the socket is created. When I reached socket() I rushed into the kernel again, but could not find a definition of socket(). I was confused once more; frankly, I knew nothing about system calls before this. I searched for material and found that it came from 9-jin. I really appreciate his work and would like to recommend his forum, http://www.skynet.org.cn/. Its "Linux kernel exploration" section describes sockets. sys_socketcall() in snapgear/linux-2.6.x/net/socket.c is the entry point for all socket-related system calls, and that file defines many of them. There I also found sys_socket() and confirmed that the socket created by LibPcap goes through this function. When I got to __sock_create(), though, I was lost; I could not make sense of it in a short time, so I turned back.

Since pcap_open_live() went too deep, I tried pcap_dispatch() instead. Tracing pcap_read_packet() in snapgear/lib/libpcap/pcap-linux.c showed that the packet is obtained with recvfrom() before callback() hands it to the user program. Frustratingly, there was again no definition to be found, because this is another system call. Thanks again to the author of "UDP Socket Creation": after reading that article I fixed on sk->sk_prot->recvmsg. recvmsg turns up everywhere. Because LibPcap selects type SOCK_RAW when creating its socket, raw_recvmsg() in snapgear/linux-2.6.x/net/ipv4/raw.c looked like the match: its struct proto raw_prot appears in static struct inet_protosw inetsw_array[] in snapgear/linux-2.6.x/net/ipv4/af_inet.c, and the inet_dgram_ops.recvmsg that ops points to is exactly sock_common_recvmsg. I celebrated too early. Debugging disappointed me again: when sys_recvfrom() in snapgear/linux-2.6.x/net/socket.c calls sock_recvmsg(), which in turn calls __sock_recvmsg(), sock->ops->recvmsg is in many cases not equal to sock_common_recvmsg. The fog closed in again.

I stared at packet_rcv() for a long time without finding a better angle. Clinging to recvmsg as a last straw, I searched for it once more and at last found .recvmsg = packet_recvmsg in snapgear/linux-2.6.x/net/packet/af_packet.c. I debugged by printing the function address: it matched! Even better, packet_recvmsg() ends by calling skb_free_datagram(), and snapgear/linux-2.6.x/net/core/datagram.c shows that this simply calls kfree_skb(). Debugging confirmed it!

At this point both ends of the LibPcap capture path had been found. Everything above is little more than a record of what I did while looking for these two doors and of the stupid mistakes I made along the way. Its purpose is to warn others who, like me, do not know Linux against repeating them, and I hope the experts will not hesitate to correct me.

Conclusion: LibPcap creates its socket by calling the socket() system call from pcap_open_live(). Via sys_socketcall(), socket() reaches sys_socket() -> sock_create() -> __sock_create() -> rcu_dereference(net_families[family]), and the create function of the requested protocol family is executed. The PF_PACKET family used by LibPcap is registered in net_families by packet_init() in af_packet.c, which calls sock_register() in snapgear/linux-2.6.x/net/socket.c, with .create = packet_create. LibPcap therefore ends up calling packet_create(), which creates sk and sets sock->ops = &packet_ops and po->prot_hook.func = packet_rcv. In static const struct proto_ops packet_ops, .recvmsg = packet_recvmsg is the entry point through which a user program obtains packets from the socket via LibPcap.

The whole process by which a user program obtains packets through LibPcap can therefore be summarized as follows. packet_rcv() receives packets from the lower layer (I have not worked out exactly where) and allocates buffer space for them. When sk_receive_queue does not have enough room for the next packet, the packet is discarded directly with kfree_skb() and counted in the ps_drop value returned by pcap_stats(). The user program calls packet_recvmsg() from time to time to take one packet at a time from the queue, and finally releases the resource with skb_free_datagram().
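The registration and dispatch chain summarized above boils down to two tables of function pointers. The sketch below models that idea in user space. Every model_* name and the table size are hypothetical, the value 17 is the Linux value of PF_PACKET, and the real kernel code adds locking, RCU and module handling that are left out here.

```c
#include <stddef.h>

#define MODEL_NPROTO    32 /* size of the family table in this model */
#define MODEL_PF_PACKET 17 /* PF_PACKET is 17 on Linux */

struct model_socket;

/* Per-family operations, in the role of struct proto_ops. */
struct model_proto_ops {
    const char *(*recvmsg)(struct model_socket *sock);
};

struct model_socket {
    const struct model_proto_ops *ops;
};

/* One entry per protocol family, in the role of struct net_proto_family. */
struct model_net_proto_family {
    int family;
    int (*create)(struct model_socket *sock);
};

static const struct model_net_proto_family *model_net_families[MODEL_NPROTO];

/* In the role of sock_register(): install a family's create hook. */
int model_sock_register(const struct model_net_proto_family *fam)
{
    if (fam->family < 0 || fam->family >= MODEL_NPROTO)
        return -1;
    if (model_net_families[fam->family] != NULL)
        return -1; /* already registered */
    model_net_families[fam->family] = fam;
    return 0;
}

/* In the role of __sock_create(): look up the family and call its create. */
int model_sock_create(int family, struct model_socket *sock)
{
    if (family < 0 || family >= MODEL_NPROTO ||
        model_net_families[family] == NULL)
        return -1;
    return model_net_families[family]->create(sock);
}

/* In the role of packet_recvmsg(); returns its own name for demonstration. */
const char *model_packet_recvmsg(struct model_socket *sock)
{
    (void)sock;
    return "packet_recvmsg";
}

static const struct model_proto_ops model_packet_ops = {
    .recvmsg = model_packet_recvmsg,
};

/* In the role of packet_create(): bind the socket to the packet ops table. */
int model_packet_create(struct model_socket *sock)
{
    sock->ops = &model_packet_ops;
    return 0;
}

static const struct model_net_proto_family model_packet_family = {
    .family = MODEL_PF_PACKET,
    .create = model_packet_create,
};

/* In the role of packet_init(): register the packet family. */
int model_packet_init(void)
{
    return model_sock_register(&model_packet_family);
}
```

After model_packet_init(), a socket created with model_sock_create(MODEL_PF_PACKET, &s) has s.ops->recvmsg pointing at model_packet_recvmsg, which is the same kind of pointer comparison that printing the function address confirmed in the kernel.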

That still does not answer the actual question: what causes LibPcap to lose packets? With the capture path understood, it is less of a mystery. Debugging shows that packet_recvmsg() runs far less often than packet_rcv(). Once packet_rcv() has received enough data to fill the space accounted for by sk_rmem_alloc, packet_recvmsg() cannot drain it in time, and during that gap packets can only be dropped. So why does packet_recvmsg() not run often enough? That is probably a lower-level problem, which I cannot explain here given my limited knowledge.
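The rate mismatch described above can be illustrated with a toy simulation. This is a sketch under simplifying assumptions, not a measurement: time advances in abstract ticks, all packets have the same size, and the names sim_result and simulate_capture are hypothetical.

```c
#include <stddef.h>

struct sim_result {
    unsigned long queued;  /* packets accepted into the buffer */
    unsigned long dropped; /* packets rejected because the buffer was full */
};

/* Toy model of the producer (packet_rcv) and the consumer (packet_recvmsg).
 * Each tick, arrivals_per_tick packets arrive. Every drain_every ticks the
 * reader removes up to drain_count packets. The admission rule is the same
 * comparison the kernel uses: drop when rmem_alloc + truesize >= rcvbuf. */
struct sim_result simulate_capture(size_t rcvbuf, size_t truesize,
                                   unsigned arrivals_per_tick,
                                   unsigned drain_every, unsigned drain_count,
                                   unsigned ticks)
{
    struct sim_result r = {0, 0};
    size_t rmem_alloc = 0;
    unsigned t, i;

    for (t = 1; t <= ticks; t++) {
        for (i = 0; i < arrivals_per_tick; i++) {
            if (rmem_alloc + truesize >= rcvbuf) {
                r.dropped++;
            } else {
                rmem_alloc += truesize;
                r.queued++;
            }
        }
        if (drain_every != 0 && t % drain_every == 0) {
            for (i = 0; i < drain_count && rmem_alloc >= truesize; i++)
                rmem_alloc -= truesize;
        }
    }
    return r;
}
```

With a buffer that holds ten 100-byte packets (rcvbuf = 1001), five arrivals per tick and a reader that removes four packets every second tick, six of the twenty packets in four ticks are dropped. A reader that drains five packets every tick drops none.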

As for a fix: since I do not know the underlying cause, I can only adjust what I do understand, which is to raise the receive buffer limit that sk_rmem_alloc is checked against, so that the extra space absorbs the bursts from packet_rcv(). This comes at a cost. In my test environment, with all the rules shipped with Snort enabled, raising the limit to 10 MB (echo 10485760 > /proc/sys/net/core/rmem_default && echo 10485760 > /proc/sys/net/core/rmem_max) gave Dropped: 0.00%, but the average throughput fell to ≈ 16 Mbps.
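The sysctl change above affects every socket on the system. A per-socket variant is sketched below using the standard SO_RCVBUF socket option. This is not what was tested in this article: grow_rcvbuf() is a hypothetical helper, applying it to a capture handle would require its descriptor (for example from pcap_fileno()), and the kernel still caps the request at rmem_max, so the granted size may differ from the requested one.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Ask the kernel for a larger receive buffer on a single socket.
 * Returns the size the kernel reports afterwards, or -1 on error.
 * The reported size may differ from the request because the kernel
 * applies its own limits and bookkeeping. */
int grow_rcvbuf(int fd, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof(granted);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0)
        return -1;
    return granted;
}
```

A caller would compare the returned value with the request to see whether rmem_max still needs to be raised.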

Closing remarks: this article is my set of notes on the problem, detours included. If you are interested in the issue, take me as a cautionary example. I hope readers will criticize and correct the mistakes in it. Taking so many detours naturally wasted a lot of valuable time. Thank you very much for your great help and patience. These are the reasons I decided to write this article.