About PF_RING/Intel82599/transparent VPN

Source: Internet
Author: User
Tags: macbook

I am close to the verge of collapse. This article was conceived in the hospital today, because I am ill again. I would rather be put on a drip than take medicine, but in the hospital I cannot get online with my notebook. I cannot do anything, yet there are things I want to find out. I only have 3G mobile data and do not dare to open a hotspot, because nobody reimburses my traffic. I have only one day this weekend; it rained, and I still have another night to go. After learning about PF_RING I was eager to run an experiment, so I ran home to verify it and then came back.
The cause is this.

A total of four problems

1. About a network accelerator card

A few days ago I came across a network accelerator card that plugs into a PCIe slot. The card runs its own independent Linux system and communicates with the host over PCIe. Its selling point is packet-capture performance, which makes it suitable for DPI or a transparent firewall. The manufacturer's engineers demonstrated a firewall function for us in person: one particular website could not be reached, while everything else worked. What I found interesting is that the card has two 10-Gigabit optical ports, one connected to a PC with Internet access and the other to the egress switch. Apart from that one blocked website the PC could reach everything, yet the two optical ports were neither joined by a bridge nor assigned an IP address. How can a card pass traffic in one port and out the other with no bridge and no IP address? That is simply not a network device in the usual sense. So what is it?
2. A recent company evaluation

Recently our company's equipment went through an evaluation, and the results met the standard. Personally I felt there was still room for improvement, but the evaluation took place in Beijing and I could not be there, which I can only regret. Besides, the tester was not the kind of person who loves technology, and our people simply cooperated with him throughout. The outcome is easy to imagine: they would never let us use their powerful performance-testing instruments as toys.
3. The company's NAS online behavior management

If I have browsed a not-so-elegant website at work, I certainly do not want the NAS to record it. Otherwise the administrators can look it up at any time and laugh at me behind my back. Administrators do have such tastes; after all, these days network management holds enormous power over privacy. Later I heard that this device is connected inline on the main link. As a network-savvy and curious R&D engineer, I naturally want to know how it achieves layer-2 monitoring. If it worked on mirrored traffic, there would be nothing new. The point is that it is not a bypass device but an inline one, meaning that every packet really passes through its processing logic. So if it does not want you to access XXYY, it can go beyond monitoring and act as a firewall (if I were in charge of centralized network management, I would certainly allow all access and only monitor, to have more fun watching). This requires line-rate forwarding, which ordinary Linux cannot deliver. How is such high-performance forwarding achieved, given that the device is still a layer-2 device?
4. VPN transparency

This one is embarrassing. When I deployed my VPN last year, I told the customer that our product was a layer-2 device. The customer was relieved, because a layer-2 device needs no network parameters configured and would save a lot of maintenance time. I thought so too, but he was wrong and so was I. After the deployment I went home happily, and for the next month, even several months, I was harassed by countless "routing problems in bridge mode". So I once again played the implementation engineer and explained to the customer: "We pull the traffic we are interested in up to layer 3, and the traffic we are not interested in passes straight through at layer 2..." Saying this was a slap in my own face, because it was clearly an after-the-fact explanation. Why didn't we say so in the first place? If the customer had understood the technology, I think this would have been obvious...
Explanations of these problems

Those are the questions I have raised. For many people they are not problems at all, and in fact they are not problems for me either. For many others, however, a clear explanation is needed.
One-in, one-out devices and PF_RING

Except for question 2, the device or card in each of the questions above is a one-in, one-out device. The equipment in question 2 is not such a device; it can be regarded as a server rather than a forwarding device. It is included among the questions so that I can use it to talk about performance tuning in general.
1. PF_RING uses mmap to place raw network data where a user-mode program can access it directly, rather than copying the data through memory via the socket read/write mechanism;
2. PF_RING supports the following three methods of placing raw data into the user-space ring buffer via mmap, plus the DNA mode:
2.1. Capture packets from the netif_receive_skb function, in the same way as the PACKET socket. This method is compatible with the PACKET socket; the difference is that packets no longer reach user space through socket I/O but through mmap;
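The idea behind point 1 can be sketched with a toy ring buffer. This is only an illustration of "the consumer reads packets in place instead of receiving a copy", not PF_RING's actual ring layout: the names `struct ring`, `ring_produce`, `ring_peek` and `ring_release` are hypothetical, the ring is an ordinary struct rather than a region mmap'ed from the kernel, and it is single-threaded with no memory barriers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_COUNT 8u     /* power of two, so wrapping is a cheap mask */
#define SLOT_BYTES 1518u  /* room for one full-size Ethernet frame */

struct slot {
    uint32_t len;
    uint8_t  data[SLOT_BYTES];
};

/* In a real capture ring this memory would be shared between the
 * kernel (producer) and the application (consumer) through mmap. */
struct ring {
    uint32_t head;  /* next slot the producer fills */
    uint32_t tail;  /* next slot the consumer reads */
    struct slot slots[SLOT_COUNT];
};

/* Producer side (stands in for the driver): store one frame in the ring.
 * Returns 0 on success, -1 if the frame is too large or the ring is full. */
int ring_produce(struct ring *r, const uint8_t *frame, uint32_t len)
{
    if (len > SLOT_BYTES || r->head - r->tail == SLOT_COUNT)
        return -1;
    struct slot *s = &r->slots[r->head & (SLOT_COUNT - 1)];
    memcpy(s->data, frame, len);
    s->len = len;
    r->head++;
    return 0;
}

/* Consumer side: return a pointer INTO the ring, with no second copy.
 * Returns NULL when the ring is empty. */
const struct slot *ring_peek(const struct ring *r)
{
    if (r->head == r->tail)
        return NULL;
    return &r->slots[r->tail & (SLOT_COUNT - 1)];
}

/* Consumer hands the slot back once it has finished with the packet. */
void ring_release(struct ring *r)
{
    if (r->head != r->tail)
        r->tail++;
}
```

With socket I/O, the consumer would instead call something like read() and receive its own copy of every packet; here the only per-packet work on the consumer side is advancing an index.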


Unfortunately, method 2.4 is not free to use. Under its license it can be downloaded free of charge, but only as a test library in binary form; for long-term use you need to purchase an unlock code. It is a pity, but understandable: the developers also need money to continue their work.
Behind PF_RING

Many people think of PF_RING as nothing more than a high-performance packet-capture mechanism that lets the local machine analyze mirrored packets for network auditing. That is only the traditional interpretation. PF_RING goes further and overturns the way intermediate network nodes interpret packets. In the traditional view an intermediate node can only parse packets layer by layer in the protocol stack: a router is a layer-3 device, a switch is a layer-2 device, firewalls are classified as layer-2 or layer-3 firewalls, and so on. A PF_RING device simply DMAs packets from the NIC chip into memory on your machine. That is all. You can then process the packets in an application instead of the kernel protocol stack. As for how your application processes them, here are some possibilities:
1. Parse packets in depth, reconstruct sessions at whatever granularity you can think of, and record audit information;
 


PF_RING fits the multi-queue feature of modern high-end NICs well. That is true, but consider a NIC without multi-queue support handled by the current Linux kernel protocol stack. Suppose a specific CPU takes the interrupt (Linux's interrupt balancing is, frankly, not impressive) and the NIC does not support interrupt balancing; the softirq it triggers will generally run on that same CPU as well. There is nothing you can do about it; even with an 8-core CPU, what good does it do? With PF_RING, however, one NIC can map to multiple rings, so you can place different flows, or even different packets, on different rings, and have each ring processed by an application bound to a specific CPU. This effectively pushes multi-queue up one level, as shown in the figure.
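Placing "different flows on different rings" needs some rule that maps a packet to a ring index. A minimal sketch of one such rule follows, assuming a symmetric hash over the 5-tuple so that both directions of a connection land on the same ring. `struct flow`, `mix32` and `pick_ring` are hypothetical names for illustration; this is not how PF_RING or any particular NIC computes its distribution.

```c
#include <stdint.h>

/* Hypothetical flow key: the classic 5-tuple. */
struct flow {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

/* Integer mixer (the finalizer step of MurmurHash3). */
static uint32_t mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x85ebca6bu;
    x ^= x >> 13;
    x *= 0xc2b2ae35u;
    x ^= x >> 16;
    return x;
}

/* Map a flow to one of num_rings rings. XOR is commutative, so swapping
 * source and destination yields the same key: A->B and B->A packets are
 * handled by the same ring, and therefore by the same pinned CPU. */
unsigned pick_ring(const struct flow *f, unsigned num_rings)
{
    if (num_rings == 0)
        return 0;
    uint32_t ips   = f->src_ip ^ f->dst_ip;
    uint32_t ports = (uint32_t)(f->src_port ^ f->dst_port);
    uint32_t key   = mix32(ips) ^ mix32((ports << 8) | f->proto);
    return mix32(key) % num_rings;
}
```

Each ring would then be drained by one worker bound to its own CPU, so per-flow state never has to be shared between cores.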
 


For a non-forwarding device such as an application server, where traffic terminates locally, the future architecture may look as shown in the figure.
 


Looking at the figure above, you may ask how the device knows that a packet is addressed to the local machine. In fact, that is not this device's responsibility. How do you even know that what I DMA into the ring buffer is an Ethernet frame rather than a bare HTTP message? In short, nothing comes ready-made, incredible as that may sound. The best thing about PF_RING is not what it implements but that it gives you a mechanism for implementing things yourself. What you can build within the PF_RING framework is limited only by your imagination, which is also why so many people think PF_RING can only capture packets.
Traditional mistakes

This is the same mistake as asking "how does an SDN switch handle IP-layer routing" in an SDN environment. I made it too: I asked the vendor of the network accelerator card from question 1 for a so-called "user-mode protocol stack". Network protocols used to be handled in a protocol stack, so even with network processing moved to user space I still wanted a user-mode protocol stack for peace of mind. Whether the network is processed in user space or in the kernel, I thought, the key is to have a stack so that it is processed properly. The answer I got was: "If you need to develop your own protocol stack, we will fully cooperate and support you..." Good grief, do I have to study the RFCs and write code according to their mandatory and recommended requirements? How would I handle complex IP routing and the TCP state machine? And TLS... is there really no ready-made user-mode protocol stack?
Implement a transparent forwarding device

The sections above cover a lot of theory and insight, along with some of the reasons for PF_RING's high performance. Often, though, we care more about how to actually do something, that is, about the interfaces and how to use them. If you can also trust the underlying implementation and understand its principles in depth, you are basically invincible (although in reality many people are satisfied with knowing the interfaces). In this section I implement a simple transparent forwarding device, in other words an expensive network cable. It does nothing but forward, plus a tiny bit of filtering; the purpose is to show how the interfaces are used.
The topology is very simple. My iMac has Wi-Fi turned off and is connected to my MacBook by a network cable; the MacBook connects to the router over Wi-Fi. The MacBook hosts a Linux virtual machine with two NICs, both in bridged mode: one bridged to Wi-Fi and one bridged to Ethernet, as shown in the figure.


 

With only one computer, this topology is really difficult to build. Don't tell me to use VMware's LAN Segment. I hate that kind of thing.

The preceding topology configuration is as follows:
iMac en0: 192.168.1.200/24, default gateway: 192.168.1.1
It's clear.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <getopt.h>
    #include <pfring.h>

    int main(int argc, char *argv[])
    {
        pfring *pfring_net1 = NULL, *pfring_net2 = NULL;
        char *dev1 = NULL, *dev2 = NULL;
        int c;  /* getopt_long() returns int, not char */
        struct option opts[] = {
            { .name = "net1", .has_arg = 1, .val = 'i' },
            { .name = "net2", .has_arg = 1, .val = 'o' },
            { NULL }
        };

        while ((c = getopt_long(argc, argv, "i:o:", opts, NULL)) != -1) {
            switch (c) {
            case 'i': dev1 = strdup(optarg); break;
            case 'o': dev2 = strdup(optarg); break;
            }
        }
        if (dev1 == NULL || dev2 == NULL)
            goto end;

        /* PF_RING 5.x form: pfring_open(device, caplen, flags).
         * The flag for net2 is garbled in the source; PF_RING_PROMISC is assumed. */
        pfring_net1 = pfring_open(dev1, 1518, PF_RING_PROMISC);
        pfring_net2 = pfring_open(dev2, 1518, PF_RING_PROMISC);
        if (pfring_net1 == NULL || pfring_net2 == NULL)
            goto end;

        /* Only ARP, TCP and UDP arriving on net1 are forwarded. */
        if (pfring_set_bpf_filter(pfring_net1, "arp or tcp or udp"))
            goto end;

        /* Capture received frames only, so frames we transmit are not read
         * back and looped. The direction for net2 is garbled in the source;
         * rx_only_direction is assumed. */
        if (pfring_set_direction(pfring_net1, rx_only_direction) ||
            pfring_set_direction(pfring_net2, rx_only_direction))
            goto end;

        if (pfring_enable_ring(pfring_net1) || pfring_enable_ring(pfring_net2))
            goto end;

        while (1) {
            unsigned char *pkt;
            struct pfring_pkthdr ring_hdr;

            /* Non-blocking receive; a result > 0 means a packet was returned. */
            if (pfring_recv(pfring_net1, &pkt, 0, &ring_hdr, 0) > 0)
                pfring_send(pfring_net2, (char *)pkt, ring_hdr.caplen, 1);
            if (pfring_recv(pfring_net2, &pkt, 0, &ring_hdr, 0) > 0)
                pfring_send(pfring_net1, (char *)pkt, ring_hdr.caplen, 1);
        }

    end:
        if (pfring_net1)
            pfring_close(pfring_net1);
        if (pfring_net2)
            pfring_close(pfring_net2);
        return 0;
    }
    gcc test.c -o test -lpcap -lpfring -lrt

Run the compiled program:

    ./test -i eth0 -o eth1

Then open a web page on the iMac: it works fine. Ping Baidu? No! Why? Because of this line:

    pfring_set_bpf_filter(pfring_net1, "arp or tcp or udp")

Only arp, tcp, and udp are allowed. icmp is not allowed.
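To make the effect of that filter concrete, here is a hand-written sketch of roughly what "arp or tcp or udp" decides for an untagged Ethernet II frame. It is an approximation for illustration only: the real filter is compiled to BPF by libpcap, and the function name `filter_accepts` is hypothetical. VLAN tags, IPv4 fragments and IPv6 extension headers are ignored here.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the frame would pass "arp or tcp or udp", 0 otherwise.
 * Simplified: untagged Ethernet II only, and for IPv6 only the first
 * Next Header field is examined. */
int filter_accepts(const uint8_t *frame, size_t len)
{
    if (len < 14)                        /* shorter than an Ethernet header */
        return 0;

    unsigned ethertype = ((unsigned)frame[12] << 8) | frame[13];

    if (ethertype == 0x0806)             /* ARP */
        return 1;

    if (ethertype == 0x0800) {           /* IPv4 */
        if (len < 14 + 20)
            return 0;
        uint8_t proto = frame[14 + 9];   /* IPv4 Protocol field */
        return proto == 6 || proto == 17; /* TCP or UDP */
    }

    if (ethertype == 0x86DD) {           /* IPv6 */
        if (len < 14 + 40)
            return 0;
        uint8_t next = frame[14 + 6];    /* IPv6 Next Header field */
        return next == 6 || next == 17;
    }

    return 0;                            /* everything else is dropped */
}
```

An ICMP echo request is an IPv4 packet with protocol number 1, so it falls through to the final `return 0` and the ping never reaches the other side. Adding "or icmp" to the filter string would let it pass.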
Modern Gigabit/10-Gigabit Ethernet cards: Intel 82599 as an example

If NetLib is the proprietary solution for the Tilera architecture, PF_RING is the corresponding general solution for x86. Intel 82599 10-Gigabit cards provide many new mechanisms as well as many extensions to old ones. Of all these, the most exciting is the refinement of multi-queue, as shown in the figure.

Combining multi-queue with PF_RING gives the arrangement shown in the following figure.


The software response to modern high-performance NICs: PF_RING DNA (Direct NIC Access)

The performance of NIC chips, CPUs, chipsets and buses has improved enormously. What about the software? Unfortunately, software seems to lag behind. PF_RING, however, goes one step further for you and eliminates even the one remaining memory copy: packets no longer need to be copied from the NIC into the mmap'ed memory that user space reads. Instead, when a packet is received it is placed directly at a designated location. Specifically, the memory on the NIC is mapped somewhere into the address space. Virtual memory really is a good thing.
PF_RING-based VPN Device

Now I can answer the VPN question. With PF_RING, I capture the packet into a user-space process, strip off its MAC header, and keep a backup copy of the IP header. I then encrypt the entire IP packet and either re-encapsulate the encrypted data with the saved IP header and MAC header (transport mode), or encapsulate it with a new IP header and the saved MAC header (tunnel mode). I could even encrypt only the TCP/UDP payload and keep the TCP/UDP header. My VPN device can thus encrypt and decrypt with great flexibility without being configured with an IP address; it really is an expensive network cable.
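The last variant above (transform only the TCP/UDP payload, keep every header) can be sketched as follows. This is a rough illustration under stated assumptions, not a VPN implementation: `transform_payload` is a hypothetical name, only untagged IPv4 frames are handled, and the XOR with a single key byte is a placeholder standing in for a real cipher and provides no security.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_LEN 14u  /* untagged Ethernet II header */

/* Transforms the TCP/UDP payload of an IPv4 frame in place and leaves the
 * MAC, IP and TCP/UDP headers untouched. Returns the number of payload
 * bytes processed, or -1 if the frame is not IPv4 TCP/UDP or is truncated.
 * XOR is a PLACEHOLDER for a real cipher. */
long transform_payload(uint8_t *frame, size_t len, uint8_t key)
{
    if (len < ETH_LEN + 20u || frame[12] != 0x08 || frame[13] != 0x00)
        return -1;

    size_t ihl = (size_t)(frame[ETH_LEN] & 0x0f) * 4u;  /* IP header length */
    if (ihl < 20u || ETH_LEN + ihl > len)
        return -1;

    size_t l4 = ETH_LEN + ihl;          /* start of the TCP/UDP header */
    uint8_t proto = frame[ETH_LEN + 9];
    size_t payload;

    if (proto == 6) {                   /* TCP: header length is variable */
        if (l4 + 20u > len)
            return -1;
        size_t doff = (size_t)(frame[l4 + 12] >> 4) * 4u;
        if (doff < 20u || l4 + doff > len)
            return -1;
        payload = l4 + doff;
    } else if (proto == 17) {           /* UDP: fixed 8-byte header */
        if (l4 + 8u > len)
            return -1;
        payload = l4 + 8u;
    } else {
        return -1;
    }

    for (size_t i = payload; i < len; i++)
        frame[i] ^= key;

    return (long)(len - payload);
}
```

Because XOR is its own inverse, calling the function again with the same key restores the original payload, which mirrors the encrypt-on-one-side, decrypt-on-the-other role of two such devices. A real device would also have to recompute the TCP/UDP checksum after changing the payload, since the receiving host verifies it.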
