Linux implements Loopback-based NVI (NAT Virtual Interface)

Last Update:2013-11-15 Source: Internet

Author: User


Loopback is actually a hole

If it were not a hole, though, it could do something like Cisco's NVI. Making it "not a hole" means modifying kernel code, and before modifying anything you must understand why the Linux loopback interface is a hole in the first place.

The standard stipulates that any packet attempting to leave for a non-local destination through the loopback interface must be discarded, and Linux enforces this by making loopback a hole. Linux confines loopback traffic to the local machine. All loopback traffic must originate locally, so its route entry (loopback_dst) is attached in ip_output; when the IP receive routine later runs, the packet already carries a route entry and the routing table is not consulted. It follows that any packet that does reach the input route lookup was not sent by the local machine, so a rough check can be made there: if the source address is a local address, the packet is dropped. This is why a packet sent from the local machine cannot pass through the loopback interface and then on to the outside.

Can a packet that arrives from outside leave again through the loopback interface? The answer is also no. Consider the path: the packet enters from the physical network adapter -> is routed to the lo interface -> the loopback_dst route entry is attached to it -> the loopback interface transmits it -> the loopback interface simulates receiving it -> it enters the ip_input routing decision -> because a route entry is already attached, it is handled according to that entry. A route entry hands a packet on in one of two ways, and for a packet that came from outside the handler is ip_forward, which is called over and over until the TTL reaches 0. So a packet that enters loopback is either dropped outright or spins in this loop until it expires.
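
The two outcomes described above can be summarised in a toy model. This is a minimal Python sketch, not kernel code: the function name `loopback_fate` and its parameters are hypothetical, and it only mirrors the behaviour as this article describes it (a forwarder decrements the TTL on each pass and drops the packet once the TTL can no longer be decremented).

```python
def loopback_fate(src_is_local: bool, ttl: int) -> str:
    """Toy model: fate of a packet routed to lo for a non-local destination."""
    if src_is_local:
        # Locally originated traffic is rejected by the source-address check.
        return "dropped: source address is local"

    # External traffic keeps its loopback route entry, so every pass through
    # lo ends in another forward back to lo, costing one TTL each time.
    passes = 0
    while ttl > 1:
        ttl -= 1
        passes += 1
    return f"dropped: TTL expired after {passes} passes"


if __name__ == "__main__":
    print(loopback_fate(True, 64))
    print(loopback_fate(False, 64))
```

Either way the packet never leaves the box, which is what "hole" means here.
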
Next I will discuss how to break these constraints: first, how packets sent from the local machine can go out through loopback; then, how packets arriving from outside can go out through loopback; and finally, what problems arise once NAT is used and how to solve them for both scenarios, locally originated packets and externally originated packets.
1. Locally originated packets sent via loopback
Modifying the code is unavoidable here, because I am breaking a principle. Fortunately the change is small: identify the packets that are "sent elsewhere through loopback" and delete their associated route entry. The easiest place to do this is a Netfilter hook on PREROUTING. In addition, the local route that represents the local address is deleted from the local table and added to the main table as a unicast route. That way the reverse route lookup no longer matches a route in the local table (Linux requires the reverse route to be of type unicast), and the check passes.
2. Externally originated packets forwarded through loopback
In this case it is enough to delete the packet's loopback route entry association, after which the packet is forwarded normally. The source IP address of such a packet cannot be a local IP address, so the source check from case 1 is not a concern. For the flow's return traffic to get back to the original address, however, a reverse unicast route must exist.
3. NAT problems
Once SNAT is configured, everything depends on the SNAT address. If SNAT rewrites the source to a local address, the problem from case 1 appears, and the solution is again to delete that address from the local table. Deleting the address, however, causes ARP problems for other machines, so after deleting it you must explicitly announce the address with an ARP update. If SNAT rewrites the source to some other IP address, reverse reachability becomes the issue, because the next hop does not necessarily know how to reach that address.
4. NAT solution
The remaining NAT problem only exists when SNAT rewrites the source to another address, and there are two cases. In the first case the SNAT address is an unrelated address in some other network segment. Here you only need to configure a route to this address on the next hop, so that the flow's return packets come back to this box. In a simple environment the route can be configured manually; in a complex environment the SNAT address can be advertised through dynamic routing. In the second case the SNAT address is in the same network segment as the next hop. When a return packet reaches the next hop, its destination is the SNAT address, and because that address is in the same segment the next hop resolves it directly with ARP. You therefore need to add an ARP rewriting rule:
arptables -t mangle -A OUTPUT -d <next-hop gateway address> -j mangle --mangle-ip-s <SNAT address>
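
Which of the two SNAT cases applies can be decided from the addresses alone. The following is a minimal Python sketch using the standard `ipaddress` module; the function name `snat_case` and the example SNAT addresses are hypothetical, and only the gateway 192.168.40.254 comes from the configuration used later in this article.

```python
import ipaddress


def snat_case(snat_addr: str, next_hop_interface: str) -> str:
    """Classify an SNAT address relative to the next hop's segment.

    next_hop_interface is the next hop's address with prefix length,
    e.g. "192.168.40.254/24".
    """
    addr = ipaddress.ip_address(snat_addr)
    # strict=False lets us pass a host address and still get its network.
    segment = ipaddress.ip_network(next_hop_interface, strict=False)
    if addr in segment:
        return "same segment: the next hop will ARP for the SNAT address"
    return "other segment: the next hop needs a route to the SNAT address"


if __name__ == "__main__":
    print(snat_case("192.168.40.100", "192.168.40.254/24"))
    print(snat_case("10.9.9.9", "192.168.40.254/24"))
```
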
Now that the problems and their solutions are clear, we can begin. The goal of this article is to implement something similar to Cisco NVI, that is, a virtual network card that performs NAT while packets pass through it. Since something as handy as loopback already exists, I will not write a new virtual network card; it is simpler to use loopback to simulate one. The general process is as follows:
Packet enters from the physical NIC -> DNAT is executed -> routed to loopback -> SNAT is executed -> loopback interface -> policy routing -> sent out through the physical NIC
The route lookup is executed twice: the first lookup exists for NAT, and the second is the real routing decision.
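
The two-pass flow can be traced with a toy model. This is a minimal Python sketch under stated assumptions: `nvi_forward`, the dictionary-based NAT maps, and the `real_route` callback are all hypothetical stand-ins for the kernel's NAT and routing machinery, used only to show the order of the stages.

```python
def nvi_forward(pkt, dnat, snat, real_route):
    """Toy model of the two-pass flow; returns (packet, trace of stages)."""
    pkt = dict(pkt)  # work on a copy so the caller's packet is untouched
    trace = []

    # Pass 1: this lookup only exists so the packet crosses the NAT hooks.
    pkt["dst"] = dnat.get(pkt["dst"], pkt["dst"])
    trace.append("DNAT")
    trace.append("route 1 -> lo")
    pkt["src"] = snat.get(pkt["src"], pkt["src"])
    trace.append("SNAT")

    # The packet is looped back through lo and re-enters the stack.
    trace.append("lo")

    # Pass 2: the real routing decision, made on the translated addresses.
    pkt["oif"] = real_route(pkt)
    trace.append("route 2 -> " + pkt["oif"])
    return pkt, trace


if __name__ == "__main__":
    packet = {"src": "192.168.2.10", "dst": "192.168.40.7"}
    result, stages = nvi_forward(
        packet, {}, {"192.168.2.10": "192.168.40.249"}, lambda p: "eth2"
    )
    print(result)
    print(" | ".join(stages))
```

Because the second lookup sees the packet only after both translations, it can route on the final source and destination addresses.
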
Besides loopback, writing a veth-like virtual network card would be an even better choice:

Veth stands for Virtual ETHernet. It is a simple tunnel driver that works at the link layer and looks like a pair of ethernet devices interconnected with each other.

This is better than loopback: NVI can be implemented with essentially no changes to existing code, and the packet's original incoming interface is easy to obtain. The driver logic is very simple. A pair consists of a primary interface and a secondary interface, and a packet routed to the primary interface comes in again on the secondary interface. Note that the skb's receiving interface is not changed. This so-called route only serves to carry the packet "from the physical NIC to another NIC and have it received there". By that point PREROUTING and POSTROUTING have been completed, and the real route lookup can send the packet out from the other interface.
This time, I am not in a rush to write my own virtual network card. I will try loopback first. Now let's get started!
1. Modify the code:
Re-encapsulate the NF_INET_PRE_ROUTING hook function of the RAW table and call the following logic before calling ipt_hook:

// The check is a little reckless; normally you would design a proper matching algorithm.
if ((skb->dev->flags & IFF_LOOPBACK) && skb->nfct) {
    skb->nfct = &...;  /* the untracked conntrack entry, as NOTRACK sets it; the original expression is elided */
    skb->nfctinfo = IP_CT_NEW;
    nf_conntrack_get(skb->nfct);
    skb_dst_drop(skb);  /* discard the cached route so a second lookup can happen */
    return NF_ACCEPT;
}

This code means the following. A packet that enters from a physical NIC obviously has to be matched against the configured rules (such as NAT). Once that has been done and the packet has been steered into the loopback interface by a route, it must not go through conntrack again. The skb's nfct may already have been set, however, so the code marks the packet as untracked, as NOTRACK does, and discards the skb's cached route. This is how Linux IP routing treats loopback: if the output interface returned by the route lookup is the loopback interface, dst is set directly. The loopback xmit routine sends the packet and calls netif_rx to receive it again, and when it arrives at ip_rcv_finish no route lookup is needed because dst is already present. In that situation our second route lookup, which is in fact the policy routing lookup, could never take place, so the original dst must be dropped.
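
Why the cached dst has to be dropped can be shown with a small user-space model. This is a Python sketch, not kernel code: the dictionary-based skb and the `route_lookup` callback are assumptions made for illustration, and the two functions only borrow the names of the kernel steps they imitate.

```python
def raw_prerouting_hook(skb: dict) -> None:
    """Model of the modified hook: a tracked packet arriving on lo is
    marked untracked and loses its cached route."""
    if skb["dev"] == "lo" and skb.get("nfct") is not None:
        skb["nfct"] = "untracked"
        skb["dst"] = None  # stands in for skb_dst_drop()


def ip_rcv_finish(skb: dict, route_lookup) -> str:
    """Model of the shortcut: the tables are consulted only when no dst
    is attached to the packet."""
    if skb.get("dst") is None:
        skb["dst"] = route_lookup(skb)
    return skb["dst"]


if __name__ == "__main__":
    def policy_lookup(skb):
        return "eth2"  # hypothetical result of the second lookup

    stuck = {"dev": "lo", "nfct": "ct-entry", "dst": "lo"}
    print(ip_rcv_finish(stuck, policy_lookup))

    fixed = {"dev": "lo", "nfct": "ct-entry", "dst": "lo"}
    raw_prerouting_hook(fixed)
    print(ip_rcv_finish(fixed, policy_lookup))
```

Without the hook the packet keeps its loopback dst and the policy lookup is never reached; with the hook the lookup runs and selects a real egress interface.
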
2. Configure the IP addresses and Netfilter policy
IP addresses:
    link/ether 00:0c:29:90:66:c5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.249/24 brd 192.168.2.255 scope global eth1
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:90:66:cf brd ff:ff:ff:ff:ff:ff
    inet 192.168.40.249/24 scope global eth2
Note: Two NICs are in use. eth1 faces the internal network and eth2 faces the external network.

NAT table:
Note: The source IP address of every packet initiated from the internal side is translated to this machine's IP address.
RAW table:
-A PREROUTING -i lo -j MARK --set-xmark 100
Note: A packet entering from the lo interface has already completed the NAT driven by the first route lookup. Marking it lets the subsequent routing logic recognise that this is the second, real route lookup.

Policy:

32764: from all fwmark 0x64 lookup loop
32766: from all lookup main
32767: from all lookup default
Note: This rule matches an fwmark. The mark is set in the RAW table and distinguishes the first route lookup from the second: the first lookup exists to carry out NAT and all conntrack-related operations, and the second performs the real IP routing.
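
How these rules steer a lookup can be sketched in a few lines. This is a simplified Python model, not the kernel's algorithm: the `RULES` list copies the three rules shown above, the function name `tables_to_consult` is hypothetical, and the model only reports the order in which tables would be tried. Note that the mark 100 set in the RAW table is the same value as the 0x64 shown in the rule.

```python
# (priority, required fwmark or None for "match everything", table name)
RULES = [
    (32764, 0x64, "loop"),
    (32766, None, "main"),
    (32767, None, "default"),
]


def tables_to_consult(fwmark: int) -> list:
    """Return the route tables tried, in priority order, for a packet
    carrying the given fwmark."""
    ordered = sorted(RULES, key=lambda rule: rule[0])
    return [table for _prio, mark, table in ordered
            if mark is None or mark == fwmark]


if __name__ == "__main__":
    print(tables_to_consult(0))    # first pass: unmarked packet
    print(tables_to_consult(100))  # second pass: marked on entry from lo
```

An unmarked packet skips the loop table and falls through to main, while a marked packet reaches the loop table first.
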

Main route table:
Note: The main route table contains nothing except a default route that sends everything out through the loopback interface. As mentioned above, this is the first, "NAT-related" route lookup, whose only purpose is to take the packet through PREROUTING and POSTROUTING.

Policy route table loop:
192.168.2.0/24 dev eth1 scope link src 192.168.2.249
192.168.40.0/24 dev eth2 scope link
default via 192.168.40.254 dev eth2
Note: I moved everything that was in the main route table into the policy route table, including the directly connected link-layer routes. I want the box's addresses to serve only as auxiliary IP addresses for IP routing and NAT, so even directly connected hosts must not be reached through a connected route on the first lookup. Because no connected route exists in the main table, directly connected traffic also enters loopback to complete the first routing pass. If a packet carries mark 100, the loop route table is consulted; as described above, this is the second, "real" route lookup. Source-based policy routing can be used for this lookup, because source address translation has already been completed.
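
The second lookup against the loop table is an ordinary longest-prefix match. The following is a minimal Python sketch using the standard `ipaddress` module; `LOOP_TABLE` copies the three routes listed above, while the function name `lookup` and the destination addresses in the example are hypothetical.

```python
import ipaddress

# (prefix, output device, gateway or None for a directly connected route)
LOOP_TABLE = [
    ("192.168.2.0/24", "eth1", None),
    ("192.168.40.0/24", "eth2", None),
    ("0.0.0.0/0", "eth2", "192.168.40.254"),  # the default route
]


def lookup(dst: str):
    """Return (device, gateway) of the most specific matching route."""
    addr = ipaddress.ip_address(dst)
    matches = []
    for prefix, dev, gateway in LOOP_TABLE:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matches.append((net.prefixlen, dev, gateway))
    # The default route always matches, so matches is never empty here.
    _length, dev, gateway = max(matches, key=lambda m: m[0])
    return dev, gateway


if __name__ == "__main__":
    for destination in ("192.168.2.10", "192.168.40.7", "8.8.8.8"):
        print(destination, lookup(destination))
```
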

Local route table:
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
Note: The addresses involved in SNAT must be deleted from the local table. IP routing does not allow packets that appear to originate from the local machine to be sent elsewhere through the loopback interface, so if SNAT rewrites the source address of a forwarded packet to one of the machine's own IP addresses, reverse route verification fails. At bottom this is still a consequence of how Linux handles loopback traffic: the protocol stack attaches the input route entry at output time, so such traffic never reaches ip_route_input, and loopback traffic is therefore never allowed to be forwarded. Removing the IP address of a physical NIC from the local table has a side effect, though: the ARP logic no longer replies to ARP requests for that address, because removing an address from the local table amounts to giving up ownership of it. There are many ways around this "Next Hop Resolution Protocol" problem, such as static entries, a dedicated proxy, or arping.

Defects: Although this implementation provides the NVI function, it has drawbacks. For example, once a packet has passed through the loopback interface, the information about its original inbound interface is lost, and complex conntrack rules are needed to recover it through an IP-to-fwmark mapping.

In this example, the IP addresses configured on the internal and external interfaces are used purely for routing and no longer belong to the local machine. You cannot expect to reach the box itself through these addresses: their local routes have been deleted from the local table, so they are no longer treated as host addresses. In addition, the box does not reply to ARP requests for the IP addresses configured on its physical NICs, because Linux replies to an ARP request only if the requested IP address has a route in the local table.

Once again we see how freely Linux can be modified and customised, even to the point of breaking accepted principles. The precondition is that you know what the consequences of breaking them will be, and that you have sufficient reason to do it and no choice but to do it; perhaps you simply have not found a better way yet. In any case, doing something freely always brings plenty of gains!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
