Several virtual NICs related to Linux network virtualization: VETH/MACVLAN/MACVTAP/IPVLAN
Linux NIC drivers include many "virtual NICs". Earlier articles analyzed virtual NICs such as tun and ifb in detail. The idea here is similar: as virtualization has become a trend, the Linux source tree keeps adding support for "network virtualization", not only to support virtual machine technology but also to give users and programmers more choices.
The technologies that support network virtualization range from heavyweight ones, such as support for virtual machines, to lightweight ones, such as net namespaces. My recent work is based on net namespaces, so I will not say much about the technology here. In short, it gives each namespace its own protocol stack and NICs, while everything other than the protocol stack and NICs is shared by all namespaces. This lightweight, network-oriented virtualization is especially useful for simulating network connections from many clients, and it is easy to operate. I will write a separate article to demonstrate it.
If I only wanted to get the work done, I would not be writing this article. I wrote an article about net namespaces as early as last year; following it step by step is enough to complete the work, and we did exactly that from the end of last year to the beginning of this year. Learning, however, is a different matter. Many people know that learning on the job is not as easy as it was in school: we no longer have large blocks of time to study systematically. This is especially true for an ordinary passer-by like me, married with children, with a loan to repay and no longer free to do as I please. So we need to feel, about each technology we run into, that we should have met it sooner; that feeling is what gives us the motivation to understand it thoroughly.
In this article I want to introduce, with the help of several figures, a few common types of virtual NICs related to network virtualization in Linux. Their use is of course not limited to net namespaces; heavyweight virtual machines can use them too. I use net namespaces for the examples only because they are simple. In general, the principles of these virtual NICs are fixed; the scenarios in which you use them depend on your imagination.
Network virtualization
In general, "network virtualization" in this article means network virtualization on the host: running multiple separate TCP/IP protocol stacks inside one physical host. Network virtualization can be implemented on its own or as part of another technology. In Linux, the standalone implementation is the net namespace, while the implementation that rides on another technology is the virtual machine. Each virtual machine has its own protocol stack, of course, but network virtualization based on virtual machines may actually be simpler, because the host does not need to "implement" a protocol stack. That task is handed to the operating system inside the virtual machine, and the host simply "trusts" that the virtual machine runs an operating system with a protocol stack.
To understand what a virtual NIC is for, remember that a NIC is a door, an interface: its upper side is generally connected to the protocol stack and its lower side is generally connected to the medium. The most important thing is to know exactly what sits above it and what sits below it.
What sits above a NIC is implemented in the OS, or in user space using PF technology. Either way it is software, which means you can implement it however you like. The lower side, by contrast, is not under the control of the software running on the machine: you cannot change a twisted pair through software, can you? So we usually care about what the lower side of a NIC is connected to. Call it the endpoint. Before getting to the main text, here are several common endpoints:
Ethernet ETHx: ordinary twisted pair or optical fiber;
TUN/TAP: a character device that user space operates through a file handle;
IFB: a redirect operation back to the original NIC;
VETH: triggers RX on the peer of the virtual NIC;
VTI: an encryption engine;
...
There are many ways to "route" data between the host NIC and a virtual NIC (routing in the broad sense). In early kernels, support for bridge (the Linux bridge is also a virtual NIC) was implemented through a br_handle_frame_hook hook hard-coded into netif_receive_skb and registered by the bridge module. As the types of virtual NICs multiplied, however, hard-coding one hook per type was no longer an option, since it would make netif_receive_skb far too bloated. A new approach was therefore introduced, and it is simple: lift the hook up one level of abstraction. Instead of hard-coded hooks, netif_receive_skb now calls a single rx_handler hook. How that hook is set depends on the type of virtual NIC bound to the host NIC, for example:
For bridge: netdev_rx_handler_register(dev, br_handle_frame, p) is called, so netif_receive_skb calls br_handle_frame;
For bonding: netdev_rx_handler_register(slave_dev, bond_handle_frame, new_slave) is called, so netif_receive_skb calls bond_handle_frame;
For MACVLAN: netdev_rx_handler_register(dev, macvlan_handle_frame, port) is called, so netif_receive_skb calls macvlan_handle_frame;
For IPVLAN: netdev_rx_handler_register(dev, ipvlan_handle_frame, port) is called, so netif_receive_skb calls ipvlan_handle_frame;
For...
Each host NIC can register only one rx_handler, but NICs can be stacked on top of one another.
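The single-hook idea can be sketched with a toy model. This is not kernel code: the class and function names below (NetDevice, rx_handler_register, receive, and the two toy handlers) are made up for illustration, and the only behavior modeled is "one rx_handler per device, later registrations are refused with a busy error".

```python
import errno


class NetDevice:
    """Toy stand-in for a host NIC; not the kernel's struct net_device."""

    def __init__(self, name):
        self.name = name
        self.rx_handler = None        # at most one handler per device
        self.rx_handler_data = None   # opaque data passed at registration


def rx_handler_register(dev, handler, data):
    """Attach the single rx_handler; refuse if one is already attached."""
    if dev.rx_handler is not None:
        return -errno.EBUSY
    dev.rx_handler = handler
    dev.rx_handler_data = data
    return 0


def receive(dev, frame):
    """Toy receive path: call the registered hook, else use the local stack."""
    if dev.rx_handler is not None:
        return dev.rx_handler(dev, frame)
    return f"{dev.name}: local stack got {frame}"


def toy_macvlan_handle_frame(dev, frame):
    return f"{dev.name}: macvlan port {dev.rx_handler_data} got {frame}"


def toy_bridge_handle_frame(dev, frame):
    return f"{dev.name}: bridge port {dev.rx_handler_data} got {frame}"


eth0 = NetDevice("eth0")
first = rx_handler_register(eth0, toy_macvlan_handle_frame, "port0")
second = rx_handler_register(eth0, toy_bridge_handle_frame, "br0")
result = receive(eth0, "frame-1")
print(first, second, result)
```

In this sketch the second registration fails, which is why a host NIC cannot be, say, a bridge port and a MACVLAN parent at the same time, while a virtual NIC created on top of it can still carry a handler of its own.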
VETH virtual NIC technology
I mentioned this virtual NIC in "OpenVPN multi-processing-netns container and iptables CLUSTER". VETH NICs always come in pairs of Ethernet cards. Apart from xmit, which differs from a conventional Ethernet driver, a VETH NIC is almost a standard Ethernet card. Because VETH NICs come in pairs, we call one the peer of the other, which is also the standard term. Its xmit is implemented by sending the data to its peer and triggering the peer's RX. So the question is: how does data get out of a VETH NIC to the outside world? The answer depends on what you need:
1. If you really need to send data to the outside, bridge one VETH NIC with an ordinary ETHx NIC and let the bridge logic forward the data to ETHx;
2. Do the packets have to go outside at all? Can't they simply be sent and received locally, like loopback? With VETH, packets can travel from one net namespace to another on the same machine simply and privately, without being sniffed.
The VETH virtual NIC is very simple. The schematic is as follows:
VETH connects different net namespaces in a primitive, simple, UNIX-style way. It therefore has to be combined with other technologies or tools to complete the isolation of net namespaces and the delivery of data.
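The "xmit on one end triggers RX on the peer" behavior can be illustrated with a small model. It is only a sketch: the Veth class and make_veth_pair helper are hypothetical names, and the namespace is just a label here, not a real net namespace.

```python
class Veth:
    """Toy model of one end of a VETH pair."""

    def __init__(self, name, namespace):
        self.name = name
        self.namespace = namespace   # label only; no real namespace involved
        self.peer = None
        self.rx_log = []

    def xmit(self, frame):
        # Transmitting means handing the frame to the peer's receive path.
        if self.peer is None:
            raise RuntimeError(f"{self.name} has no peer")
        self.peer.rx(frame)

    def rx(self, frame):
        self.rx_log.append(frame)


def make_veth_pair(name_a, ns_a, name_b, ns_b):
    """Create two ends that are each other's peer."""
    a, b = Veth(name_a, ns_a), Veth(name_b, ns_b)
    a.peer, b.peer = b, a
    return a, b


v1, veth1 = make_veth_pair("v1", "ns1", "veth1", "host")
v1.xmit("ping")      # sent inside ns1, received on the host side
veth1.xmit("pong")   # and the other way round
print(veth1.rx_log, v1.rx_log)
```

Nothing in the model ever reaches a physical medium, which is why a bridge (or some other forwarding mechanism) is needed as soon as the traffic has to leave the machine.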
MACVLAN virtual NIC technology
MACVLAN is an extremely simple way to virtualize one Ethernet card into several. An Ethernet card must have a MAC address; that is the essence of an Ethernet card.
Previously we could only add multiple IP addresses to an Ethernet card, not multiple MAC addresses, because a MAC address identifies a card by being globally unique. Even if you create ethx:y aliases, all of these "NICs" have the same MAC address as ethx. In essence they are still one NIC, which prevents you from doing many layer-2 operations. With MACVLAN, you can.
Let's look at how MACVLAN works:
In practice, you can create a MACVLAN NIC on top of eth0 with the following command:
ip link add link eth0 name macv1 type macvlan
You can picture it as someone physically splitting the twisted pair in two and attaching the two ends to two NICs, one of which is the virtual MACVLAN NIC. But if the medium is now shared, doesn't CSMA/CD have to run? No: in reality all data still leaves through eth0, and modern Ethernet cards work in full-duplex mode. As long as the switch is full duplex too (some standards require this), eth0 takes care of everything by itself.
Now for the modes of virtual NICs built with MACVLAN. MACVLAN has modes because, compared with VETH, it piles complexity onto an Ethernet concept that already has little room to spare: more elements interact, and the different relationships among them lead to different MACVLAN behavior. Again, figures help:
1. bridge mode
This bridging applies only to communication among the MACVLAN NICs on the same host Ethernet card and the host NIC; it has nothing to do with external communication. "Bridge" here means that traffic can be forwarded directly between these NICs without outside help. That is a bit like the bridge built into a Linux box, i.e. everything you do with the brctl command.
2. VEPA mode
I will explain VEPA itself later. For now: in VEPA mode, even if MACVLANeth1 and MACVLANeth2 are both configured on eth0, they cannot communicate directly. They need the help of the external switch attached to eth0, usually one that supports "hairpin" forwarding.
3. private mode
Private mode isolates more strongly than VEPA. In private mode, even if MACVLANeth1 and MACVLANeth2 are both configured on eth0, eth0 is attached to an external switch S, and S supports "hairpin" forwarding, the broadcast/multicast traffic of MACVLANeth1 still cannot reach MACVLANeth2, and vice versa. Broadcast is the thing to isolate because Ethernet is built on broadcast: once broadcast is cut off, Ethernet has nothing left to rely on.
To select the MACVLAN mode, append the mode parameter to the ip link command:
ip link add link eth0 name macv1 type macvlan mode { bridge | vepa | private }
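The three modes can be summarized as a small decision table. The function below is only a restatement of the descriptions above for a broadcast frame sent from one MACVLAN NIC to another on the same eth0; the name broadcast_path and its return strings are made up for illustration and do not correspond to any kernel interface.

```python
def broadcast_path(mode, switch_hairpin):
    """How a broadcast frame from macv1 reaches macv2 (both on eth0).

    mode           -- "bridge", "vepa" or "private"
    switch_hairpin -- True if the external switch supports hairpin forwarding
    """
    if mode == "bridge":
        # Forwarded between the MACVLAN NICs without external help.
        return "direct"
    if mode == "vepa":
        # Must leave through eth0 and be reflected by the external switch.
        return "via external switch" if switch_hairpin else "unreachable"
    if mode == "private":
        # Isolated even when the switch reflects the frame.
        return "unreachable"
    raise ValueError(f"unknown macvlan mode: {mode}")


for mode in ("bridge", "vepa", "private"):
    for hairpin in (False, True):
        print(f"{mode:8} hairpin={hairpin!s:5} -> {broadcast_path(mode, hairpin)}")
```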
Differences between a VETH NIC and a MACVLAN NIC
Let's first look at how each one is used to configure an independent net namespace.
1. VETH
ip netns add ns1
ip link add v1 type veth peer name veth1
ip link set v1 netns ns1
brctl addbr br0
brctl addif br0 eth0
brctl addif br0 veth1
ifconfig br0 192.168.0.1/16
2. MACVLAN
ip link add link eth0 name macv1 type macvlan
ip link set macv1 netns ns1
As you can see, MACVLAN is simpler to set up than VETH. What about efficiency? The Linux bridge is implemented in software and has to look up a hash table for every frame; the same is true of MACVLAN bridge mode. VEPA and private modes, however, simply send the data straight out. The difference can be shown as follows:
What is VEPA technology?
VEPA stands for Virtual Ethernet Port Aggregator. It is HP's answer to Cisco's VN-Tag in the field of virtualization support, so VN-Tag and VEPA aim at the same problem, or the same class of problems. Which problem? In plain terms, network communication between virtual machines, especially between virtual machines on the same host.
Hasn't this problem been solved already? I use VMWare to create several virtual machines on my PC, and even with the PC's network cable unplugged they can still talk to each other, because VMWare contains a vSwitch. In other words, nearly every virtual machine technology has built-in interconnection that handles communication between virtual machines. So what are VN-Tag and VEPA for?
The issue involves two things: scalability and the boundary of responsibility. Put differently, is a built-in vSwitch really good enough in performance and functionality? It is a fringe product for virtual machine vendors, not even a standalone product: it usually ships as an accessory to the virtual machine software and has no sales or profit model of its own. Vendors build it in only so that users can experience communication between virtual machines, so they have no incentive to perfect the built-in virtual switch or virtual router. What they are pushing is the virtual machine software.
In addition, the boundary of responsibility between network administrators and system administrators had been clear for ages, until the virtualization era arrived. If a built-in vSwitch is used, whom do you call when the switch fails or needs a complicated configuration plan? This virtual switch lives inside the host server, which is the system administrator's domain. Network administrators generally cannot touch such devices, and the strict separation-of-powers management model of a data center does not let them log on to servers. Conversely, system administrators understand network protocols far less well than professional network administrators do. This leaves the virtual network devices built into virtual machine software in an awkward position. On top of that, such a device really is not a very professional piece of network equipment. Boom!
Cisco lives up to its reputation as a networking giant: it is always the first to propose a standard in this kind of awkward situation. It modified the Ethernet protocol and launched VN-Tag, much as ISL relates to 802.1Q. VN-Tag adds a brand new field to the standard protocol header, which presupposes that Cisco is able to bring supporting equipment to market and get it running quickly. Now look at HP's counterattack. HP does not have Cisco's capabilities, so it does not modify the protocol header; it solves the problem by modifying protocol behavior instead. Although it came a step later than Cisco, the VEPA that HP proposed is the more open approach, and Linux can add support for it easily.
VEPA is very simple: a packet enters a switch through one port and is sent back out through that same port. This may look pointless, but it does not change the Ethernet protocol header. In the ordinary case it really is pointless, because one NIC is normally attached to one cable, and data a host sends to itself never reaches the NIC: on Linux it is short-circuited through loopback. In virtualization scenarios things are different. A physical host may have only one Ethernet card, but the packets leaving that card do not necessarily come from the same protocol stack. They may come from different virtual machines or different net namespaces (Linux only), because inside an OS that supports virtualization one physical NIC is virtualized into several virtual NICs, each belonging to one virtual machine... In this situation, if you neither modify the Ethernet protocol header nor have a built-in vSwitch, an external switch must help with forwarding. Typically, a packet received on a switch port is sent back out of that same port, and the host NIC decides whether to accept it and how. As shown in the figure:
For the Ethernet card, no hardware change is needed; modifying the software driver is enough. For the switch, very little has to change: when the MAC/port mapping table lookup fails, it only needs to broadcast the packet to all ports, including the ingress port. STP is handled similarly. For HP, releasing VEPA was the right choice, because unlike Cisco and Intel it cannot mass-produce NICs and devices and thereby control hardware standards. A switch that supports VEPA only needs to support a "hairpin" mode. Boom!
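The small change required on the switch side can be sketched as a toy learning switch with a hairpin flag. This is an illustrative model only, not any vendor's implementation: the Switch class, its forward method and the MAC strings are all made up, and STP is not modeled.

```python
class Switch:
    """Toy learning switch; hairpin=True allows sending out the ingress port."""

    def __init__(self, ports, hairpin=False):
        self.ports = list(ports)
        self.hairpin = hairpin
        self.mac_table = {}          # MAC address -> port it was learned on

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port          # learn the source
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Lookup failed: flood. Only a hairpin switch includes the ingress port.
            return [p for p in self.ports if self.hairpin or p != in_port]
        if out_port == in_port and not self.hairpin:
            # An ordinary switch never reflects a frame to where it came from.
            return []
        return [out_port]


plain = Switch([1, 2, 3])
vepa = Switch([1, 2, 3], hairpin=True)

# Two virtual NICs "aa" and "bb" both sit behind the host on port 1.
plain_flood = plain.forward(1, "aa", "bb")
plain_known = plain.forward(1, "bb", "aa")
vepa_flood = vepa.forward(1, "aa", "bb")
vepa_known = vepa.forward(1, "bb", "aa")
print(plain_flood, plain_known, vepa_flood, vepa_known)
```

With the ordinary switch, traffic between the two virtual NICs behind port 1 never comes back; with hairpin enabled it is reflected to port 1, where the host NIC decides which virtual NIC receives it.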
In this section, we will take a look at IPVLAN. Once you understand MACVLAN, IPVLAN is easy to understand. The difference between IPVLAN and MACVLAN is that IPVLAN separates traffic at the IP layer rather than by MAC address. As a result, all IPVLAN virtual NICs share the same MAC address, because the host Ethernet card does not use the MAC address to steer traffic to an IPVLAN virtual NIC. The process is shown in the figure:
The command for creating an IPVLAN NIC is as follows:
ip link add link <master-dev> name <slave-dev> type ipvlan mode { l2 | l3 }
An IPVLAN virtual NIC is placed into an independent net namespace the same way as a MACVLAN one. But how do you choose between the two? Fortunately, IPVLAN has a document in the Linux source tree, so I will simply quote it:
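The difference in how the two technologies pick the receiving virtual NIC can be sketched with two lookup tables. The addresses, device names and function names below are invented for the example, and the real drivers do considerably more than a dictionary lookup.

```python
HOST_MAC = "02:00:00:00:00:01"   # made-up MAC address of the host NIC eth0

# MACVLAN: every virtual NIC has its own MAC, so the key is the destination MAC.
macvlan_table = {
    "02:00:00:00:00:11": "macv1",
    "02:00:00:00:00:12": "macv2",
}

# IPVLAN: every virtual NIC shares HOST_MAC, so the key is the destination IP.
ipvlan_table = {
    "192.168.0.11": "ipv1",
    "192.168.0.12": "ipv2",
}


def macvlan_demux(dst_mac, dst_ip):
    """Pick the receiving MACVLAN NIC by destination MAC; the IP is ignored."""
    return macvlan_table.get(dst_mac)


def ipvlan_demux(dst_mac, dst_ip):
    """Pick the receiving IPVLAN NIC by destination IP; the MAC is always the host's."""
    if dst_mac != HOST_MAC:
        return None
    return ipvlan_table.get(dst_ip)


print(macvlan_demux("02:00:00:00:00:12", "192.168.0.99"))
print(ipvlan_demux(HOST_MAC, "192.168.0.12"))
```

Because the external switch only ever sees HOST_MAC in the IPVLAN case, the model also hints at why IPVLAN suits a switch port that is limited to one MAC address.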
4.1 L2 mode: In this mode TX processing happens on the stack instance attached to the slave device and packets are switched and queued to the master device to send out. In this mode the slaves will RX/TX multicast and broadcast (if applicable) as well.
4.2 L3 mode: In this mode TX processing upto L3 happens on the stack instance attached to the slave device and packets are switched to the stack instance of the master device for the L2 processing and routing from that instance will be used before packets are queued on the outbound device. In this mode the slaves will not receive nor can send multicast/broadcast traffic.
5. What to choose (macvlan vs. ipvlan)? These two devices are very similar in many regards and the specific use case could very well define which device to choose. If one of the following situations defines your use case then you can choose to use ipvlan:
(a) The Linux host that is connected to the external switch/router has policy configured that allows only one mac per port.
(b) No of virtual devices created on a master exceed the mac capacity and puts the NIC in promiscuous mode and degraded performance is a concern.
(c) If the slave device is to be put into the hostile/untrusted network namespace where L2 on the slave could be changed/misused.
MACVTAP virtual NIC technology
This is the last virtual NIC covered in this article. Why does such a NIC exist? Let's start with the problem.
Suppose a virtual machine or emulator is implemented in user space: how does it emulate a NIC for the OS it runs? Or suppose we have implemented a user-space protocol stack that is completely independent of the kernel protocol stack (you can think of the two as two net namespaces): how do we route traffic from the physical NIC into user space? Conversely, how do we route data from the user-space protocol stack out of the box? Following the usual line of thought, we know that the endpoint of a TAP NIC is a character device that user space can access. OpenVPN uses it, and so do many lightweight user-space protocol stacks. That leads to the following solution:
Once again we need the "omnipotent bridge". How troublesome, how sad.
Just as MACVLAN replaces VETH + Bridge, MACVTAP can replace TAP + Bridge. It only takes a small change to rx_handler: after the host Ethernet card receives a packet, the packet is not handed to the protocol stack attached above the MACVLAN virtual NIC but is placed on the queue of a character device. That's MACVTAP!
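The difference between the two receive paths can be sketched as follows. This is a toy model with made-up class names: a Python list stands in for the kernel protocol stack and a deque stands in for the character device queue that a user-space program would read from.

```python
from collections import deque


class ToyMacvlan:
    """Received frames go up to the protocol stack attached to the virtual NIC."""

    def __init__(self):
        self.stack_rx = []

    def deliver(self, frame):
        self.stack_rx.append(frame)


class ToyMacvtap:
    """Received frames are queued for user space instead of entering a stack."""

    def __init__(self):
        self.queue = deque()

    def deliver(self, frame):
        self.queue.append(frame)

    def read(self):
        # What a user-space reader of the character device would get next.
        return self.queue.popleft() if self.queue else None


macv1, macvtap1 = ToyMacvlan(), ToyMacvtap()
for nic in (macv1, macvtap1):
    nic.deliver("frame-1")

print(macv1.stack_rx, macvtap1.read(), macvtap1.read())
```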
Unfortunately, the multi-queue TUN/TAP virtual NIC that I put together in 2014 really only involved some porting and modification work, and once I discovered MACVTAP my version was instantly outclassed. What a pity! What one once delighted in becomes, in the blink of an eye, a thing of the past.