Mapping between virtual networks and real networks on Linux

Source: Internet
Author: User
Tags passthrough virtual environment
Network problems in a virtualized environment

In a cloud computing environment that provides IaaS services, each user can obtain a virtual computer, and these virtual machines run densely packed on backend server clusters. One characteristic of virtual machines is that they give users an experience similar to that of physical machines, and physical machines in the real world can be networked in a variety of topologies. How to create, easily and quickly, the same networks in a virtual environment as in the real world has therefore become a new challenge.

Figure 1. Physical network mapping example

Figure 1 shows an example of the network mapping problem. The left side of the figure is a common network environment in the real world: four PCs form two subnets through their respective physical NICs. PCs in different subnets cannot communicate by default; in other words, the two subnets are physically isolated. The right side of Figure 1 shows the scenario in a virtualized environment: four virtual machines run on one physical host at the same time, and they must be divided into two isolated subnets just like the real environment on the left. How to achieve this, that is, how to easily create a network environment like the one on the left of Figure 1, is a problem that virtualization must solve.

Main methods for simulating networks in a virtualized environment

For ease of understanding, this article divides the methods of simulating a real network in a virtualized environment into two types: those that use traditional network technology and those that use virtual network extension technology. Traditional network technology refers to the Ethernet networks that already existed in the real world before virtualization became popular, including traditional IP networks and 802.1Q VLAN networks. Linux supports these technologies well, so the real-world network can be simulated by configuring the corresponding Linux devices. Virtual network extension technology refers to the newer network technologies that emerged to address the challenges posed by cloud computing and virtualized environments, including 802.1Qbg and 802.1Qbh.

Network model description in the virtualization environment

Figure 2. Network models used in this article

The network elements used in this article are listed here for ease of reading. The left column of the figure shows the network elements of the real world: computer terminals, L2 switches, routers, gateways, switches supporting 802.1Q VLAN, L3 switches, physical NICs, and switches supporting Hairpin mode. The figure also lists the elements of the virtualized environment: virtual machines, the Linux Bridge, the Linux routing table, Linux iptables, and the Host. A brown dotted box indicates an Ethernet broadcast domain, and a black dotted box indicates a physical binding relationship. The right column shows the network device models in Linux: the TAP device, the VETH device, the MACVLAN device working in VEPA mode, the MACVLAN device working in Bridge mode, the MACVLAN device working in Passthrough mode, the SR-IOV VF device, and the VLAN device. They are described below.

Introduction to the network elements used on the Linux Host side with traditional network technologies

With traditional network technology, Linux uses the following device models: Bridge, TAP, VETH, and VLAN. A Bridge device is a kernel-based L2 data-switching device that functions like a layer-2 switch in the real world. A TAP device is a point-to-point network device that works at layer 2. Each TAP device has a corresponding Linux character device; a user program can read from and write to this character device to exchange data with the Linux kernel network protocol stack, which is why TAP devices are often used by emulators in virtualized environments. A VETH device is a paired point-to-point network device: data sent into one end comes out of the other end. It is usually used to change the direction of data or to connect other network devices. VLAN devices are a group of devices in a parent-child relationship. They are the Linux implementation of part of the 802.1Q VLAN technology and mainly process 802.1Q VLAN tags.
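As a rough illustration of the four device models, the following Python sketch builds the iproute2 command lines that would create them. It only assembles strings and runs nothing; the device names (br0, tap0, veth0, veth1, eth0) and the helper function names are hypothetical and do not come from the original article.

```python
from typing import List

# Sketch only: these helpers build iproute2 command lines as strings and
# execute nothing. All device names used below are hypothetical.


def bridge_with_tap(bridge: str, tap: str) -> List[str]:
    """Create a Bridge and a TAP device, then plug the TAP into the Bridge."""
    return [
        f"ip link add name {bridge} type bridge",
        f"ip tuntap add dev {tap} mode tap",
        f"ip link set dev {tap} master {bridge}",
        f"ip link set dev {bridge} up",
        f"ip link set dev {tap} up",
    ]


def veth_pair(end_a: str, end_b: str) -> List[str]:
    """Create a VETH pair; frames sent into one end come out of the other."""
    return [f"ip link add {end_a} type veth peer name {end_b}"]


def vlan_child(parent: str, vlan_id: int) -> List[str]:
    """Create a VLAN child device that handles the 802.1Q tag for vlan_id."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable 802.1Q VLAN IDs are 1 to 4094")
    return [f"ip link add link {parent} name {parent}.{vlan_id} type vlan id {vlan_id}"]


if __name__ == "__main__":
    commands = (
        bridge_with_tap("br0", "tap0")
        + veth_pair("veth0", "veth1")
        + vlan_child("eth0", 10)
    )
    for cmd in commands:
        print(cmd)
```

Running the printed commands on a real Host requires root privileges, so it is worth reviewing them before pasting them into a shell.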

Figure 3. Traditional Ethernet network in the real world

This figure shows a typical traditional Ethernet structure, referred to below as Network A. Five terminals connect to the access-layer switch through their respective NICs. The switch connects through its aggregation port to a second-level switch, which in turn connects to a gateway router. The router forwards data to the external network through NAT (Network Address Translation), forming a private network that is closed yet can reach the Internet and occupies only one public IP address. All terminals sit under the same second-level switch, so according to the Ethernet protocol, layer-2 broadcast packets spread throughout the whole network, which creates a potential risk of broadcast storms. Similar network structures are widely used by companies, residential communities, and home users.

Figure 4. Virtual network A_V0

This is a relatively faithful simulation of Network A under virtualization. Four virtual machines are connected to the access-layer Bridge device through TAP devices, the access-layer Bridge device is connected to the second-level Bridge device through a pair of VETH devices, and the host is also connected to the second-level Bridge device through a pair of VETH devices. The second-level Bridge device then forms a data-forwarding relationship with the physical NIC through the Linux routing table and iptables, and finally connects to the external physical network. The elements in this figure correspond almost one to one to the elements in Network A: a Bridge is equivalent to a layer-2 switch in the real world, and a VETH device is equivalent to a network cable plugged into the Bridge. The virtual machines see the same network as the physical machines in Network A, and the broadcast domain includes all virtual user terminals. In general, however, a virtual machine does not need the second-level Bridge to exist; it only needs the data-forwarding function. To improve efficiency, the virtual network configuration is therefore often changed to retain only the core functions.

Figure 5. Virtual network A_V1

This is a common network configuration in virtualized environments. Compared with A_V0, it no longer maps Network A one to one: the second-level Bridge and the VETH devices are left out. The virtual machines can still reach the Internet through the virtual gateway, but they can no longer perceive a second-level Bridge. Because of its efficiency, this combination of a single level of Bridge and NAT is the default virtual network of Libvirt. The Bridge device in the figure always has a TAP device with a MAC address of 52:xx:xx attached. This is because the Bridge implementation in the Linux kernel has a defect: when a newly added device has the smallest MAC address on the Bridge, MAC learning interrupts the Bridge's work. A device with a very small MAC value, 52:xx:xx, is therefore created in advance to work around the problem. Because there are two subnets in the figure (192.168.1.0 and 192.168.2.0), two Bridge devices are used to separate the two broadcast domains; this is unavoidable without 802.1Q VLAN.
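A single-Bridge NAT network of this kind could be assembled roughly as in the sketch below. This is not how Libvirt itself sets up its default network; it is a hand-written approximation that only builds command strings. The /24 prefix length, the gateway address 192.168.1.1, and the names br1 and eth0 are assumptions made for the example.

```python
import ipaddress
from typing import List

# Sketch only: builds command strings for one Bridge plus NAT and runs nothing.
# The bridge name, uplink name, gateway address, and /24 prefix are assumptions.


def nat_bridge(bridge: str, gateway_cidr: str, uplink: str) -> List[str]:
    """Commands for a Bridge that acts as gateway and is NATed behind uplink."""
    iface = ipaddress.ip_interface(gateway_cidr)
    subnet = iface.network  # e.g. 192.168.1.0/24 for gateway 192.168.1.1/24
    return [
        f"ip link add name {bridge} type bridge",
        f"ip addr add {iface.with_prefixlen} dev {bridge}",
        f"ip link set dev {bridge} up",
        # Let the Host forward packets between the Bridge and the uplink.
        "sysctl -w net.ipv4.ip_forward=1",
        # Rewrite the source address of traffic leaving through the uplink.
        f"iptables -t nat -A POSTROUTING -s {subnet} -o {uplink} -j MASQUERADE",
    ]


if __name__ == "__main__":
    for cmd in nat_bridge("br1", "192.168.1.1/24", "eth0"):
        print(cmd)
```

A second call with a different bridge name and `192.168.2.1/24` would give the second broadcast domain shown in the figure.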

Simulate 802.1Q VLAN Ethernet

Before virtualization technology became popular, the communications industry had already developed the 802.1Q VLAN standard to solve the problem of confused broadcast domains in complex network environments. 802.1Q VLAN technology separates logical subnets from the physical network: terminals physically connected to the same switch can belong to different logical subnets, and terminals in different logical subnets are isolated from each other, which resolves the broadcast domain confusion described above.
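The isolation effect can be illustrated with a toy model of a switch whose ports are each assigned to one VLAN. The class below is a simplification written for this article's scenario (access ports only, no trunk ports, no MAC learning); the class, method, and port names are hypothetical.

```python
from typing import Dict, List


class VlanSwitch:
    """Toy model of an 802.1Q-aware switch with access ports only."""

    def __init__(self) -> None:
        self.port_vlan: Dict[str, int] = {}  # port name -> VLAN ID

    def add_access_port(self, port: str, vlan_id: int) -> None:
        self.port_vlan[port] = vlan_id

    def flood(self, ingress_port: str) -> List[str]:
        """Ports that receive a broadcast frame entering on ingress_port."""
        vlan = self.port_vlan[ingress_port]
        return sorted(
            port
            for port, port_vlan in self.port_vlan.items()
            if port_vlan == vlan and port != ingress_port
        )


if __name__ == "__main__":
    switch = VlanSwitch()
    switch.add_access_port("pc1", 10)
    switch.add_access_port("pc2", 10)
    switch.add_access_port("pc3", 20)
    switch.add_access_port("pc4", 20)
    # A broadcast from pc1 stays inside VLAN 10.
    print(switch.flood("pc1"))
```

Assigning every port to the same VLAN ID reproduces the behaviour of a switch without 802.1Q support: a broadcast then reaches every other port.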

Figure 6. 802.1Q VLAN Ethernet B in the real world

This is a real-world 802.1Q VLAN network. Six computer terminals, which belong to VLAN 10, VLAN 20, and VLAN 30, access the network through first-level switches. For illustration, the switch on the left side of the figure does not support 802.1Q VLAN, so the two terminals connected to it are in the same broadcast domain even though they belong to different subnets. By comparison, the switch on the right side of the figure supports 802.1Q VLAN; when configured correctly it splits the broadcast domain per subnet, so that terminals in different network segments are isolated. To connect to the Internet, a layer-3 switch that supports 802.1Q VLAN is also required: it strips the VLAN tag when data is sent out, and when data is received it forwards the data to the correct VLAN subnet according to the IP address. The router performs NAT based on IP addresses to connect to the Internet.
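To make the tagging and stripping step concrete, the sketch below inserts and removes an 802.1Q tag on a raw Ethernet frame. The tag is four bytes placed after the source MAC address: the TPID 0x8100 followed by a 16-bit field whose low 12 bits hold the VLAN ID. The function names and the sample frame are made up for the example, and the sketch ignores the DEI bit and stacked tags.

```python
import struct
from typing import Optional, Tuple

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tagged frame


def add_vlan_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert an 802.1Q tag after the destination and source MAC addresses."""
    if not 0 <= vlan_id <= 0x0FFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    tci = (priority << 13) | vlan_id
    return frame[:12] + struct.pack("!HH", TPID_8021Q, tci) + frame[12:]


def strip_vlan_tag(frame: bytes) -> Tuple[Optional[int], bytes]:
    """Return (vlan_id, untagged frame), or (None, frame) if no tag is present."""
    if len(frame) < 16:
        return None, frame
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None, frame
    return tci & 0x0FFF, frame[:12] + frame[16:]


if __name__ == "__main__":
    # Hypothetical broadcast frame: dst MAC, src MAC, EtherType IPv4, payload.
    untagged = b"\xff" * 6 + bytes.fromhex("525400123456") + b"\x08\x00" + b"payload"
    tagged = add_vlan_tag(untagged, vlan_id=10)
    print(tagged[12:16].hex())
    print(strip_vlan_tag(tagged)[0])
```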

Figure 7. Virtual network B_V0

Network B can be simulated accurately under virtualization. The six virtual machines see the same network environment as the real PCs in Network B. The Bridge, the VLAN devices, and the physical NIC on Host C jointly implement the function of the first-level switch that supports 802.1Q VLAN in Network B, and thereby isolate the logical subnets. The Bridge on Host B only serves to connect the virtual machine to the physical NIC. The Bridge on Host A is equivalent to an ordinary switch, so the same broadcast domain crossover problem exists as in Network B.

Figure 8. Virtual network B_V1

Introducing VLAN devices on Host A and Host B solves the broadcast domain crossover problem of B_V0, and the virtual machines can use correctly isolated subnets. In most cases, virtual machines do not care about the network above the Bridge; they only need the logical subnets to be isolated correctly, and they can all run on the same Host. The network is therefore often simplified further.

Figure 9. Virtual network B_V2

This figure shows an 802.1Q VLAN network that is frequently used in virtualization. From the point of view of the virtual machines, they are in the same logical subnets as in Network B. Thanks to the VLAN devices, the crossover between the broadcast domains of VLAN 10 and VLAN 20 found in Network B is avoided. When more virtual machines need to join the same VLAN, the existing Bridge is simply extended. There is no need to aggregate data through multiple levels of switches as in the real world, because a Bridge has an almost unlimited number of ports for connecting other devices and is not constrained by a physical port count. The physical NIC outputs data with VLAN tags, so, as in Network B, a layer-3 switch that supports 802.1Q VLAN is required to process it.
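One plausible way to build such a layout is one VLAN child device and one Bridge per VLAN, with the virtual machines' TAP devices plugged into the Bridge of their VLAN. The sketch below only generates the command strings; the naming scheme (eth0.10, br10, tap0 and so on) is an assumption for the example and is not taken from the figure.

```python
from typing import List

# Sketch only: command strings for one VLAN on one Host; nothing is executed.
# The naming scheme (<nic>.<vlan>, br<vlan>, tapN) is a hypothetical convention.


def vlan_bridge(nic: str, vlan_id: int, taps: List[str]) -> List[str]:
    """Commands for a per-VLAN Bridge fed by a VLAN child of the physical NIC."""
    vlan_dev = f"{nic}.{vlan_id}"
    bridge = f"br{vlan_id}"
    cmds = [
        f"ip link add link {nic} name {vlan_dev} type vlan id {vlan_id}",
        f"ip link add name {bridge} type bridge",
        f"ip link set dev {vlan_dev} master {bridge}",
    ]
    # Adding another virtual machine to the VLAN only adds one more Bridge port.
    for tap in taps:
        cmds.append(f"ip tuntap add dev {tap} mode tap")
        cmds.append(f"ip link set dev {tap} master {bridge}")
    return cmds


if __name__ == "__main__":
    for cmd in vlan_bridge("eth0", 10, ["tap0", "tap1"]) + vlan_bridge("eth0", 20, ["tap2"]):
        print(cmd)
```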

Using virtual network extension technology to simulate real-world networks

Extension technology on the technical standards side

To address the complex network problems in cloud computing, the industry has proposed two major extension standards: 802.1Qbg and 802.1Qbh. 802.1Qbh Bridge Port Extension was proposed mainly by VMware and Cisco. It attempts to provide a complete virtualized network solution from the access layer to the aggregation layer, with the goal of a network that can be defined and controlled by software. It extends the traditional network protocols, so it requires new network devices and the cost is high. 802.1Qbg Edge Virtual Bridging (EVB) was proposed mainly by companies such as HP and tries to improve software-simulated networks at a lower cost by using existing devices. This article focuses on the latter.

One core concept of 802.1Qbg is VEPA (Virtual Ethernet Port Aggregator). In short, through port aggregation and classified forwarding of data, it moves the network processing originally performed by the CPU and software on the Host to the first-level switch, which reduces the Host CPU load. At the same time, it makes it possible to monitor virtual machine network traffic on the first-level switch. In this way, the responsibilities of servers and network devices are clearly divided, which simplifies system management.

Figure 10. VEPA concept diagram (from the HP VEPA seminar, slightly modified)

This figure shows the basic concept of VEPA. On a physical terminal, that is, the Host that runs the virtual machines, a device is needed that groups virtual ports according to certain rules, which is the Port Group function. This device also abstracts the ports of one group into a single unit and delivers the data of that group together, which is the Port Aggregation function. The figure shows how data flows through the virtual ports: all data from a virtual port is aggregated with the data of the same group and then delivered to the adjacent first-level switch, and the physical terminal no longer parses the layer-2 protocol. Communication between virtual ports on the same physical terminal must also be sent to the first-level switch and forwarded back, rather than being switched inside the physical terminal, which increases the traffic load on the first-level switch. The advantage is that network processing returns to dedicated network devices and the network flows of all virtual machines become visible to those devices, so network administrators can control them with dedicated network equipment without involving the Host server. Note that VEPA mode can only be used on the first-level switch at the access layer, and two VEPA devices cannot exist in the network at the same time; this is what is meant by edge virtualization.
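The difference in data flow can be sketched with a small path model. It compares switching inside the Host (labelled "veb" here) with VEPA, where every frame goes to the first-level switch and, for two virtual machines on the same Host, comes back through the same port. The function, the hop labels, and the host and VM names are hypothetical; the model ignores port groups and VLANs.

```python
from typing import Dict, List, Optional


def frame_path(
    src_vm: str,
    dst_vm: str,
    host_of: Dict[str, str],
    mode: str,
    switch_hairpin: bool = True,
) -> Optional[List[str]]:
    """Hops a frame takes from src_vm to dst_vm, or None if it cannot arrive.

    mode is "veb" (frames are switched inside the Host) or "vepa" (every
    frame is handed to the adjacent first-level switch).
    """
    if mode not in ("veb", "vepa"):
        raise ValueError("mode must be 'veb' or 'vepa'")
    src_host, dst_host = host_of[src_vm], host_of[dst_vm]
    same_host = src_host == dst_host
    if same_host and mode == "veb":
        return [src_vm, src_host, dst_vm]
    if same_host and not switch_hairpin:
        # An ordinary switch never sends a frame back out of its ingress port.
        return None
    return [src_vm, src_host, "switch", dst_host, dst_vm]


if __name__ == "__main__":
    hosts = {"vm1": "hostA", "vm2": "hostA", "vm3": "hostB"}
    print(frame_path("vm1", "vm2", hosts, "veb"))
    print(frame_path("vm1", "vm2", hosts, "vepa"))
    print(frame_path("vm1", "vm2", hosts, "vepa", switch_hairpin=False))
```

The second path is longer than the first, which is the extra switch load mentioned above, but every frame now crosses a device that the network administrator controls.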

Figure 11. 802.1Qbg summary (from the HP VEPA seminar, slightly modified)

This figure summarizes the 802.1Qbg technology, which is itself still evolving. VEB (Virtual Ethernet Bridge) denotes the first-level data-switching module that virtual machines attach to; on Linux, it can be regarded as the data-switching function provided on the Host by the Bridge device. Tag-less VEPA is the VEPA data export mode described above. Because neither of these two modes modifies the communication protocols, existing devices can be used at low cost; for VEPA mode, the firmware of an existing switch only needs to be updated so that it supports Hairpin mode and can send data back. As a long-term solution, 802.1Qbg plans to support a VN-tagged mode, in which the communication protocol is extended and new tags are used to mark data. Like 802.1Qbh, this requires new hardware and increases costs.

Linux Host-side extension technology

To support the new virtual network technologies, Linux introduced a new network device model, MACVTAP. MACVTAP is implemented on top of the existing MACVLAN device. Like a TAP device, each MACVTAP device has a corresponding Linux character device with the same IOCTL interface as a TAP device, so it can be used directly by KVM/QEMU to exchange network data. MACVTAP devices were introduced to simplify the switched network in virtualized environments by replacing the traditional combination of Linux TAP and Bridge devices, and to support new virtualization network technologies such as 802.1Qbg.

Figure 12. Principle of the Linux MACVTAP device

This figure explains the principle of MACVTAP in detail. Like VLAN devices, MACVTAP devices appear in a one-to-many parent-child relationship. Multiple MACVTAP child devices can be created on one parent device, and each MACVTAP device has exactly one parent. A MACVTAP child device can itself act as a parent device on which further MACVTAP child devices are nested. Parent and child devices are implicitly bridged, and the parent device is equivalent to the trunk port of a real-world switch. In fact, when a MACVTAP device is created in a mode other than Passthrough, the kernel implicitly creates a MACVLAN network to perform the forwarding.

MACVTAP devices have four working modes: Bridge, VEPA, Private, and Passthrough. In Bridge mode, the device behaves much like a Bridge device: data can be switched and forwarded between child devices of the same parent, and a virtual machine is effectively plugged into a simple switch. The current Linux implementation has a limitation in this mode: MACVTAP child devices cannot communicate with the Linux Host, that is, virtual machines cannot communicate with the Host, whereas with a traditional Bridge device this is possible by assigning an IP address to the Bridge. VEPA mode is a software implementation of the VEPA mechanism in the 802.1Qbg standard. In this mode, the MACVTAP device simply forwards data to the parent device, which completes the data aggregation function; the external switch usually has to support Hairpin mode for this to work properly. Private mode is similar to VEPA mode, except that the MACVTAP child devices are isolated from each other. In Passthrough mode, the MACVLAN data processing logic in the kernel is skipped and the hardware decides how data is processed, which frees Host CPU resources.
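As an illustration, the four modes map onto the `mode` keyword of the iproute2 `macvtap` link type, where Passthrough is spelled `passthru`. The sketch below builds the creation command for a child device and the path of its character device, which the kernel names after the interface index (`/dev/tap<ifindex>`). The device names and the index value are hypothetical, and nothing is executed.

```python
# Sketch only: string builders for MACVTAP child devices; nothing is executed.
# Article mode name -> keyword accepted by "ip link ... type macvtap mode".
MACVTAP_MODES = {
    "Bridge": "bridge",
    "VEPA": "vepa",
    "Private": "private",
    "Passthrough": "passthru",
}


def macvtap_command(parent: str, name: str, mode: str) -> str:
    """Command that creates MACVTAP child `name` on `parent` in `mode`."""
    if mode not in MACVTAP_MODES:
        raise ValueError(f"unknown MACVTAP mode: {mode}")
    return f"ip link add link {parent} name {name} type macvtap mode {MACVTAP_MODES[mode]}"


def tap_char_device(ifindex: int) -> str:
    """Character device of a MACVTAP interface with the given interface index.

    On a real Host the index can be read from /sys/class/net/<name>/ifindex.
    """
    return f"/dev/tap{ifindex}"


if __name__ == "__main__":
    for mode in MACVTAP_MODES:
        print(macvtap_command("eth0", "macvtap0", mode))
    print(tap_char_device(7))  # 7 is a made-up index for the example
```

An emulator such as QEMU would then open that character device in the same way it opens the device behind an ordinary TAP interface.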

The figure also shows how a MACVTAP device uses a VF when an SR-IOV (Single Root I/O Virtualization) network device is present. A VF device is a virtual NIC provided by a physical NIC that supports SR-IOV. Each VF can be used like a real NIC, and the VFs are isolated from one another while sharing the hardware resources.

Figure 13. MACVTAP Passthrough and PCI Passthrough

MACVTAP Passthrough is different from PCI Passthrough. PCI Passthrough applies to any PCI device, not necessarily a network device; its purpose is to let the Guest OS use PCI hardware on the Host directly to improve efficiency. Taking the x86 platform as an example, data is transferred directly from the Guest OS to the Host hardware through VT-d technology, which requires hardware support. This is highly efficient, but because the emulator loses control over the virtual hardware, it is difficult to synchronize hardware state between different Hosts, so live migration is not possible when PCI Passthrough is used. MACVTAP Passthrough applies only to MACVTAP network devices; its purpose is to take part of the MACVTAP software processing out of the kernel and hand it over to the hardware. Under virtualization, data still arrives at the I/O layer of the emulator first and is then forwarded to the hardware. This costs some efficiency, but the emulator still controls the state of the virtual hardware and the flow of data, so live migration remains possible. To sum up, an SR-IOV network device can be used in two ways, MACVTAP Passthrough or PCI Passthrough, depending on how the user weighs efficiency against functionality.

Simulating traditional Ethernet

Figure 14. Virtual network A_M0

This figure shows how MACVTAP devices are used to simulate a real network. To save space, the MACVTAP devices in the figure can work in either Bridge or VEPA mode, and the curves show the data flow in each of the two modes. In Bridge mode, data cannot flow from a virtual machine to user programs on the Linux Host. This restriction does not apply in VEPA mode, but the first-level switch must then work in Hairpin mode. This virtual network maps to Network A, but the broadcast domain is still confused because the virtual ports are not grouped; as mentioned above, a Linux MACVTAP device working in VEPA mode implements only the data aggregation function. Compared with A_V1, MACVTAP replaces the combination of TAP and Bridge devices. This network does not use the routing table and iptables of the Host Linux system; those tasks are taken over again by external physical network devices. This is also one of the goals of 802.1Qbg: dedicated network devices should be responsible for processing network data. As a result, a network built on MACVTAP devices cannot use the network services of the Host Linux system the way A_V1 does.

Figure 15. Virtual network B_M0

This figure uses MACVTAP devices to simulate Network B accurately. Linux Bridge devices and MACVTAP devices working in Bridge mode can both be regarded as software implementations of the 802.1Qbg VEB concept. Linux VLAN devices are added on Host C. According to the VEPA standard, VLAN tags can be used to group data and virtual ports. On Host C, the MACVTAP devices working in VEPA mode perform the aggregation function, and the VLAN devices perform the grouping function. Together they form a complete software implementation of the VEPA technology, so the logical subnets of the virtual machines on Host C are correctly isolated. Combined with a first-level switch working in Hairpin mode, Host C exports the network data of all its virtual machines to the network device side for control. Readers can compare this network with B_V0 to see which devices have been replaced by MACVTAP devices. As in B_V0, Host A still has the broadcast domain crossover problem because no VLAN device has been introduced there.
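The combination used on Host C, VLAN devices for grouping and VEPA-mode MACVTAP devices for aggregation, might be set up roughly as follows. The sketch generates command strings only; the NIC name eth0 and the naming pattern for the MACVTAP devices are assumptions made for the example.

```python
from typing import List

# Sketch only: command strings for one port group on one Host; nothing is
# executed. The NIC name and the macvtap<vlan>_<n> naming are hypothetical.


def vepa_port_group(nic: str, vlan_id: int, vm_count: int) -> List[str]:
    """One VLAN child (grouping) with vm_count VEPA MACVTAP children (aggregation)."""
    vlan_dev = f"{nic}.{vlan_id}"
    cmds = [f"ip link add link {nic} name {vlan_dev} type vlan id {vlan_id}"]
    for index in range(vm_count):
        # Each virtual machine gets its own MACVTAP child on the VLAN device,
        # so its frames leave the Host already tagged with vlan_id.
        cmds.append(
            f"ip link add link {vlan_dev} name macvtap{vlan_id}_{index} "
            "type macvtap mode vepa"
        )
    return cmds


if __name__ == "__main__":
    for vlan in (10, 20):
        for cmd in vepa_port_group("eth0", vlan, vm_count=2):
            print(cmd)
```

For virtual machines in the same group to reach each other, the adjacent switch port would still need Hairpin mode, as described for VEPA above.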

Figure 16. Virtual network B_M1

VLAN devices are introduced together with MACVTAP on Host A and Host B, which solves the broadcast domain problem of B_M0 in the same way as in B_V1.

Figure 17. Virtual network B_M2

Placing all virtual machines on the same Host saves further hardware resources and produces an environment similar to B_V2. Note that the data flow in this environment is slightly different: communication between virtual machines must pass through the external switch and cannot be completed inside the Host as in B_V2.

Figure 18. Virtual network B_M3

This is a network configuration that uses MACVTAP Passthrough technology. It further reduces the Host CPU load and improves efficiency without affecting live migration of the virtual machines.

Summary

Networking in a virtualized environment looks complex, but its essence is to create for the virtual guests a network structure similar to one in the real world. This article has described the structure and meaning of virtual networks on Linux in detail. Following the principles described here, users can use the Bridge, VLAN, and MACVTAP devices that Linux implements in software to build, at no cost, custom virtual networks similar to those in the real world. They can also build, at very low cost, an upgraded virtual network based on the VEPA model of 802.1Qbg that exports virtual machine network traffic and reduces the load on the Host server. When a NIC that supports SR-IOV is available, Passthrough technology can be used to reduce the Host load further.
