[Switch] virtual network devices in UOS

Source: Internet
Author: User

As networking and virtualization technologies have developed, more and more advanced network devices have been added to Linux. These devices, including Open vSwitch, tap devices, and veth devices, are used widely and play a critical role in UOS. Understanding how these devices relate to each other and how they work is important for operating the platform.

Next we walk through the flow of data packets in UOS in the following scenario: one tenant, two networks, one router, GRE for the internal networks, and LibvirtHybridOVSBridgeDriver as the libvirt VIF driver. We assume the reader has some knowledge of OpenStack/UOS; ideally you have worked through the official OpenStack installation manual or read the UOS User Manual.

We have one external network with the IP segment 42.62.73.52/16 (the public IP addresses assigned to us by UOS) and two internal networks: testsubnet, 192.168.0.0/24, and productivesubnet, 10.10.0.0/16. Note that these are two separate subnets.

In this scenario, the internal network model of the compute node should look as follows:

 

Next I will explain how to arrive at this picture. First, let's find the name of our virtual machine in libvirt. The nova show command gives output like this (excerpt):

+--------------------------------------+-------------------------------+
| Property                             | Value                         |
+--------------------------------------+-------------------------------+
| Internal network                     | 10.18.0.3, 172.16.19.232      |
| OS-DCF:diskConfig                    | MANUAL                        |
| OS-EXT-AZ:availability_zone          | nova                          |
| OS-EXT-SRV-ATTR:host                 | compute1                      |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute1                      |
| OS-EXT-SRV-ATTR:instance_name        | instance-0000001e             |

We see that this virtual machine is deployed on the compute1 node and that its instance_name is instance-0000001e. On compute1 we run virsh dumpxml to print the definition of instance-0000001e and look at the network-related part.
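The network-related part of the libvirt domain XML can also be picked out programmatically. The following is a minimal Python sketch, not the author's tooling. The XML excerpt is a hypothetical sample: the device names are the ones used in this article, while the MAC address and the overall layout are assumptions based on the usual libvirt bridged-interface format.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of `virsh dumpxml instance-0000001e`.
# Device names come from the article; the MAC address is made up.
SAMPLE_XML = """
<domain type='kvm'>
  <name>instance-0000001e</name>
  <devices>
    <interface type='bridge'>
      <mac address='fa:16:3e:00:00:01'/>
      <source bridge='qbr48e06cd2-60'/>
      <target dev='tap48e06cd2-60'/>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
"""


def list_interfaces(domain_xml):
    """Return (tap device, bridge) pairs for every bridged interface."""
    root = ET.fromstring(domain_xml)
    pairs = []
    for iface in root.findall("./devices/interface[@type='bridge']"):
        target = iface.find("target")
        source = iface.find("source")
        if target is not None and source is not None:
            pairs.append((target.get("dev"), source.get("bridge")))
    return pairs


print(list_interfaces(SAMPLE_XML))
```

On a real compute node the XML string would come from the output of virsh dumpxml rather than from a constant.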

Here we see that the network device of this virtual machine is tap48e06cd2-60, and that it appears to be attached to qbr48e06cd2-60. Let's check with brctl show (relevant part only):

bridge name        bridge id            STP enabled     interfaces
qbr48e06cd2-60     8000.bed5536ff312    no              qvb48e06cd2-60
                                                        tap48e06cd2-60
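To read this output mechanically, for example when a node hosts many bridges, it can be turned into a mapping from bridge to interfaces. This is a small Python sketch under the assumption that the output has the column layout shown above, with continuation rows indented; the function name is our own.

```python
def parse_brctl_show(text):
    """Map each bridge name to the list of interfaces attached to it."""
    bridges = {}
    current = None
    for line in text.splitlines()[1:]:       # skip the header row
        fields = line.split()
        if not fields:
            continue
        if line[0].isspace():
            # Continuation row: it carries only one more interface name.
            if current is not None:
                bridges[current].append(fields[0])
        else:
            # Columns: bridge name, bridge id, STP flag, optional interface.
            current = fields[0]
            bridges[current] = fields[3:4]
    return bridges


SAMPLE = (
    "bridge name        bridge id            STP enabled     interfaces\n"
    "qbr48e06cd2-60     8000.bed5536ff312    no              qvb48e06cd2-60\n"
    "                                                        tap48e06cd2-60\n"
)

print(parse_brctl_show(SAMPLE))
```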

Here we see two interfaces, qvb48e06cd2-60 and tap48e06cd2-60, on the bridge qbr48e06cd2-60. The tap device is the virtual network device used by our virtual machine, but what is qvb48e06cd2-60? We use lshw -class network to print all network devices (excerpt):

  *-network:5
       description: Ethernet interface
       physical id: 7
       logical name: qvb48e06cd2-60
       serial: be:d5:53:6f:f3:12
       size: 10Gbit/s
       capabilities: ethernet physical
       configuration: autonegotiation=off broadcast=yes driver=veth driverversion=1.0 duplex=full firmware=N/A link=yes multicast=yes port=twisted pair promiscuous=yes speed=10Gbit/s

We notice that the driver of this device is veth, and veth devices always come in pairs. We use ethtool -S to find out which device is at the other end of this veth pair:

# ethtool -S qvb48e06cd2-60
NIC statistics:
     peer_ifindex: 16

OK. Now we use ip link to look at the device with interface index 16:

16: qvo48e06cd2-60: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether aa:c0:0f:d2:e2:43 brd ff:ff:ff:ff:ff:ff
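The two lookups above (peer_ifindex from ethtool, then the matching index in ip link) can be chained. The following Python sketch works on captured command output; the helper names are our own, and the sample strings are copied from the output shown in this article.

```python
import re


def peer_ifindex(ethtool_output):
    """Extract peer_ifindex from `ethtool -S <veth>` output, or None."""
    match = re.search(r"peer_ifindex:\s*(\d+)", ethtool_output)
    return int(match.group(1)) if match else None


def ifname_by_index(ip_link_output, index):
    """Find the interface name that `ip link` lists under the given index."""
    for line in ip_link_output.splitlines():
        # Header lines look like "16: qvo48e06cd2-60: <FLAGS> ...".
        match = re.match(r"(\d+):\s+([^:@\s]+)", line)
        if match and int(match.group(1)) == index:
            return match.group(2)
    return None


ETHTOOL_OUT = "NIC statistics:\n     peer_ifindex: 16\n"
IP_LINK_OUT = (
    "16: qvo48e06cd2-60: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> "
    "mtu 1500 qdisc pfifo_fast state UP qlen 1000\n"
    "    link/ether aa:c0:0f:d2:e2:43 brd ff:ff:ff:ff:ff:ff\n"
)

print(ifname_by_index(IP_LINK_OUT, peer_ifindex(ETHTOOL_OUT)))
```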

With the steps above we have traced the path from the virtual machine's network device to the veth pair. The official documentation gives a short description of this path for each libvirt VIF driver; see https://wiki.openstack.org/wiki/lib?vifdrivers. The other end of the veth pair should be connected to Open vSwitch. Let's verify it:

# ovs-vsctl show
1910d375-2692-4214-acdf-d364382c25a4
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port "qvo48e06cd2-60"
            tag: 1
            Interface "qvo48e06cd2-60"
        Port "qvodfdc29e2-9a"
            tag: 2
            Interface "qvodfdc29e2-9a"
        Port "qvo18cec000-80"
            tag: 2
            Interface "qvo18cec000-80"
        Port "qvob86d15f1-8f"
            tag: 1
            Interface "qvob86d15f1-8f"
    Bridge br-tun
        Port br-tun
            Interface br-tun
                type: internal
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port "gre-1"
            Interface "gre-1"
                type: gre
                options: {in_key=flow, local_ip="192.168.10.11", out_key=flow, remote_ip="192.168.10.10"}
    ovs_version: "1.11.0"

Sure enough, qvo48e06cd2-60 is connected to br-int. OpenStack uses this fairly complex mechanism, rather than attaching the tap device directly to Open vSwitch, for reasons related to security groups; see the official documentation for details.

Before studying the OVS flows, note the "tag: 1" under Port "qvo48e06cd2-60". Open vSwitch uses this VLAN tag to tell the subnets apart. Here, tag 1 stands for our 10.18.0.0/24 subnet and tag 2 for the 10.22.22.0/24 subnet.
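To see at a glance which ports belong to which tag, the ovs-vsctl show output can be reduced to a port-to-tag mapping. The following Python sketch assumes the indented layout that ovs-vsctl show prints; the function name is our own and the sample is an abbreviated copy of the output in this article.

```python
def port_tags(ovs_show_text, bridge):
    """Return {port name: VLAN tag or None} for the ports of one bridge."""
    tags = {}
    current_bridge = None
    current_port = None
    for raw in ovs_show_text.splitlines():
        line = raw.strip()
        if line.startswith("Bridge "):
            current_bridge = line.split(None, 1)[1].strip('"')
            current_port = None
        elif line.startswith("Port ") and current_bridge == bridge:
            current_port = line.split(None, 1)[1].strip('"')
            tags[current_port] = None        # untagged until a tag line follows
        elif line.startswith("tag:") and current_bridge == bridge and current_port:
            tags[current_port] = int(line.split(":", 1)[1])
    return tags


SAMPLE = """\
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
        Port "qvo48e06cd2-60"
            tag: 1
            Interface "qvo48e06cd2-60"
        Port "qvodfdc29e2-9a"
            tag: 2
            Interface "qvodfdc29e2-9a"
    Bridge br-tun
        Port "gre-1"
            Interface "gre-1"
                type: gre
"""

print(port_tags(SAMPLE, "br-int"))
```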

br-int and br-tun are connected through patch ports. Patch ports are not described in much detail in the official documents. However, once two OVS bridges are connected through patch ports, they behave almost like a single bridge. For more information, see the Open vSwitch FAQ and "Connecting OVS Bridges with Patch Ports".

First, let's take a look at the flow table rules of br-int.

There is only a single rule, whose action is NORMAL. The official Open vSwitch documentation describes this action as switching packets in the traditional, non-OpenFlow way; in other words, br-int behaves like an ordinary switch, as if no OpenFlow rules were configured (see the Open vSwitch advanced features tutorial). Next we analyze the flow table rules of br-tun. First, we can run ovs-ofctl dump-ports-desc on the compute node to list all ports on br-tun:

OFPST_PORT_DESC reply (xid=0x2):
 1(patch-int): addr:ea:a2:71:f5:9f:ad
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 2(gre-1): addr:d6:89:b0:03:d2:72
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 LOCAL(br-tun): addr:9a:49:9a:35:d1:4e
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max

Then use ovs-ofctl dump-flows or easyovs to view the flow table rules of br-tun (easyovs is used here because its layout is easier to read):

ID  TAB  PKT  PRI  MATCH                                       ACT
0   0    339  1    in=1                                        resubmit(,1)
1   0    285  1    in=2                                        resubmit(,2)
2   0    3    0    *                                           drop
3   1    216  0    dl_dst=00:00:00:00:00:00/01:00:00:00:00:00  resubmit(,20)
4   1    123  0    dl_dst=01:00:00:00:00:00/01:00:00:00:00:00  resubmit(,21)
5   10   363  1    *                                           learn(table=20,hard_timeout=300,priority=1,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1
6   2    341  1    tun_id=0x2                                  mod_vlan_vid:1,resubmit(,10)
7   2    17   1    tun_id=0x3                                  mod_vlan_vid:2,resubmit(,10)
8   2    3    0    *                                           drop
9   20   0    0    *                                           resubmit(,21)
10  21   3    1    vlan=2                                      strip_vlan,set_tunnel:0x3,output:2
11  21   16   1    vlan=1                                      strip_vlan,set_tunnel:0x2,output:2
12  21   4    0    *                                           drop
13  3    0    0    *                                           drop

Only the ID, table, packet counter, priority, match rule, and action of each flow are shown here. Let's first look at flows 0, 3, 4, 9, 10, 11, and 12. These flows define how packets that enter from br-int are handled. From top to bottom:

0. table 0: a packet that enters from port 1 (patch-int) is resubmitted to table 1 for further matching.
3. table 1: if the destination MAC address is a unicast address, the packet is resubmitted to table 20 for further matching.
4. table 1: if the destination MAC address is a multicast or broadcast address, the packet is resubmitted to table 21 for further matching.
9. table 20: the packet is resubmitted to table 21. This table is more than a pass-through: the learn action in table 10 dynamically adds learned rules to table 20. For example, the following automatically created rule forwards packets for a known destination MAC address: "cookie=0x0, duration=11.099s, table=20, n_packets=45, n_bytes=6132, hard_timeout=300, idle_age=3, hard_age=2, priority=1,vlan_tci=0x0001/0x0fff,dl_dst=fa:16:3e:a1:3f:19 actions=load:0->NXM_OF_VLAN_TCI[],load:0x2->NXM_NX_TUN_ID[],output:2".
10. table 21: if the VLAN tag is 2, the VLAN tag is stripped, the tunnel key is set to 3 (the GRE key; see RFC 2890 for details), and the packet is sent out of port 2 (gre-1).
11. table 21: if the VLAN tag is 1, the VLAN tag is stripped, the tunnel key is set to 2, and the packet is sent out of port 2 (gre-1).
12. table 21: packets that match nothing else are dropped.
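The two table 1 rules rely on a masked match: dl_dst=VALUE/MASK compares only the bits that are set in the mask, and the mask 01:00:00:00:00:00 selects the group bit of the first octet. The following Python sketch illustrates that check; the function names are our own.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def masked_match(dl_dst, value, mask):
    """Masked match: compare only the bits that are set in the mask."""
    return (mac_to_int(dl_dst) & mac_to_int(mask)) == mac_to_int(value)


GROUP_BIT = "01:00:00:00:00:00"


def next_table(dl_dst):
    """Mimic table 1 of br-tun: unicast goes to table 20, the rest to 21."""
    if masked_match(dl_dst, "00:00:00:00:00:00", GROUP_BIT):
        return 20
    return 21


print(next_table("fa:16:3e:a1:3f:19"))   # unicast destination
print(next_table("ff:ff:ff:ff:ff:ff"))   # broadcast destination
```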

Now let's look at flows 1, 6, 7, and 5. These flows define how packets that arrive through the GRE tunnel (from the network node) are handled:

1. table 0: a packet that enters from port 2 (gre-1) is resubmitted to table 2 for further matching.
6. table 2: if the tunnel key is 2, VLAN tag 1 is added and the packet is resubmitted to table 10 for further matching.
7. table 2: if the tunnel key is 3, VLAN tag 2 is added and the packet is resubmitted to table 10 for further matching.
5. table 10: first, the VLAN, MAC address, and other information are learned from the packet and a corresponding rule is added to table 20; then the packet is sent out of port 1 (patch-int).
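Taken together, the br-tun tables behave roughly like the following Python model. This is a simplified sketch for illustration, not how Open vSwitch is implemented: it ignores priorities, counters, and the 300-second hard_timeout of learned rules, and the source MAC address fa:16:3e:00:00:aa used in the example is made up.

```python
PATCH_INT, GRE_1 = 1, 2                # OpenFlow port numbers on br-tun
VLAN_TO_TUNNEL = {1: 0x2, 2: 0x3}      # table 21: local VLAN -> GRE key
TUNNEL_TO_VLAN = {0x2: 1, 0x3: 2}      # table 2: GRE key -> local VLAN

# Table 20 entries created by the learn action in table 10:
# (vlan, destination MAC) -> (tunnel key, output port)
learned = {}


def is_unicast(mac):
    """True if the group bit of the first octet is clear."""
    return (int(mac.split(":")[0], 16) & 1) == 0


def process(in_port, src, dst, vlan=None, tun_id=None):
    """Return what the modelled br-tun does with one packet."""
    if in_port == PATCH_INT:                        # table 0 -> table 1
        if is_unicast(dst) and (vlan, dst) in learned:
            key, port = learned[(vlan, dst)]        # table 20 hit
            return ("output", port, {"tun_id": key})
        if vlan in VLAN_TO_TUNNEL:                  # table 21
            return ("output", GRE_1, {"tun_id": VLAN_TO_TUNNEL[vlan]})
        return ("drop",)
    if in_port == GRE_1:                            # table 0 -> table 2
        if tun_id not in TUNNEL_TO_VLAN:
            return ("drop",)
        vlan = TUNNEL_TO_VLAN[tun_id]               # mod_vlan_vid
        learned[(vlan, src)] = (tun_id, in_port)    # table 10: learn
        return ("output", PATCH_INT, {"vlan": vlan})
    return ("drop",)


# A reply arriving over GRE teaches br-tun where fa:16:3e:a1:3f:19 lives.
print(process(GRE_1, "fa:16:3e:a1:3f:19", "fa:16:3e:00:00:aa", tun_id=0x2))
# Later unicast traffic to that MAC is matched by the learned entry.
print(process(PATCH_INT, "fa:16:3e:00:00:aa", "fa:16:3e:a1:3f:19", vlan=1))
```

With a single GRE port the learned entry and the table 21 fallback both send the packet out of gre-1; the learned rules make a difference once several tunnels exist, because a known unicast destination is then sent into one tunnel only.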

Now, the data path analysis of the virtual machine has been basically completed.

This article is from https://www.ustack.com/2014/06/17/virtual-device-in-uos/
