VXLAN is a key network virtualization technology and has received a lot of attention. How does the protocol work? How does it separate the data plane from the control plane to support SDN? How is it deployed? This post walks through each of these questions in detail.
I. Why VXLAN?
1. Limits on the number of VLANs
4096 VLANs are far too few for large-scale cloud computing data centers.
2. Restrictions imposed by the physical network infrastructure
Partitioning the network by IP subnet limits where workloads that need L2 connectivity can be deployed.
3. TOR switch MAC table exhaustion
Virtualization and east-west traffic greatly increase the number of MAC table entries.
4. Multi-tenant scenarios
Different tenants may use overlapping IP addresses.
II. What is VXLAN?
1. VXLAN packets
VXLAN (Virtual eXtensible LAN) is an overlay network technology that uses MAC-in-UDP encapsulation, which adds 50 bytes of headers to each packet. The packet format is as follows:
(1) VXLAN header
8 bytes in total. Currently only the 8-bit Flags field and the 24-bit VNI (VXLAN Network Identifier) are used. The remaining bits are reserved and must be set to 0.
(2) Outer UDP header
The destination port is 4789 by default (the port assigned to VXLAN by IANA), but it can be changed as needed. In addition, the UDP checksum should be set to 0.
(3) Outer IP header
The destination IP address can be a unicast or a multicast address. In the unicast case, it is the IP address of the destination VXLAN Tunnel End Point (VTEP). In the multicast case, a VXLAN management layer is introduced, and the VTEPs are determined by the mapping between VNIs and IP multicast groups.
Protocol: set to 0x11, indicating that this is a UDP packet.
Source IP: the IP address of the source VTEP.
Destination IP: the IP address of the destination VTEP.
(4) Outer Ethernet header
Destination address: the MAC address of the destination VTEP or, when the destination VTEP is not directly reachable, the MAC address of the next hop (usually the gateway).
VLAN: the VLAN Type is set to 0x8100, and a VLAN ID tag can be set (this is the VLAN tag of the outer VXLAN packet).
Ethertype: set to 0x0800, indicating that the payload is an IPv4 packet.
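To make the 8-byte VXLAN header from (1) concrete, here is a minimal sketch that packs and parses it with Python's struct module. The function names are hypothetical, and the flag value 0x08 (the bit that marks the VNI as valid) comes from RFC 7348 rather than from this post.

```python
import struct

# Assumption: 0x08 is the "VNI valid" flag bit defined in RFC 7348.
VXLAN_FLAG_VNI_VALID = 0x08


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 bits of flags followed by 24 reserved bits (zero).
    # Word 2: 24 bits of VNI followed by 8 reserved bits (zero).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vxlan_header(header: bytes) -> int:
    """Return the VNI carried in an 8-byte VXLAN header."""
    word1, word2 = struct.unpack("!II", header[:8])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word2 >> 8


header = pack_vxlan_header(100)
print(header.hex())                 # 0800000000006400
print(unpack_vxlan_header(header))  # 100
```

All reserved bits are written as zero, which matches the requirement stated for the header above.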
Supplement: what is the role of a VTEP?
A VTEP encapsulates and decapsulates VXLAN packets, including ARP requests and ordinary VXLAN data packets. After encapsulating a packet, it sends the packet through the tunnel to the VTEP at the other end, which decapsulates it and forwards the inner frame according to the inner destination MAC address. A VTEP can be implemented in hardware or in software that supports VXLAN.
In terms of encapsulation structure, VXLAN makes it possible to overlay L2 networks on L3 networks. The VNI in the VXLAN header has 24 bits, which allows about 16 million segments, far more than the 4096 of VLAN. In addition, UDP encapsulation can traverse layer-3 networks, which gives better scalability than VLAN.
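The 50-byte overhead and the size of the VNI space can be checked with a little arithmetic. The sketch below assumes an outer IPv4 header without options and standard 14-byte Ethernet headers; the constant and function names are illustrative only.

```python
OUTER_ETHERNET = 14  # destination MAC + source MAC + Ethertype
OUTER_VLAN_TAG = 4   # optional 802.1Q tag on the outer frame
OUTER_IPV4 = 20      # assumption: IPv4 header without options
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14  # header of the encapsulated frame

VNI_COUNT = 2**24    # number of values a 24-bit VNI can take


def vxlan_overhead(outer_vlan_tag: bool = False) -> int:
    """Bytes added in front of the original frame by the encapsulation."""
    total = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER
    return total + (OUTER_VLAN_TAG if outer_vlan_tag else 0)


def inner_ip_mtu(underlay_ip_mtu: int) -> int:
    """Largest inner IP packet that fits in one underlay IP packet."""
    return underlay_ip_mtu - OUTER_IPV4 - OUTER_UDP - VXLAN_HEADER - INNER_ETHERNET


print(vxlan_overhead())                     # 50
print(vxlan_overhead(outer_vlan_tag=True))  # 54
print(VNI_COUNT)                            # 16777216
print(inner_ip_mtu(1500))                   # 1450
```

The last line shows why virtual machines on a VXLAN segment are often given a smaller MTU when the underlay MTU stays at 1500.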
2. VXLAN data and control plane
(1) Data plane: tunnel mechanism
As described above, the VTEP adds new headers to the virtual machine's packets, and these headers are removed when the data reaches the destination VTEP. Network devices along the path forward the data based only on the destination address in the outer header. To the network in the forwarding path, a VXLAN packet is no different from an ordinary IP packet, apart from being somewhat larger.
Because the inner data is carried intact throughout the forwarding process, the VXLAN data plane is a tunnel-based data plane.
(2) Control plane: improved layer-2 protocol
VXLAN does not maintain persistent connections between virtual machines, so it needs a control plane to record how remote addresses can be reached. Each entry in the control plane table is (VNI, inner MAC, outer VTEP IP). Address learning in VXLAN keeps the characteristics of a layer-2 protocol: nodes do not periodically exchange their tables. For unknown MAC addresses, VXLAN relies on multicast to discover the path (if an SDN controller is available, the information can be obtained from the controller by unicast instead).
VXLAN also learns addresses on its own. When a VTEP receives a UDP datagram, it checks whether it has already received data from this virtual machine. If not, the VTEP records the mapping between the source VNI, the source outer IP address, and the source inner MAC, which avoids a later multicast lookup.
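The control plane table and the self-learning behavior described above can be sketched as a small dictionary-backed class. The class and method names are hypothetical, and the fallback to a multicast group for unknown MAC addresses assumes that no SDN controller is in use.

```python
from typing import Dict, Optional, Tuple


class VtepTable:
    """Maps (VNI, inner MAC) to the outer IP of the VTEP behind which the MAC lives."""

    def __init__(self) -> None:
        self.entries: Dict[Tuple[int, str], str] = {}

    def learn(self, vni: int, inner_src_mac: str, outer_src_ip: str) -> bool:
        """Record a mapping seen on a received packet; return True if it was new or changed."""
        key = (vni, inner_src_mac.lower())
        changed = self.entries.get(key) != outer_src_ip
        self.entries[key] = outer_src_ip
        return changed

    def lookup(self, vni: int, inner_dst_mac: str) -> Optional[str]:
        """Return the remote VTEP IP for a destination MAC, or None if unknown."""
        return self.entries.get((vni, inner_dst_mac.lower()))

    def outer_destination(self, vni: int, inner_dst_mac: str, multicast_group: str) -> str:
        """Use unicast when the MAC is known, otherwise fall back to the multicast group."""
        return self.lookup(vni, inner_dst_mac) or multicast_group


table = VtepTable()
print(table.outer_destination(100, "02:00:00:00:00:02", "239.119.1.1"))  # 239.119.1.1
table.learn(100, "02:00:00:00:00:02", "10.0.0.2")
print(table.outer_destination(100, "02:00:00:00:00:02", "239.119.1.1"))  # 10.0.0.2
```

The same MAC address in a different VNI is a separate entry, which is what lets tenants reuse addresses.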
III. VXLAN ARP request
(1) VXLAN initialization
VM1 and VM2 are connected to the VXLAN network with VNI 100, and the two VXLAN hosts join the IP multicast group 239.119.1.1.
(2) ARP request
1) VM1 sends an ARP request as a broadcast.
2) VTEP1 encapsulates the packet. The VNI is 100, the destination address (DA) in the outer IP header is the IP multicast group (239.119.1.1), and the source address (SA) is IP_VTEP1.
3) VTEP1 sends the packet to the multicast group.
4) VTEP2 receives and parses the multicast packet, fills in its flow table (VNI, inner MAC address, outer IP address), and broadcasts the inner frame locally within the segment whose VNI is 100 (this is VXLAN learning in use).
5) VM2 receives the ARP request and responds to it.
(3) ARP response
1) VM2 prepares the ARP response and sends it to VM1.
2) When VTEP2 receives the response from VM2, it encapsulates it in an IP unicast packet (the VNI is still 100) and sends it toward VM1 by way of VTEP1.
3) After receiving the unicast packet, VTEP1 learns the mapping between the inner MAC address and the outer IP address, decapsulates the packet, and forwards the inner frame to VM1 based on its destination MAC address.
4) When VM1 receives the ARP response, the ARP exchange is complete.
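The ARP request and response steps above can be traced with a toy simulation. This is only a sketch: packets are plain dictionaries, the VTEP IP addresses and MAC labels are made up for illustration, and multicast delivery is modeled by calling the receiving VTEP directly.

```python
from dataclasses import dataclass, field

MCAST_GROUP = "239.119.1.1"
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


@dataclass
class Vtep:
    ip: str
    local_vms: dict  # inner MAC -> VNI of locally attached VMs
    table: dict = field(default_factory=dict)  # (VNI, inner MAC) -> remote VTEP IP

    def encapsulate(self, vni, src_mac, dst_mac):
        """Pick the outer destination: a known VTEP (unicast) or the multicast group."""
        outer_dst = self.table.get((vni, dst_mac), MCAST_GROUP)
        return {"outer_src": self.ip, "outer_dst": outer_dst,
                "vni": vni, "src_mac": src_mac, "dst_mac": dst_mac}

    def receive(self, packet):
        """Learn the sender's mapping, then list the local MACs that get the inner frame."""
        self.table[(packet["vni"], packet["src_mac"])] = packet["outer_src"]
        return [mac for mac, vni in self.local_vms.items()
                if vni == packet["vni"] and packet["dst_mac"] in (mac, BROADCAST_MAC)]


# Hypothetical addresses for the two hosts in VNI 100.
vtep1 = Vtep("10.0.0.1", {"mac-vm1": 100})
vtep2 = Vtep("10.0.0.2", {"mac-vm2": 100})

request = vtep1.encapsulate(100, "mac-vm1", BROADCAST_MAC)  # ARP request
delivered_request = vtep2.receive(request)
reply = vtep2.encapsulate(100, "mac-vm2", "mac-vm1")        # ARP response
delivered_reply = vtep1.receive(reply)

print(request["outer_dst"])  # 239.119.1.1
print(reply["outer_dst"])    # 10.0.0.1
print(vtep1.table)           # {(100, 'mac-vm2'): '10.0.0.2'}
```

The request goes to the multicast group because VTEP1 knows nothing yet, while the response is already unicast because VTEP2 learned VM1's location from the request.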
IV. Data transmission
(1) After the ARP request is answered, VM1 knows the MAC address of VM2 and wants to communicate with it (assume that VM1 sends data to VM2 over TCP).
VTEP1 receives the packet from VM1 and uses the MAC address to look up its flow table and check whether VM1 and VM2 belong to the same VNI. Here the two VMs are in the same VNI (if they were not, the traffic would have to leave through a gateway), and VTEP1 already knows all the address information of VM2 (MAC and VTEP2_IP). VTEP1 therefore encapsulates the packet with new headers and hands it to the uplink switch.
(2) The uplink switch receives the UDP packet from the server, looks up the destination IP address in its routing table, and forwards the datagram out of the corresponding port.
(3) When the destination VTEP receives the packet, it checks the VNI. If the VNI in the VXLAN header matches the VNI of VM2, it decapsulates the packet and delivers it to VM2 for further processing. At this point one packet has been transmitted. All VXLAN-related behavior (possibly crossing several gateways) is transparent to the virtual machines, which are unaware of the transport. Although VM1 and VM2 use TCP to exchange data, the packets are actually forwarded as UDP. The VTEPs at both ends do not check whether the data is correct or complete; all of that work is done by VM1 and VM2 after they receive the decapsulated TCP segments. In other words, when a TCP connection is carried inside UDP, TCP and UDP work as two independent protocol layers with no interaction between them.
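The receive-side check in step (3) can be sketched as follows. The function takes the UDP payload (VXLAN header plus inner frame) and a hypothetical map from local VM MAC addresses to their VNIs; the MAC addresses and the 0x08 flag value (from RFC 7348) are assumptions, not details from this post. Note that the inner payload is never inspected.

```python
import struct
from typing import Dict, Optional


def decapsulate(udp_payload: bytes, local_vms: Dict[bytes, int]) -> Optional[bytes]:
    """Return the inner frame if its destination VM is local and in the packet's VNI."""
    if len(udp_payload) < 8 + 14:  # VXLAN header + minimal inner Ethernet header
        return None
    word1, word2 = struct.unpack("!II", udp_payload[:8])
    if not (word1 >> 24) & 0x08:   # assumption: 0x08 marks the VNI as valid
        return None
    vni = word2 >> 8
    inner_frame = udp_payload[8:]
    dst_mac = inner_frame[:6]
    if local_vms.get(dst_mac) != vni:
        return None                # unknown VM or VNI mismatch: drop
    return inner_frame             # delivered as-is; TCP checks are left to the VM


VM1_MAC = bytes.fromhex("020000000001")
VM2_MAC = bytes.fromhex("020000000002")
inner = VM2_MAC + VM1_MAC + b"\x08\x00" + b"opaque TCP/IP bytes"
payload = struct.pack("!II", 0x08 << 24, 100 << 8) + inner

print(decapsulate(payload, {VM2_MAC: 100}) == inner)  # True
print(decapsulate(payload, {VM2_MAC: 200}))           # None
```

Because the function only reads the VXLAN header and the inner destination MAC, any validation of the TCP data stays with the virtual machines, as the text describes.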
V. VXLAN gateway
To connect a VXLAN network to a non-VXLAN network, a VXLAN gateway is needed to bridge the VXLAN network and the external network and to map between VXLAN IDs and VLAN IDs. Communication between VXLANs also requires the support of layer-3 devices, that is, VXLAN routing. VXLAN gateways can be implemented in hardware or in software.
VI. Deployment
(1) Pure VXLAN deployment
For virtual machines connected to VXLAN, the VLAN information of the virtual machine is no longer the basis for forwarding, so virtual machine migration is no longer confined by the layer-3 gateway, and migration across layer-3 gateways becomes possible.
(2) Hybrid VXLAN and VLAN deployment
To interconnect VLANs and VXLAN, VXLAN defines the VXLAN gateway. A VXLAN gateway has two types of ports: VXLAN ports and common ports.
When data leaves the VXLAN network for a common network, the VXLAN gateway removes the outer headers and forwards the data to a common port based on the original inner frame header. When data enters the VXLAN network from a common network, the VXLAN gateway adds the outer headers, maps the original VLAN ID to a VNI, and removes the VLAN ID from the inner frame header. If the VXLAN gateway finds that the inner frame of a VXLAN packet still carries an original L2 VLAN ID, it discards the packet.
The reason is that a VLAN ID is local information that is meaningful only within a local L2 network. VXLAN is a tunneling mechanism and does not rely on the VLAN ID for forwarding, so it cannot check whether the VLAN ID is correct. Therefore, the port that connects the VXLAN gateway to a traditional network must be configured as an ACCESS port; a TRUNK port cannot be used.
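The gateway behavior in the hybrid deployment can be sketched in a few lines. The example works on raw Ethernet frames and assumes a hypothetical vlan_to_vni mapping configured on the gateway; the function names and the VLAN and VNI numbers are illustrative only.

```python
from typing import Dict, Optional, Tuple

VLAN_TPID = b"\x81\x00"  # 802.1Q tag type, found where the Ethertype normally sits


def strip_vlan_tag(frame: bytes) -> Tuple[Optional[int], bytes]:
    """Return (VLAN ID or None, frame with any 802.1Q tag removed)."""
    if frame[12:14] != VLAN_TPID:
        return None, frame
    vlan_id = int.from_bytes(frame[14:16], "big") & 0x0FFF
    return vlan_id, frame[:12] + frame[16:]


def to_vxlan(frame: bytes, vlan_to_vni: Dict[int, int]) -> Optional[Tuple[int, bytes]]:
    """Common network -> VXLAN: map the VLAN ID to a VNI and remove the tag."""
    vlan_id, untagged = strip_vlan_tag(frame)
    vni = vlan_to_vni.get(vlan_id)
    if vni is None:
        return None  # no mapping configured for this VLAN
    return vni, untagged


def from_vxlan(inner_frame: bytes) -> Optional[bytes]:
    """VXLAN -> common network: discard inner frames that still carry a VLAN tag."""
    if inner_frame[12:14] == VLAN_TPID:
        return None
    return inner_frame


macs = bytes(12)  # placeholder destination and source MAC addresses
untagged = macs + b"\x08\x00" + b"payload"
tagged = macs + VLAN_TPID + (10).to_bytes(2, "big") + b"\x08\x00" + b"payload"

print(to_vxlan(tagged, {10: 5010}) == (5010, untagged))  # True
print(from_vxlan(tagged))                                # None
```

The second call shows the drop rule: an inner frame that still has a VLAN tag is not forwarded out of the VXLAN network.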