In traditional data center server-area network design, the L2 domain is usually limited to the aggregation layer. By configuring VLANs that span the aggregation switches, the L2 network can be extended to multiple access switches, which gives the server access layer flexible scalability. In recent years, high-availability cluster technology and dynamic (live) migration of virtual servers (such as VMware VMotion) have been widely used for data center disaster tolerance and computing resource scheduling. These two technologies require not only a large L2 access network within a data center, but also L2 extension across data centers.

1. Business requirements for L2 interconnection between data centers

Three kinds of interconnection links are usually deployed between data centers. Each link carries different data and serves a different function, and the three are logically isolated from one another.

- L3 network interconnection, also known as data center front-end network interconnection. The "front-end network" is the egress of the data center toward the enterprise campus network or wide area network. The front-end networks of different data centers (primary center and disaster recovery center) are interconnected using IP technology, and clients in the campus or branches access the data centers through it. When a disaster occurs in the primary center, the front-end network converges quickly and clients access the disaster recovery center instead, ensuring business continuity.
- L2 network interconnection, also known as data center server network interconnection. A large L2 network (VLAN) is built across data centers at the server access layer to meet the L2 adjacency requirements of scenarios such as server clusters and dynamic migration of virtual machines.
- SAN interconnection, also known as back-end storage network interconnection. With the help of transmission technologies (DWDM, SDH, etc.), disk array data is replicated between the primary center and the disaster recovery center.

Figure 1. Three interconnection modes between data centers

Typical scenarios that require L2 interconnection between data centers are described below.

1.1 High-availability clusters

A server cluster uses cluster software to associate multiple servers on the network into one logical server that provides a consistent service. The cluster software of most vendors (such as HP, IBM, Microsoft, and Veritas) requires L2 network adjacency between the member servers. Deploying the members of a cluster in different data centers therefore achieves cross-data-center disaster tolerance for the application. Cluster interconnection usually requires two L2 networks, as shown in Figure 2.

Figure 2. High-availability server cluster

- The private communication network of the cluster (heartbeat and session synchronization), used to maintain and arbitrate the state of the master node. The IP addresses on this network are not advertised through the data center front-end network.
- The public communication network of the cluster, that is, the access network for the cluster's virtual IP address. The virtual IP address is the address through which the cluster provides services externally, similar to the virtual IP address of VRRP configured on a router, and it is advertised through the front-end network of the data center. A minimal sketch of the failover logic behind such a virtual IP appears after this list.
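To make the division between the two cluster networks concrete, the sketch below models, in plain Python, how a standby node might monitor heartbeats on the private network and decide when to take over the cluster's virtual IP on the public network. It is a conceptual illustration only; the node names, the timeout value, and the `HeartbeatMonitor` class are invented for this example and do not correspond to any vendor's cluster software.

```python
import time

class HeartbeatMonitor:
    """Toy model of cluster heartbeat arbitration (illustrative only)."""

    def __init__(self, peers, timeout_s=3.0):
        # Last time a heartbeat was seen from each peer on the private network.
        self.last_seen = {peer: time.monotonic() for peer in peers}
        self.timeout_s = timeout_s

    def record_heartbeat(self, peer):
        self.last_seen[peer] = time.monotonic()

    def failed_peers(self):
        now = time.monotonic()
        return [p for p, t in self.last_seen.items() if now - t > self.timeout_s]

def owner_of_virtual_ip(active_node, monitor, standby_node):
    """The virtual IP stays with the active node until its heartbeat times out."""
    if active_node in monitor.failed_peers():
        return standby_node   # failover: the standby now answers on the virtual IP
    return active_node

# Example: the standby in the DR center watches the active node in the primary center.
monitor = HeartbeatMonitor(peers=["node-primary"])
monitor.record_heartbeat("node-primary")
print(owner_of_virtual_ip("node-primary", monitor, "node-dr"))
```

Because the active and standby nodes must exchange heartbeats on the private network and answer on the same virtual IP, cluster software generally requires L2 adjacency between them, which is exactly what the cross-center L2 network described above provides.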
1.2 Migration of servers and virtual machines

During the expansion or relocation of data centers, physical servers must be migrated from one data center to another. An L2 network between the data centers should be built for this process, for two reasons:

- If no L2 network connects the old and new centers, the IP addresses of the servers must be re-planned in the new center, and the DNS records or the server IP addresses configured in client applications must be modified accordingly. Building a cross-center L2 network lets the migrated servers keep their IP addresses, which greatly simplifies the migration.
- During the relocation window, often only part of a server group can be moved at a time. To ensure business continuity, a cross-center server cluster has to be established, and a cross-center L2 network is what makes such a smooth, phased migration possible.

A similar application to server relocation is virtual machine migration. Some server virtualization software can dynamically migrate a virtual machine between two virtualized physical servers, as shown in Figure 3. The migrated VM not only keeps its original IP address but also keeps its running state (for example, TCP session state). The physical servers involved in the migration must therefore be connected to the same L2 network, so that the VM's gateway remains unchanged before and after the migration. This scenario again requires a cross-center L2 network.

Figure 3. Dynamic migration of virtual machines

2. The key to L2 interconnection between data centers: end-to-end loop avoidance

To improve overall availability, the interconnection links between data centers must be redundant, which introduces the problem of L2 loops. It is essential to prevent a loop originating in one remote data center from bringing down the server access networks of all data centers. Technologies for eliminating L2 loops include:

- enabling the Spanning Tree Protocol (STP) on the access-layer network;
- eliminating loops through switch virtualization technology (such as H3C IRF).

2.1 Key points of Spanning Tree Protocol design

When an STP domain spans multiple data centers over wide-area links, a failure in one center may cause network instability in all of them. Therefore, when planning STP for data center L2 interconnection, it is recommended to confine each STP domain within its own data center. As shown in Figure 4, this can be done by enabling BPDU interception (dropping BPDUs) and disabling STP on the interconnect-facing ports of the data center L2 interconnection switches.

Figure 4. Isolating STP domains within each data center

Dividing the STP domains in this way simplifies the management and maintenance of the L2 interconnection: even when the L2 network of one data center changes, the resulting STP recalculation cannot destabilize the L2 networks of the other centers.
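As a way of visualizing the STP-domain isolation just described, the following sketch models switches as graph nodes and marks the inter-data-center links as BPDU-filtered. A notification flooded from a switch then reaches only the switches of its own data center. This is a simplified thought experiment in Python, not a model of real STP behavior; the switch names and the `dci` marking are invented for the illustration.

```python
from collections import deque

# Adjacency list: (neighbor, is_dci_link). BPDUs are dropped on DCI-facing ports.
topology = {
    "A-agg":  [("A-acc1", False), ("A-acc2", False), ("B-agg", True)],
    "A-acc1": [("A-agg", False)],
    "A-acc2": [("A-agg", False)],
    "B-agg":  [("B-acc1", False), ("A-agg", True)],
    "B-acc1": [("B-agg", False)],
}

def bpdu_scope(source):
    """Return the set of switches a BPDU flooded from `source` can reach
    when BPDUs are intercepted (dropped) on the DCI-facing ports."""
    reached, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for neighbor, is_dci in topology[node]:
            if is_dci:
                continue          # BPDU interception: never cross the DCI link
            if neighbor not in reached:
                reached.add(neighbor)
                queue.append(neighbor)
    return reached

# A topology change in data center A stays inside data center A.
print(sorted(bpdu_scope("A-acc1")))   # ['A-acc1', 'A-acc2', 'A-agg']
```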
2.2 Key points of switch virtualization design

Switch virtualization technology (such as H3C IRF and Cisco VSS) connects multiple switches into a single virtual logical device; managing this virtual device manages all of its member devices. Because the virtualized system behaves as one device, ports on different physical members can be bundled together into one logical port, and the configuration and networking are exactly the same as port aggregation on a single device. This feature is called cross-device link aggregation.

At the data center server access layer, two access switches are usually used to connect the servers of the same business system, so that each server's dual NICs have redundant uplinks. This topology traditionally relies on VLANs that span the aggregation switches, with STP enabled to avoid L2 loops. When switch virtualization is used to simplify the network, virtualizing the aggregation layer is essential, because the aggregation layer is the key layer at which STP can be eliminated. For the access layer there are two methods, as shown in Figure 5.

Figure 5. Eliminating access-layer loops through switch virtualization

- Method A keeps the original topology and the independence of the access devices. Only the aggregation layer is virtualized, and the two uplinks of each access switch are bundled into one aggregated link, which eliminates the loop. The server's NICs still connect to two independent access switches.
- Method B adds an IRF interconnect cable between the two access switches so that the access layer is also virtualized. The two switches connected to the server's dual NICs become one logical switch, and all uplinks of the two switches can be bundled across devices, further reducing the number of logical links.

With either method, virtualization of the aggregation layer is required. In the solutions discussed below, switch virtualization at the aggregation layer is therefore recommended. A small sketch of how this virtualization removes the physical loop follows.
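To see why bundling the uplinks toward a virtualized aggregation pair removes the loop, the sketch below runs simple undirected cycle detection on the physical topology and on the logical topology of Method A. The graph encoding and node names are assumptions made for the illustration; the sketch only demonstrates the graph-theoretic point, not switch behavior.

```python
def has_l2_loop(edges):
    """Detect a cycle in an undirected graph using union-find.
    A cycle in the L2 topology means a broadcast frame could circulate."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            return True        # this edge closes a loop
        parent[ra] = rb
    return False

# Physical topology: an access switch dual-homed to two aggregation switches
# that are also linked to each other, i.e. a triangle, i.e. an L2 loop.
physical = [("acc1", "agg1"), ("acc1", "agg2"), ("agg1", "agg2")]

# Logical topology after IRF: agg1+agg2 collapse into one node and the two
# uplinks collapse into one aggregated link, so no loop and no STP blocking.
logical = [("acc1", "agg-irf")]

print(has_l2_loop(physical))   # True
print(has_l2_loop(logical))    # False
```

Method B simply goes one step further and also collapses the two access switches into a single logical node.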
3. L2 interconnection schemes between data centers

3.1 Switch virtualization scheme over bare fiber (DWDM)

This section explores an L2 interconnection scheme in which virtualized switches are carried over bare fiber. H3C IRF is used as the example, but readers can apply the same design idea to other vendors' switch virtualization technologies.

As shown in Figure 6, four data centers are distributed in different locations. The centers form a ring-shaped optical transport network through DWDM devices, and the distance between adjacent data centers is less than 100 km. The internal network of each data center is divided into access, aggregation, and core layers. The aggregation devices use IRF virtualization, so the aggregation layer of each center is logically a single switch.

Figure 6. Four data centers interconnected over a DWDM ring

To achieve L2 interconnection among the four data centers, an interconnection node connecting all four centers at the same time must be built. The node consists of two physical switches that support IRF and is logically a single switch. For high availability, the two physical switches should be deployed in different data centers (A and B), as shown in Figure 7. The interconnection node and the aggregation layer of each center are connected by dual-link bundles. Logically, the four centers and the interconnection node form a hub-spoke star topology: the interconnection node is the hub and the aggregation layer of each center is a spoke. In addition, dedicated wavelengths (λ) must be allocated on the DWDM system between center A and center B to carry the IRF link between the two member devices of the interconnection node; for IRF virtualization, this link must use 10GE interfaces.

Figure 7. Topological relationship between the interconnection node and the data centers

Figure 8 shows the physical connections between the aggregation layer of each site and the interconnection node over the DWDM optical channels. The interconnection node contains two physical devices, A and B: device A is deployed in center A and device B in center B. There are two links from center A to the interconnection node, one carried on local fiber within center A and the other carried on a wavelength allocated on the DWDM system between centers A and B. Center B connects to the interconnection node in the same way. For centers C and D, both links from the aggregation layer to the interconnection node are carried on wavelengths allocated on the DWDM system. In total, the L2 interconnection of the four centers consumes eight DWDM wavelengths: two (which must carry 10GE) for the IRF interconnect, and six for the links from each center to the interconnection node (one each for centers A and B, which use local fiber for their second link, and two each for centers C and D). The arithmetic is restated in a short sketch at the end of this section.

Figure 8. Physical connections between the aggregation layers and the interconnection node

As described in section 2.2 ("Key points of switch virtualization design"), within each data center it is also recommended to deploy IRF virtualization at the access layer, to provide highly available dual-homed server access. The virtualized access switch connects to the aggregation switch through multi-link bundles, which both avoids L2 loops entirely and makes full use of the access-layer uplink bandwidth. Note that STP should still be enabled even in a loop-free L2 network, to guard against mistakes in device configuration or physical cabling. At the same time, even though the network has no loops, STP can still affect an L2 network that spans multiple data centers: for example, when the topology changes, TCN BPDUs may propagate beyond the local data center through the extended L2 network. It is therefore still necessary to limit the scope of the STP domain. On aggregation switches running STP, BPDU interception should be enabled on the ports facing the interconnection node and STP computation disabled on those ports, so that the STP domains of the data centers remain separate.
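The wavelength budget above can be expressed as a small calculation. The sketch below reproduces the arithmetic for this particular four-center design under the assumptions stated in the text (two IRF wavelengths, and local fiber used for one uplink in each of the two centers hosting the interconnection node members); the function and parameter names are invented for the illustration.

```python
def dwdm_wavelengths(centers_hosting_hub, other_centers, irf_lambdas=2,
                     uplinks_per_center=2):
    """Count DWDM wavelengths for the hub-spoke design described above.

    Centers that host one of the two interconnection-node members can use
    local fiber for one of their uplinks, so they consume one wavelength
    less than the other centers.
    """
    hub_center_lambdas = centers_hosting_hub * (uplinks_per_center - 1)
    other_center_lambdas = other_centers * uplinks_per_center
    return irf_lambdas + hub_center_lambdas + other_center_lambdas

# Centers A and B host the IRF members of the interconnection node; C and D do not.
print(dwdm_wavelengths(centers_hosting_hub=2, other_centers=2))   # 8
```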
3.2 VLL or VPLS schemes over an MPLS network

3.2.1 Solution description

The MPLS network discussed here may be built by the enterprise itself or provided to the enterprise by an operator; the L2 interconnection works the same way in both cases: the data center interconnection devices (PEs) establish an L2 VPN that extends the L2 network (VLAN) between the data centers, and end-to-end loop avoidance must again be considered. Two technologies are available for an MPLS-based data center L2 interconnection:

- MPLS VLL (Virtual Leased Line), which provides point-to-point L2 interconnection between two data centers;
- VPLS (Virtual Private LAN Service), which provides multipoint-to-multipoint L2 interconnection among multiple data centers.

MPLS VLL transparently transports the user's L2 traffic across the MPLS network. From the user's point of view, the MPLS network behaves like an L2 switched network through which L2 connections can be established between sites. VLL lets two data centers interconnect at L2 as if they were connected directly by fiber. VLL forwards user frames with two layers of labels: the outer (tunnel) label, an MPLS or GRE label used to carry the packet from one PE to the other, and the inner (virtual circuit) label, an MPLS label that identifies the link between PE and CE. Figure 9 shows a dual-center L2 interconnection solution based on MPLS VLL.

Figure 9. Dual-center L2 interconnection solution based on MPLS VLL

VPLS evolved from the traditional MPLS VLL solution and likewise controls forwarding with two layers of labels, but it supports multipoint-to-multipoint L2 interconnection. VPLS uses the MPLS network to emulate an Ethernet switch spanning the data centers, making forwarding decisions between L2 ports based on MAC address, or on MAC address plus VLAN ID. A VPLS instance that interconnects multiple data centers spans the multiple PEs to which the data centers attach, and each CE device (aggregation switch) in a data center can communicate directly with every other CE (aggregation switch) associated with that VPLS instance. In both the VLL and VPLS solutions, the pair of unidirectional forwarding paths established between PEs is called a virtual link (pseudowire). Figure 10 shows a three-center L2 interconnection solution based on MPLS VPLS.

Figure 10. Three-center L2 interconnection solution based on MPLS VPLS

In traditional network design, L2 networks are deployed only within a campus or a data center and do not extend very far, and their physical links are stable copper or fiber cables. Extending an L2 network to multiple data centers exceeds what L2 protocols were designed for in terms of scalability and tolerance of link quality. L3 networks (MPLS or IP), by contrast, are well suited to long-distance, large-scale WAN deployment: compared with L2 networks they are more stable and have many fast-convergence technologies (such as BFD) to ensure high availability. An L3 network can therefore provide virtual links of stable quality for L2 extension, and any problem at the physical layer is transparent to the L2 network carried on top of the L3 technology.

3.2.2 Loop avoidance in the VLL and VPLS schemes

A point-to-point VLL creates no loop between the PEs, and multipoint VPLS avoids loops between PEs through split horizon: a frame received from one pseudowire is never forwarded onto another pseudowire (sketched at the end of this section). Following the earlier recommendation (section 2.1, key points of STP design), each data center is an independent STP domain, so there is no loop inside a data center either. However, because the data center aggregation switches (CEs) are dual-homed to the PEs, a loop appears in the overall multi-center L2 network, shown as the red dotted line in Figure 11. When planning a multi-center L2 extension, the loop must therefore be considered from an end-to-end perspective.

Figure 11. Loop problem of multi-center L2 interconnection

One general idea is to extend the STP domain across the entire L2 network to break the multi-center loop. However, this makes the fault domain far too large and hard to maintain and manage; a change anywhere ripples through everything. The STP-extension approach is therefore not viable. The alternative is to use switch virtualization to turn each pair of PE devices and each pair of CE devices into one logical device, and to use dual-link bundling between PE and CE, achieving end-to-end loop avoidance as shown in Figure 12.

Figure 12. Avoiding the end-to-end loop through switch virtualization of PE and CE devices
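To illustrate the split-horizon rule mentioned at the start of section 3.2.2, the sketch below models a PE's VPLS forwarding decision: frames arriving from an attachment circuit (CE side) may be flooded toward the pseudowires, but frames arriving from a pseudowire are never flooded onto another pseudowire. This is a schematic model in Python, not a VPLS implementation; the port names and data structures are invented for the illustration.

```python
# Ports of one PE in a VPLS instance: attachment circuits face local CEs,
# pseudowires face remote PEs.
ATTACHMENT_CIRCUITS = {"ac-to-ce1"}
PSEUDOWIRES = {"pw-to-pe2", "pw-to-pe3"}

mac_table = {}   # learned MAC address -> port

def forward(frame_src_mac, frame_dst_mac, in_port):
    """Return the ports a frame is sent to, applying MAC learning and the
    VPLS split-horizon rule for unknown-unicast/broadcast flooding."""
    mac_table[frame_src_mac] = in_port          # MAC learning
    if frame_dst_mac in mac_table:
        return {mac_table[frame_dst_mac]} - {in_port}
    all_ports = ATTACHMENT_CIRCUITS | PSEUDOWIRES
    if in_port in PSEUDOWIRES:
        # Split horizon: never forward a frame from one pseudowire to another.
        return ATTACHMENT_CIRCUITS - {in_port}
    return all_ports - {in_port}

# A broadcast from the local CE is flooded onto both pseudowires...
print(sorted(forward("aa:aa", "ff:ff", "ac-to-ce1")))   # ['pw-to-pe2', 'pw-to-pe3']
# ...but a broadcast arriving from a remote PE only reaches the local
# attachment circuits, so it cannot circulate between PEs.
print(sorted(forward("bb:bb", "ff:ff", "pw-to-pe2")))   # ['ac-to-ce1']
```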
3.3 VLL over GRE or VPLS over GRE schemes for an IP network

When the data centers are connected only by an IP network, VLL over GRE or VPLS over GRE can be used to achieve L2 interconnection. Both the VLL and VPLS solutions require a tunnel between the local PE and the remote PE that can carry multiple virtual links (pseudowires), and this tunnel can be either an MPLS tunnel or a GRE tunnel. With an MPLS tunnel, the outer label of the packet is an MPLS label; with a GRE tunnel, the outer encapsulation is IP + GRE. In terms of network topology, L2 loop avoidance, and the other design considerations, the VLL over GRE and VPLS over GRE schemes are therefore the same as VLL and VPLS over MPLS tunnels.

3.4 Comparison of the three solutions

In terms of packet forwarding performance and failure convergence time, all three solutions above can achieve L2 interconnection between data centers and solve the end-to-end loop avoidance problem.

The hub-spoke star topology based on an optical transport network and switch virtualization is simple to configure and manage and is highly scalable. If the interconnected data centers lie within a metropolitan area and sufficient fiber resources are available, this solution is recommended. It offers high link utilization (traffic is shared across the bundled interconnection links) and fast failure convergence (under one second), and it supports both point-to-point and point-to-multipoint interconnection.

When no dedicated fiber is available between the data centers, or the distance is too long (more than 100 km) and deploying fiber would be too expensive, a VLL solution based on an MPLS or IP GRE tunnel can be chosen for point-to-point L2 interconnection, or a VPLS solution based on an MPLS or IP GRE tunnel for point-to-multipoint L2 interconnection. The differences between the solutions are summarized in Table 1.

Interconnection link | Cost | Extension distance | Transmission efficiency | Maintainability | Applicable scenarios
Bare fiber / DWDM    | High | Less than 100 km   | High                    | Easy            | Point-to-point, point-to-multipoint
MPLS VLL             |      | Unlimited          |                         | Less easy       | Point-to-point
MPLS VPLS            |      | Unlimited          |                         | Less easy       | Point-to-multipoint
VLL over GRE         | Low  | Unlimited          | Low                     | Relatively easy | Point-to-point
VPLS over GRE        | Low  | Unlimited          | Low                     | Relatively easy | Point-to-multipoint

Table 1. Comprehensive comparison of the three solutions
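The "transmission efficiency" column reflects, in part, per-frame encapsulation overhead. The sketch below compares rough overhead figures for the two tunnel types under simplifying assumptions (an IPv4 outer header without options, a basic 4-byte GRE header, a 4-byte virtual-circuit label in both cases, two 4-byte MPLS labels in the MPLS case, and the outer link-layer header ignored); actual overhead depends on the platform and options used, so treat the numbers as illustrative only.

```python
def payload_efficiency(frame_bytes, overhead_bytes):
    """Fraction of each transmitted unit that is the original Ethernet frame."""
    return frame_bytes / (frame_bytes + overhead_bytes)

MPLS_OVERHEAD = 4 + 4          # tunnel label + virtual-circuit label
GRE_OVERHEAD = 20 + 4 + 4      # outer IPv4 + basic GRE + virtual-circuit label

for name, overhead in [("MPLS tunnel", MPLS_OVERHEAD), ("GRE tunnel", GRE_OVERHEAD)]:
    for frame in (64, 1500):   # small and full-size Ethernet frames
        eff = payload_efficiency(frame, overhead)
        print(f"{name}: {frame}-byte frame -> {eff:.1%} efficiency")
```

Small frames suffer proportionally more from the extra headers, which is one reason the GRE-based schemes are rated lower on transmission efficiency than native MPLS or bare fiber.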
4. Conclusion

Data centers usually adopt L2 server access designs to achieve flexible scalability. With the growing demand for business continuity and flexible scheduling of computing resources, enterprises will inevitably face the problem of extending L2 networks across multiple data centers. This article has presented three implementation schemes, all offering stable topologies, high availability, and scalability, but each with its own deployment trade-offs. There is no single best solution, only the most suitable one. I hope the discussion and analysis in this article give readers some help and inspiration in selecting the most suitable technical approach to multi-center interconnection.

From the H3C official website.