Integration of I/O virtualization into storage technology

Virtualization is today's hot topic, but its core problem is itself "virtual": an operating system usually believes it is running alone on the computer, when in fact several operating systems may share a single machine. Each such operating system instance is called a system image (SI). Because mainstream computers integrate memory management hardware, this kind of virtualization is entirely feasible on commodity machines. The technique has not been especially popular, however, because simulating every device in the computer exacts a heavy performance penalty. Recent advances in hardware and software have improved emulation speed, but further improvement is still needed.

I/O virtualization (IOV), in the final analysis, means presenting one device as multiple devices, each of which can serve a distinct system image. Pushing virtualization down to the component level frees the system processor from heavy device emulation and dramatically improves performance. A PCI SIG working group is defining the mechanisms for implementing virtual device interfaces on the PCI Express bus. Once that work is complete, it will provide the standards needed for mainstream IOV designs, ensuring that silicon solutions from different vendors work together across different platforms and operating systems.

Can a chip be virtualized successfully simply by pairing its current PCI Express front-end logic with I/O virtualization technology? The answer is both yes and no. Admittedly, the chip can then be assigned to multiple system images at once, but how to virtualize the back end of the device is the crux of the problem. Consider a storage controller. Clearly, we want to partition the attached storage so that system image X has space allocated specifically to it and system image Y cannot access that space, as in the sketch below.
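Here is a minimal C sketch of that back-end requirement, assuming a hypothetical controller that binds each system image to one virtual function (VF) and tracks one logical-block-address (LBA) range per VF. The names and the flat table are illustrative only, not drawn from any real controller or from the PCI SIG specifications:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical: one storage partition per virtual function (VF),
     * with each system image bound to exactly one VF. */
    typedef struct {
        uint64_t lba_base;   /* first logical block this VF may touch */
        uint64_t lba_count;  /* number of blocks in the partition */
    } vf_partition;

    static vf_partition partitions[] = {
        { .lba_base = 0,        .lba_count = 1u << 20 }, /* VF 0: image X */
        { .lba_base = 1u << 20, .lba_count = 1u << 20 }, /* VF 1: image Y */
    };

    /* Reject any request that falls outside the requesting VF's partition,
     * so image Y can never read or write space assigned to image X. */
    bool request_allowed(unsigned vf, uint64_t lba, uint64_t nblocks)
    {
        if (vf >= sizeof partitions / sizeof partitions[0])
            return false;
        const vf_partition *p = &partitions[vf];
        return lba >= p->lba_base &&
               nblocks <= p->lba_count &&
               lba - p->lba_base <= p->lba_count - nblocks;
    }

The point is simply that enforcement must live in the device's back end: the front-end logic can expose multiple functions, but only the controller itself can refuse image Y's request for image X's blocks.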

PCI Express I/O virtualization

Let's start with a brief discussion of the system view of I/O virtualization. A system image is a real or virtual system composed of CPU, memory, I/O, and so on. Multiple system images can run on one or more sets of real hardware. For example, a hypervisor such as VMware can run Windows XP and Linux simultaneously on a single-core desktop computer, in which case two system images share one CPU, memory, disk drive, and so on. In a blade server, by contrast, one blade may run Windows XP while another runs Linux; the two system images do not share the hardware on the CPU blades, but they may well share the hardware on the I/O blade.

Regardless of the physical allocation, each system image should "see" its own PCI hierarchy. Even if no endpoint devices are shared, say, two Fibre Channel controllers on the I/O blade, one assigned to the Linux blade and the other to the XP blade, some control must still be exercised to virtualize the PCI hierarchy. If endpoint devices are shared, each system image must be restricted so that it can "see" only the portion of the shared endpoint device that belongs to it.

The device must present its physical hardware as multiple virtual devices convincing enough that an outside observer takes them to be completely independent. A virtual device must therefore do three things: occupy its own PCI memory region, carry its own settings in PCI configuration space, and be addressable as a function of a PCI multi-function device. In addition, the device must keep traffic isolated across these virtual "devices" so that data never spills from one virtual device into another.
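A hedged C sketch of those requirements, assuming a hypothetical device that keeps a private copy of the 4 KB PCI Express configuration space and a private BAR window per virtual function; the structure and function names are invented for illustration:

    #include <stdint.h>
    #include <string.h>

    #define NUM_VF         4
    #define CFG_SPACE_SIZE 4096   /* PCI Express configuration space size */

    struct virtual_function {
        uint8_t  cfg[CFG_SPACE_SIZE]; /* private configuration space copy */
        uint64_t bar_base;            /* private memory BAR for this VF */
        uint64_t bar_size;
    };

    static struct virtual_function vfs[NUM_VF];

    /* A config read is served strictly from the addressed VF's own copy. */
    uint32_t cfg_read32(unsigned vf, unsigned offset)
    {
        uint32_t val = 0xFFFFFFFFu;  /* misses return all-ones, as on PCI */
        if (vf < NUM_VF && offset <= CFG_SPACE_SIZE - 4)
            memcpy(&val, &vfs[vf].cfg[offset], sizeof val);
        return val;
    }

The per-VF cfg arrays are what let each outside observer see an apparently independent device, and routing every access through the addressed VF's own copy is one simple way to keep cross-"device" traffic isolated.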

As the example above shows, a system attached to the PCI hierarchy at a single point differs significantly from one attached at multiple points. Traditional single-core desktops, and even traditional multi-core servers, have connected to the PCI hierarchy through a single logical attachment point. Blade systems, by contrast, support a new view of the hierarchy: an upstream enhanced PCI Express switch can connect the entire PCI hierarchy to multiple root complexes. We therefore clearly need new mechanisms to ensure that each root complex accesses only the part of the PCI hierarchy assigned to it.

In view of the significant differences between the two system types, as well as market segmentation and the desire to avoid complexity, the PCI SIG decided to divide the I/O virtualization specification into two parts. Since each root complex in a multi-root system can also take advantage of single-root I/O virtualization, the two specifications are layered rather than independent, creating a concentric pattern: the single-root specification builds on the PCI Express base specification, and the multi-root specification builds on the single-root specification.

Single-root I/O virtualization

Single-root I/O virtualization is aimed primarily at the existing PCI hierarchy, in which single- and multi-core CPU computers attach to the PCI hierarchy through a single point. One goal of the single-root specification is to support continued use of existing root complex chips, but that goal imposes a major limitation. Supporting existing switch chips likewise imposes limitations. Given these requirements, from the bus's point of view there is only a single memory address space, and its partitioning among the virtualized system images is handled above the root complex's attachment point. It is generally expected that some form of address translation logic within or above the root complex will help the virtualization middleware, commonly called a hypervisor, implement the mapping. Users will of course need new I/O virtualization endpoint devices, which face enormous design and support challenges of their own. The aim of leaving the chipset unmodified is that the virtualization market can quickly extend to existing systems and simple derivatives of them, but it places a larger software burden on the virtualization middleware.
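A minimal sketch in C of the kind of remapping that translation logic performs: each system image's device-visible addresses are mapped into the single physical address space through a per-image table. The flat, fixed-size table is an assumption made for brevity; real designs use multi-level page tables:

    #include <stdbool.h>
    #include <stdint.h>

    #define PAGE_SHIFT 12
    #define PAGE_SIZE  (1u << PAGE_SHIFT)
    #define NUM_PAGES  1024   /* size of the window granted to one image */

    /* One translation table per system image (SI); the hypervisor fills
     * page_to_host when it grants memory to the image. */
    struct si_map {
        uint64_t page_to_host[NUM_PAGES]; /* guest page -> host page frame */
    };

    /* Translate a system image's address into the single bus-visible
     * address space; fail if it falls outside the image's window. */
    bool translate(const struct si_map *m, uint64_t guest_addr,
                   uint64_t *host_addr)
    {
        uint64_t page = guest_addr >> PAGE_SHIFT;
        if (page >= NUM_PAGES)
            return false;
        *host_addr = (m->page_to_host[page] << PAGE_SHIFT) |
                     (guest_addr & (PAGE_SIZE - 1));
        return true;
    }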

Multi-root I/O virtualization

Although the PCI Express cabling specification opens up many other possibilities, the most typical implementation of a multi-point connection hierarchy is still a blade server with a PCI Express backplane. This is a new PCIe hierarchy, essentially a miniature fabric. The PCI SIG is targeting a "miniaturized" system, one that typically spans no more than about three feet, with perhaps 16 to 32 root ports at most, although the hierarchy also supports more. Another goal, again, is continued use of existing root complex chips. Unlike the single-root case, however, no virtualization middleware is assumed, so the complexity of partitioning the system shifts to the new enhanced PCI Express switches. Multi-root systems are specifically designed to divide the PCI hierarchy into multiple virtual hierarchies that share the same physical fabric. Whereas a single-root system has a single memory address space partitioned among its system images, a multi-root system provides a full 64-bit memory address space to each virtual hierarchy. Configuration management software works with the enhanced switches and the I/O virtualization devices to program the hierarchy so that each root complex "sees" its own portion of the multi-root hierarchy just as if it were a single-root architecture. Each of these views of the fabric is called a virtual hierarchy. Note that each virtual hierarchy in a multi-root system can be individually enabled or disabled, so endpoint devices in a multi-root system face the challenge of supporting both hierarchy models.
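An illustrative C model, with invented table names, of what the configuration manager might program into an enhanced switch: each downstream endpoint function is assigned to exactly one virtual hierarchy (VH), each root port is bound to one VH, and a root complex can therefore reach only its own slice of the shared fabric:

    #include <stdbool.h>
    #include <stdint.h>

    #define NUM_ROOT_PORTS 16
    #define NUM_ENDPOINTS  8
    #define NUM_VH         16

    /* Programmed by the configuration manager, not by the root complexes;
     * root_port_vh values are kept below NUM_VH by construction. */
    static uint8_t root_port_vh[NUM_ROOT_PORTS]; /* root port -> its VH */
    static uint8_t endpoint_vh[NUM_ENDPOINTS];   /* endpoint fn -> owning VH */
    static bool    vh_enabled[NUM_VH];           /* each VH enabled alone */

    /* A packet from a root port may reach only endpoints in the same,
     * currently enabled VH, so each root complex "sees" just its slice. */
    bool route_allowed(unsigned root_port, unsigned endpoint)
    {
        if (root_port >= NUM_ROOT_PORTS || endpoint >= NUM_ENDPOINTS)
            return false;
        uint8_t vh = root_port_vh[root_port];
        return vh < NUM_VH && vh_enabled[vh] && endpoint_vh[endpoint] == vh;
    }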

For a virtualized device, each system image should see its own copy of the device's configuration space and address map. In effect, a device needs N copies of PCI configuration space to support N virtual functions. The single-root specification defines a lightweight virtual function, whereas the multi-root specification requires a full configuration space for each virtual hierarchy a device serves. It is difficult to describe the exact differences between these types of configuration space here, since that touches on advanced virtualization questions. For this article, the essential point is that each system image interacting with an IOV device has its own device address range and configuration space, so the IOV device can associate an operation with a particular system image based on which address space is accessed.
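To make that last point concrete, here is a small C sketch, again with invented names, of a device resolving which system image an incoming memory access belongs to by testing which virtual function's address window contains it:

    #include <stdint.h>

    #define NUM_VFS 4

    /* One address window per virtual function; each window belongs to
     * exactly one system image. Values come from BAR assignment. */
    struct vf_window {
        uint64_t base;
        uint64_t size;
    };

    static struct vf_window windows[NUM_VFS];

    /* Return the VF (and therefore the system image) that owns the
     * access, or -1 if the address matches no window. */
    int owning_vf(uint64_t addr)
    {
        for (int i = 0; i < NUM_VFS; i++)
            if (addr >= windows[i].base &&
                addr - windows[i].base < windows[i].size)
                return i;
        return -1;
    }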
