Virtio: An I/O virtualization framework for Linux

From: http://www.ibm.com/developerworks/cn/linux/l-virtio/

In a nutshell, virtio is an abstraction layer over devices in a paravirtualized hypervisor. virtio was developed by Rusty Russell in support of his own virtualization solution, lguest. This article begins with an introduction to paravirtualization and device emulation, and then explores the details of virtio. The focus is on the virtio framework as of the 2.6.30 kernel release.

Linux is the hypervisor playground. As my article on the anatomy of Linux hypervisors showed, Linux offers a variety of hypervisor solutions, each with its own features and advantages. These include the Kernel-based Virtual Machine (KVM), lguest, and User-mode Linux. Supporting these different hypervisor solutions can impose a burden on the operating system, depending on the needs of each solution. One of the costs is device virtualization. Rather than providing a separate device emulation mechanism for each (for networks, block devices, and other drivers), virtio provides a common front end for these device emulations, standardizing the interface and increasing the cross-platform reuse of code.

Full virtualization versus paravirtualization

Let's take a quick look at two distinct virtualization schemes: full virtualization and paravirtualization. In full virtualization, the guest operating system runs on top of a hypervisor that sits on the physical machine. The guest is unaware that it is being virtualized and requires no changes to work in this configuration. Conversely, in paravirtualization, the guest operating system is not only aware that it is running on a hypervisor but also includes code to make guest-to-hypervisor transitions more efficient (see Figure 1).

In the full virtualization scheme, the hypervisor must emulate the device hardware, which means emulating at the lowest level of the conversation (for example, to a network driver). Although the emulation is clean at this level of abstraction, it is also the most inefficient and highly complicated. In the paravirtualization scheme, the guest operating system and the hypervisor can work cooperatively to make the emulation efficient. The downside of the paravirtualized approach is that the operating system is aware it is being virtualized and requires modifications to work.

Figure 1. Device emulation in full virtualization and paravirtualization environments

Hardware continues to change along with virtualization technology. New processors incorporate advanced instructions that make guest-to-hypervisor transitions more efficient. Hardware is also changing for input/output (I/O) virtualization (see Resources to learn about Peripheral Component Interconnect [PCI] passthrough and single- and multi-root I/O virtualization).

Alternatives to virtio

virtio is not the only contender in this space. Xen provides paravirtualized device drivers, and VMware provides Guest Tools.

In a traditional, fully virtualized environment, the hypervisor must trap device requests and then emulate the behavior of the physical hardware. Although doing so provides a great deal of flexibility (namely, running an unmodified operating system), it is inefficient (see the left side of Figure 1). The right side of Figure 1 shows the paravirtualization case. Here, the guest operating system is aware that it is running on a hypervisor and includes drivers that act as the front end. The hypervisor implements the back-end drivers for the particular device emulations. These front-end and back-end drivers are where virtio comes in: it provides a standardized interface for the development of emulated device access, increasing the cross-platform reuse of code and improving efficiency.


An abstraction for Linux

As you can see from the previous section, virtio is an abstraction for a set of common emulated devices in a paravirtualized hypervisor. This design allows the hypervisor to export a common set of emulated devices and make them available through a common application programming interface (API). Figure 2 illustrates why this is important. With a paravirtualized hypervisor, the guest operating systems implement a common set of interfaces, with the particular device emulation behind a set of back-end drivers. The back-end drivers need not be common, as long as they implement the behaviors required by the front end.

Figure 2. Driver abstractions with virtio

Note that, in reality (though not strictly required), the device emulation occurs in user space with QEMU, so the back-end drivers communicate into the user space of the hypervisor to facilitate I/O through QEMU. QEMU is a system emulator that provides not only a guest operating system virtualization platform but emulation of an entire system (PCI host controller, disk, network, video hardware, USB controller, and other hardware elements).

The virtio API relies on a simple buffer abstraction to encapsulate the commands and data needed by the guest operating system. Let's look at the internals of the virtio API and its components.


Virtio architecture

In addition to the front-end drivers (implemented in the guest operating system) and the back-end drivers (implemented in the hypervisor), virtio defines two layers to support guest-to-hypervisor communication. At the top level (called virtio) is the virtual queue interface, which conceptually attaches front-end drivers to back-end drivers. Drivers can use zero or more queues, depending on their need. For example, the virtio network driver uses two virtual queues (one for receive and one for transmit), while the virtio block driver uses only one. Virtual queues are actually implemented as rings to traverse the guest-to-hypervisor transition, but this could be implemented in any way, as long as both the guest operating system and the hypervisor implement it the same way.

Figure 3. High-level architecture of the virtio framework

As shown in Figure 3, five front-end drivers are listed: for block devices (such as disks), network devices, PCI emulation, a balloon driver (for dynamically managing guest memory usage), and a console driver. Each front-end driver has a corresponding back-end driver in the hypervisor.

Conceptual hierarchy

From the perspective of the guest operating system, an object hierarchy is defined as shown in Figure 4. At the top is virtio_driver, which represents the front-end driver in the guest. Devices that match this driver are encapsulated by virtio_device (a representation of the device in the guest). This refers to the virtio_config_ops structure (which defines the operations for configuring the virtio device). virtio_device is referred to by virtqueue (which includes a reference to the virtio_device it serves). Finally, each virtqueue object references the virtqueue_ops object, which defines the underlying queue operations for dealing with the hypervisor driver. Although the queue operations are the core of the virtio API, I first provide a brief discussion of device discovery and then explore the virtqueue_ops operations in more detail.

Figure 4. Object hierarchy of the virtio front end

The process begins with the creation of a virtio_driver and its registration via register_virtio_driver. The virtio_driver structure defines the upper-level device driver, the list of device IDs that the driver supports, a features table (dependent on the device type), and a list of callback functions. When the hypervisor identifies the presence of a new device that matches a device ID in the device list, the probe function (provided in the virtio_driver object) is called and is passed the virtio_device object. This object is cached along with the management data for the device (in a driver-dependent way). Depending on the driver type, the virtio_config_ops functions may then be invoked to get or set options specific to the device, for example, getting the read/write status of the disk or setting the block size of the block device for a virtio_blk device.
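To make this discovery flow concrete, here is a minimal sketch of what front-end driver registration could look like, modeled on the 2.6.30-era interface discussed in this article. The driver name, device ID entry, and the probe and remove callbacks are illustrative placeholders, not an actual in-tree driver.

/* A minimal sketch of virtio front-end driver registration (2.6.30-era
 * interface); all names here are placeholders for illustration. */
#include <linux/module.h>
#include <linux/virtio.h>
#include <linux/virtio_config.h>	/* VIRTIO_DEV_ANY_ID */
#include <linux/virtio_blk.h>		/* VIRTIO_ID_BLOCK, used here as an example ID */

static struct virtio_device_id example_id_table[] = {
	{ VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID },	/* claim virtio block devices */
	{ 0 },
};

static int example_probe(struct virtio_device *vdev)
{
	/* Allocate per-device state, read configuration through
	 * vdev->config, and locate the virtqueues (see find_vq below). */
	return 0;
}

static void example_remove(struct virtio_device *vdev)
{
	/* Tear down the virtqueues and free per-device state. */
}

static struct virtio_driver example_driver = {
	.driver.name	= "example_virtio",
	.driver.owner	= THIS_MODULE,
	.id_table	= example_id_table,
	.probe		= example_probe,
	.remove		= example_remove,
};

static int __init example_init(void)
{
	return register_virtio_driver(&example_driver);
}

static void __exit example_exit(void)
{
	unregister_virtio_driver(&example_driver);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");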

Note that virtio_device holds no reference to its virtqueues (though the virtqueues do reference their virtio_device). To identify the virtqueues associated with a given virtio_device, you use the virtio_config_ops object with the find_vq function, which returns the virtual queue associated with that virtio_device instance. The find_vq function also permits the specification of a callback function for the virtqueue (see the virtqueue structure in Figure 4).
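Continuing the sketch, the probe routine could look up its virtqueue through the configuration operations roughly as follows (later kernels replace find_vq with a find_vqs operation). The queue index and the example_done callback are assumptions for illustration; the callback itself is sketched later in the Core API section.

/* Hedged sketch: locating a virtqueue from probe() via the 2.6.30-era
 * find_vq configuration operation. */
#include <linux/err.h>
#include <linux/virtio.h>
#include <linux/virtio_config.h>

static void example_done(struct virtqueue *vq);	/* completion callback (see Core API) */

static int example_probe(struct virtio_device *vdev)
{
	struct virtqueue *vq;

	/* Ask the transport for queue 0 of this device and register
	 * example_done as the callback to run when buffers complete. */
	vq = vdev->config->find_vq(vdev, 0, example_done);
	if (IS_ERR(vq))
		return PTR_ERR(vq);

	vdev->priv = vq;	/* stash the queue for later use by the driver */
	return 0;
}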

virtqueue is a simple structure that identifies an optional callback function (called when the hypervisor consumes the buffers), a reference to the virtio_device, a reference to the virtqueue operations, and a special priv reference to the underlying implementation to use. Although the callback is optional, callbacks can be enabled or disabled dynamically.

The core of the hierarchy is virtqueue_ops, which defines how commands and data are moved between the guest operating system and the hypervisor. Let's first explore the object that is added to or removed from the virtqueue.
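For reference, the two structures looked approximately like this in kernels of that era (paraphrased from include/linux/virtio.h, with comments added; the exact field layout varies between releases, and later kernels fold the operations into functions such as virtqueue_add_buf).

/* Approximate shape of the virtqueue and its operations (2.6.30 era). */
struct virtqueue {
	void (*callback)(struct virtqueue *vq);	/* invoked when the hypervisor consumes buffers */
	struct virtio_device *vdev;		/* the device this queue serves */
	struct virtqueue_ops *vq_ops;		/* the queue operations below */
	void *priv;				/* transport-private data */
};

struct virtqueue_ops {
	/* Enqueue a scatter-gather request: 'out_num' buffers are readable by
	 * the hypervisor, 'in_num' buffers are written by it; 'data' is an
	 * opaque token later returned by get_buf. */
	int (*add_buf)(struct virtqueue *vq, struct scatterlist sg[],
		       unsigned int out_num, unsigned int in_num, void *data);
	void (*kick)(struct virtqueue *vq);	/* notify the hypervisor of new buffers */
	void *(*get_buf)(struct virtqueue *vq, unsigned int *len);	/* reap a completed request */
	void (*disable_cb)(struct virtqueue *vq);	/* suppress callbacks */
	bool (*enable_cb)(struct virtqueue *vq);	/* re-enable callbacks */
};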

Virtio buffers

Guest operating system (front-end) drivers communicate with the hypervisor through buffers. For an I/O request, the guest provides one or more buffers representing the request. For example, you could provide three buffers, the first representing a read request and the following two representing the response data. Internally, this configuration is represented as a scatter-gather list, with each entry in the list representing an address and a length.
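As a hedged illustration of that three-buffer example, a driver might describe such a request with the kernel's scatterlist helpers roughly as follows. The request-header and status layouts here are invented for the example; they are not the actual virtio_blk definitions.

/* Sketch: describing a read-style request as a three-entry scatter-gather
 * list (header out, data in, status in). Structures are illustrative. */
#include <linux/types.h>
#include <linux/scatterlist.h>

struct example_req_hdr {
	u32 type;	/* e.g. read vs. write */
	u64 sector;	/* starting sector of the request */
};

static void example_build_request(struct scatterlist sg[3],
				  struct example_req_hdr *hdr,
				  void *data, unsigned int data_len,
				  u8 *status)
{
	sg_init_table(sg, 3);
	sg_set_buf(&sg[0], hdr, sizeof(*hdr));		/* out: request header */
	sg_set_buf(&sg[1], data, data_len);		/* in: data filled by the host */
	sg_set_buf(&sg[2], status, sizeof(*status));	/* in: completion status byte */
}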

Core API

The guest driver and hypervisor driver are linked through the virtio_device and, most commonly, through virtqueues. The virtqueue supports its own API, consisting of five functions. You use the first function, add_buf, to provide a request to the hypervisor. As noted earlier, this request takes the form of a scatter-gather list. For add_buf, the guest provides the virtqueue to which the request is to be enqueued, the scatter-gather list (an array of addresses and lengths), the number of buffers that serve as out entries (destined for the underlying hypervisor), and the number of in entries (in which the hypervisor stores data to return to the guest). Once a request has been made through add_buf, the guest can notify the hypervisor of the new request with the kick function. For best performance, the guest should load as many buffers as possible onto the virtqueue before notifying through kick.
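Putting these pieces together, submitting the request sketched above through the 2.6.30-era vq_ops interface might look roughly like this (later kernels expose virtqueue_add_buf and virtqueue_kick functions instead). The req argument is an illustrative per-request cookie that get_buf hands back on completion.

/* Sketch: enqueue one request (1 out buffer, 2 in buffers) and kick. */
static int example_submit(struct virtqueue *vq, struct scatterlist sg[3],
			  void *req)
{
	int err;

	err = vq->vq_ops->add_buf(vq, sg, 1, 2, req);
	if (err < 0)
		return err;	/* for example, the ring is full */

	/* For best performance, batch several add_buf calls before one kick. */
	vq->vq_ops->kick(vq);
	return 0;
}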

Responses from the hypervisor are retrieved with the get_buf function. The guest can poll simply by calling this function, or it can wait for notification through the callback function provided for the virtqueue. When the guest learns that buffers are available, the call to get_buf returns the completed buffers.

The final two functions in the virtqueue API are enable_cb and disable_cb. You use these functions to enable and disable the callback process (that is, the callback function initialized for the virtqueue through find_vq). Note that the callback function and the hypervisor are in separate address spaces, so the call occurs through an indirect hypervisor call (such as kvm_hypercall).
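The example_done callback registered through find_vq earlier could then drain completions roughly as follows, again as a sketch against the 2.6.30-era interface.

/* Sketch: reap completed requests from the virtqueue callback. */
static void example_done(struct virtqueue *vq)
{
	void *req;
	unsigned int len;

	/* Suppress further notifications while draining the queue. */
	vq->vq_ops->disable_cb(vq);

	do {
		while ((req = vq->vq_ops->get_buf(vq, &len)) != NULL) {
			/* 'req' is the cookie passed to add_buf; 'len' is the
			 * number of bytes the host wrote. Complete it here. */
		}
		/* enable_cb returns false if buffers arrived while callbacks
		 * were disabled, so drain again in that case. */
	} while (!vq->vq_ops->enable_cb(vq));
}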

The format, order, and contents of the buffers are meaningful only to the front-end and back-end drivers. The internal transport (rings in the current implementation) moves only buffers and has no knowledge of their internal representation.


Example virtio drivers

You can find the source for the various front-end drivers in the ./drivers subdirectory of the Linux kernel. The virtio network driver is in ./drivers/net/virtio_net.c, and the virtio block driver is in ./drivers/block/virtio_blk.c. The subdirectory ./drivers/virtio provides the implementation of the virtio interfaces (virtio device, driver, virtqueue, and ring). virtio has also been used in High-Performance Computing (HPC) research to develop inter-virtual machine (VM) communication through shared-memory passing. Specifically, this was implemented through a virtualized PCI interface using the virtio PCI driver. You can learn more about this work in the Resources section.

You can exercise this paravirtualization infrastructure in the Linux kernel today. All you need is a kernel to act as the hypervisor, a guest kernel, and QEMU for device emulation. You can use either KVM (a module in the host kernel) or Rusty Russell's lguest (a modified Linux guest kernel). Both virtualization solutions support virtio (along with QEMU for system emulation and libvirt for virtualization management).

The result of Rusty's work is a cleaner code base for paravirtualized drivers and faster emulation of virtual devices. But even more important, virtio has been found to provide better performance than existing commercial solutions (network I/O can be two to three times faster). That performance boost comes at a cost, but it is well worth it if Linux is your hypervisor and guest.


Conclusion

You may never develop front-end or back-end drivers for virtio, but it implements an interesting architecture and is worth exploring in detail. virtio creates new opportunities for efficiency in paravirtualized I/O environments while building on previous work from Xen. Linux continues to prove itself as a production hypervisor and as a research platform for new virtualization technologies. virtio is yet another example of the strengths and openness of Linux as a hypervisor.
