KVM Introduction (3): I/O full virtualization and para-virtualization [KVM I/O QEMU full-virtualization para-virtualization]

Source: Internet
Author: User

Learn about KVM in a series of articles:

    • (1) Introduction and Installation
    • (2) CPU and memory virtualization
    • (3) I/O QEMU full virtualization and para-virtualization
    • (4) I/O PCI/PCIe device direct assignment and SR-IOV
    • (5) Libvirt introduction
    • (6) Nova manages QEMU/KVM virtual machines via Libvirt
    • (7) Snapshots
    • (8) Migration

In QEMU/KVM, the devices that a guest can use fall broadly into three categories:

    1. Emulated devices: devices that QEMU emulates entirely in software.
    2. Virtio devices: para-virtualized devices that implement the virtio API.
    3. PCI device assignment: PCI devices assigned directly to the guest.

1. Fully virtualized I/O devices

In KVM I/O virtualization, the traditional, default approach is to have QEMU emulate I/O devices purely in software, including keyboards, mice, monitors, hard disks and network cards. An emulated device may be backed by a physical device or implemented purely in software; either way, the device the guest sees exists only in software.

1.1 Principle
    1. The device driver in the guest issues an I/O request.
    2. The I/O trapping code in the KVM module intercepts the request.
    3. After processing it, KVM places the information about the I/O request on the I/O shared page and notifies the QEMU process in user space.
    4. QEMU reads the details of the I/O operation and hands it to its hardware emulation code, which emulates the operation.
    5. When the operation completes, QEMU puts the result back on the I/O shared page and notifies the I/O trapping code in the KVM module.
    6. The trapping code in the KVM module reads the result from the I/O shared page and returns it to the guest.
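
The six steps above can be sketched as a toy model. This is only an illustration under stated assumptions: the class names (SharedIOPage, ToyQemu, ToyKvm) and the register-style device are hypothetical and are not KVM or QEMU APIs. The sketch shows that every guest access makes a full round trip through the trapping code and the user-space emulator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SharedIOPage:
    """Toy stand-in for the I/O shared page between KVM and QEMU."""
    request: Optional[dict] = None
    result: Optional[int] = None


class ToyQemu:
    """Hypothetical user-space device model: a device made of port registers."""

    def __init__(self):
        self.registers = {}

    def emulate(self, page):
        req = page.request                      # step 4: read the request details
        if req["op"] == "write":
            self.registers[req["port"]] = req["value"]
            page.result = 0
        else:
            page.result = self.registers.get(req["port"], 0xFF)
        # step 5: the result is now on the shared page


class ToyKvm:
    """Hypothetical kernel side: traps guest I/O and relays it via the shared page."""

    def __init__(self, qemu):
        self.qemu = qemu
        self.page = SharedIOPage()
        self.traps = 0

    def guest_io(self, op, port, value=None):
        self.traps += 1                         # step 2: intercept the request
        self.page.request = {"op": op, "port": port, "value": value}  # step 3
        self.qemu.emulate(self.page)            # steps 4 and 5 run in QEMU
        result = self.page.result               # step 6: read the result back
        self.page = SharedIOPage()              # clear the page for the next request
        return result


kvm = ToyKvm(ToyQemu())
kvm.guest_io("write", 0x3F8, 65)                # guest driver writes a byte (step 1)
print(kvm.guest_io("read", 0x3F8), kvm.traps)   # each access costs one trap
```

Even in this toy, two guest accesses cost two traps and two passes through the emulator, which is the overhead the text attributes to full emulation.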
Note: When a guest transfers large blocks of I/O through DMA, QEMU does not put the results on the shared page. Instead, it writes them directly into the guest's memory through a memory mapping and then notifies the KVM module that the guest's DMA operation is complete.

The advantage of this approach is that it can emulate a wide variety of hardware devices. The disadvantage is that each I/O operation takes a long path that involves multiple context switches and multiple data copies, so performance is poor.

1.2 Implementation of the QEMU emulated network card

QEMU emulates I/O devices purely in software, including the commonly used NIC devices. When the guest is started without any network options on the command line, QEMU assigns an rtl8139 virtual network card by default and uses the default user-mode networking, whose functionality is limited because no specific network mode has been configured. Under full virtualization, the network modes that KVM VMs can select include:

    1. Default user mode (user);
    2. Bridge-based mode (bridge);
    3. NAT (Network Address Translation)-based mode.

The qemu-kvm parameters used for each are:

    • -net user[,vlan=n]: use the user-mode network stack, so that no administrator privileges are required to run. This is the default if the -net option is not specified.
    • -net nic[,vlan=n][,macaddr=addr]: create a new NIC and connect it to VLAN n (n=0 by default). Optionally, the MAC address can be changed. If the -net option is not specified, a single NIC is created.
    • -net tap[,vlan=n][,fd=h][,ifname=name][,script=file]: connect the TAP network interface name to VLAN n and configure it with the network configuration script file. The default network configuration script is /etc/qemu-ifup. If name is not specified, the OS assigns one automatically. fd=h can be used to specify a handle to a host TAP interface that is already open.
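
As a minimal sketch of how these sub-options combine into a command line, the helper below joins them in the comma-separated form listed above. The function names (net_option, bridge_nic_args) are hypothetical, and the output follows the legacy -net syntax described in this article; newer QEMU releases are understood to have dropped the vlan sub-option, so check the installed version before reusing the result.

```python
def net_option(kind, **params):
    """Build one '-net kind,key=value,...' argument pair, skipping unset values."""
    parts = [kind] + [f"{key}={value}" for key, value in params.items()
                      if value is not None]
    return ["-net", ",".join(parts)]


def bridge_nic_args(ifname, script="/etc/qemu-ifup", macaddr=None, vlan=0):
    """Arguments for one NIC attached to a host TAP interface (bridge mode)."""
    return (net_option("nic", vlan=vlan, macaddr=macaddr)
            + net_option("tap", vlan=vlan, ifname=ifname, script=script))


# Print the argument list for a guest NIC attached to tap1
print(" ".join(bridge_nic_args("tap1", macaddr="52:54:00:12:34:56")))
```

Returning a list rather than a single string keeps each argument separate, which is the form process-spawning functions such as subprocess.run expect.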

Bridge mode is relatively simple and is also the most widely used mode. The following describes how a VM receives packets in bridge mode.

As shown in the figure (not reproduced here), the red arrows indicate the direction of the data packets. The steps are:

    1. Network data is received by the physical network card on the host and reaches the bridge.
    2. Because eth0 and tap1 have both joined the bridge, br0 forwards the data out of the tap1 port according to layer-2 forwarding rules; that is, the data is received by the TAP device.
    3. The TAP device signals that data is readable on the corresponding file descriptor (fd).
    4. Reading the fd copies the data to user space through the TAP device's character device driver, which completes reception of the packet on the front end.

(Quoted from http://luoye.me/2014/07/17/netdev-virtual-1/)
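
The receive path above can be modelled in a few lines. This is a toy sketch, not the Linux bridge or TAP implementation: ToyBridge, ToyTap, the MAC addresses and the frame dictionaries are all hypothetical. It shows a learning bridge forwarding a frame from eth0 to tap1, after which the frame becomes readable on the TAP side.

```python
from collections import deque


class ToyTap:
    """Toy TAP port: frames delivered to it become readable, like data on its fd."""

    def __init__(self, name):
        self.name = name
        self._queue = deque()

    def deliver(self, frame):
        self._queue.append(frame)       # step 3: data is now readable

    def readable(self):
        return bool(self._queue)

    def read(self):
        # step 4: the reader (QEMU in the real setup) copies the frame out
        return self._queue.popleft() if self._queue else None


class ToyBridge:
    """Toy layer-2 learning bridge."""

    def __init__(self):
        self.ports = {}                 # port name -> ToyTap, or None for a physical NIC
        self.fdb = {}                   # learned MAC address -> port name

    def add_port(self, name, tap=None):
        self.ports[name] = tap

    def receive(self, frame, in_port):
        self.fdb[frame["src"]] = in_port            # learn where the sender lives
        known = self.fdb.get(frame["dst"])
        targets = [known] if known else [p for p in self.ports if p != in_port]
        sent = []
        for name in targets:
            if name == in_port:
                continue                            # never send back out the ingress port
            tap = self.ports[name]
            if tap is not None:
                tap.deliver(frame)                  # step 2: forward out of the tap port
            sent.append(name)
        return sent


br0 = ToyBridge()
tap1 = ToyTap("tap1")
br0.add_port("eth0")
br0.add_port("tap1", tap1)
guest_mac = "52:54:00:12:34:56"
br0.receive({"src": guest_mac, "dst": "ff:ff:ff:ff:ff:ff", "data": b"arp"}, "tap1")
br0.receive({"src": "00:11:22:33:44:55", "dst": guest_mac, "data": b"hello"}, "eth0")
print(tap1.read()["data"])
```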

1.3 Emulated devices available in RedHat Linux 6
    • Emulated graphics cards: 2 emulated graphics cards are available.
    • System components:
      • Intel i440FX host PCI bridge
      • PIIX3 PCI to ISA bridge
      • PS/2 mouse and keyboard
      • EvTouch USB graphics tablet
      • PCI UHCI USB controller and a virtualized USB hub
      • Emulated serial ports
      • EHCI controller, virtualized USB storage and a USB mouse
    • Emulated sound card: intel-hda
    • Emulated network cards: e1000, which emulates the Intel E1000 network card, and rtl8139, which emulates the Realtek 8139 network card.
    • Emulated storage controllers: two emulated PCI IDE interfaces. KVM limits each virtual machine to a maximum of 4 emulated storage devices. An emulated floppy drive is also available.

Note: RedHat Linux KVM does not support SCSI emulation.

Unless other device types are explicitly specified, KVM virtual machines use these default virtual devices. As described above, a KVM virtual machine uses the rtl8139 NIC by default. For example, after starting a RedHat Linux 6.4 KVM virtual machine on a RedHat Linux 6.5 host, log in to the virtual machine and list the PCI devices to see these emulated devices. When "-net nic,model=e1000" is used, the NIC model is e1000 instead (the device listings are not reproduced here).

1.4 Key qemu-kvm options for disk devices and networks
Disk devices (floppy disk, hard disk, CD-ROM, etc.):

    • -drive option[,option[,option[,...]]]: defines a disk device; many sub-options are available.
      • file=/path/to/somefile: the path of the disk image file
      • if=interface: the type of interface the disk is attached to, that is, the controller type, such as ide, scsi, sd, mtd, floppy, pflash and virtio
      • index=index: the index number of the device among the devices on the same controller type
      • media=media: whether the media type is a hard disk (disk) or an optical disc (cdrom)
      • format=format: the format of the image file; see the qemu-img command for the available formats
    • -boot [order=drives][,once=drives][,menu=on|off]: defines the boot order of the boot devices. Each device is represented by one character, and the supported devices and their letters differ between architectures. On the x86 PC architecture, a and b stand for floppy drives, c for the first hard disk, d for the first CD-ROM drive, and n to p for network adapters. The default is to boot from the hard disk. Example: -boot order=dc,once=d

Network:

    • -net nic[,vlan=n][,macaddr=mac][,model=type][,name=name][,addr=addr][,vectors=v]: creates a new NIC and connects it to VLAN n. The default NIC on the PC architecture is e1000. macaddr sets its MAC address, and name sets the device name shown in the monitor. QEMU can emulate several types of NIC; use "qemu-kvm -net nic,model=?" to list the types supported on the current platform.
    • -net tap[,vlan=n][,name=name][,fd=h][,ifname=name][,script=file][,downscript=dfile]: connects to VLAN n through a TAP network interface on the host. The script given by script=file (default /etc/qemu-ifup) configures the interface, and the script given by downscript=dfile (default /etc/qemu-ifdown) removes the configuration. script=no and downscript=no disable the respective scripts.
    • -net user[,option][,option][,...]: configures a user-mode network stack, which does not depend on administrator privileges. Valid options include:
      • vlan=n: connect to VLAN n (default n=0)
      • name=name: the display name of the interface, commonly used in the monitor
      • net=addr[/mask]: the IP network visible to the guest OS; the mask is optional, and the default is 10.0.2.0/8
      • host=addr: the IP address of the host as seen from the guest OS; defaults to the second address in the network, x.x.x.2
      • dhcpstart=addr: the first of the 16 addresses in the DHCP pool; by default the pool is the 16th to the 31st address, x.x.x.16 to x.x.x.31
      • dns=addr: the DNS server address visible to the guest OS; defaults to the third address in the guest network, x.x.x.3
      • tftp=dir: enables the built-in TFTP server with dir as its default root directory
      • bootfile=file: the BOOTP file name, used to boot the guest OS over the network. Example: qemu -hda linux.img -boot n -net user,tftp=/tftpserver/pub,bootfile=/pxelinux.0

For network cards, the model parameter specifies the type of virtual NIC. The virtual NIC types supported by RedHat Linux 6 are:

# kvm -net nic,model=?
qemu: Supported NIC models: ne2k_pci,i82551,i82557b,i82559er,rtl8139,e1000,pcnet,virtio
2. Para-virtualized I/O driver virtio

Para-virtualized drivers can be used in KVM to improve guest I/O performance. KVM currently uses virtio, a standard device driver framework on Linux that provides an I/O framework for interaction between the host and the guest.

2.1 Architecture of virtio

The virtio implementation in KVM/QEMU installs a front-end driver in the guest OS kernel and implements a back-end driver in QEMU. The front-end and back-end drivers communicate directly through the vring, bypassing processing in the KVM kernel module and thereby improving I/O performance.

The difference between a purely software-emulated device and a virtio device: virtio eliminates the exception-trapping step of pure emulation, so the guest OS can communicate directly with QEMU's I/O module.

The complete VM I/O flow when using virtio, for data travelling from the host to the guest:

    1. KVM notifies QEMU by means of an interrupt to fetch the data and put it into the virtio queue.
    2. KVM then notifies the guest to fetch the data from the virtio queue.

2.2 Implementation of virtio in Linux

virtio is an abstraction of a set of common emulated devices in a para-virtualized hypervisor. This design allows the hypervisor to export a common set of emulated devices through an application programming interface (API). With a para-virtualized hypervisor, the guest implements a common set of interfaces, and the actual device emulation sits behind a set of back-end drivers. The back-end drivers do not have to be generic, as long as they implement the behavior required by the front end. virtio is therefore an abstract API layer on top of the hypervisor: the guest knows that it is running in a virtualized environment and cooperates with the hypervisor according to the virtio standard, which lets it achieve better performance.
    • Front-end drivers: driver modules installed in the guest.
    • Back-end drivers: implemented in QEMU; they either invoke physical devices on the host or are implemented entirely in software.
    • virtio layer: a virtual queue interface that conceptually connects the front-end drivers to the back-end drivers. A driver can use as many queues as it needs. For example, virtio-net uses two queues, while virtio-blk uses only one. The queues are virtual and are actually implemented with virtio-ring.
    • virtio-ring: the ring buffer that implements the virtual queues.
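
To make the queue idea concrete, here is a deliberately simplified sketch of one virtual queue shared by a front end and a back end. It assumes a small fixed descriptor table with "available" and "used" lists; the class ToyVirtqueue and its method names are hypothetical and do not reproduce the real vring memory layout or the kernel API.

```python
from collections import deque


class ToyVirtqueue:
    """Toy virtual queue: descriptor table plus 'available' and 'used' lists."""

    def __init__(self, size=4):
        self.desc = [None] * size           # descriptor table holding the buffers
        self.free = deque(range(size))      # descriptor slots not in use
        self.avail = deque()                # slots offered by the front end
        self.used = deque()                 # (slot, result) pairs returned by the back end

    # Front-end (guest driver) side
    def add_buf(self, data):
        if not self.free:
            raise BufferError("virtqueue full")
        slot = self.free.popleft()
        self.desc[slot] = data
        self.avail.append(slot)             # expose the buffer to the back end
        return slot

    def get_buf(self):
        if not self.used:
            return None
        slot, result = self.used.popleft()
        self.desc[slot] = None
        self.free.append(slot)              # the slot can be reused
        return result

    # Back-end (device model) side
    def process(self, handler):
        handled = 0
        while self.avail:
            slot = self.avail.popleft()
            self.used.append((slot, handler(self.desc[slot])))
            handled += 1
        return handled


vq = ToyVirtqueue(size=2)
vq.add_buf(b"ping")                         # front end queues a request
vq.process(bytes.upper)                     # back end consumes it and posts a result
print(vq.get_buf())                         # front end reaps the completed buffer
```

Several requests can be queued before the back end runs once, which is how a shared ring reduces the number of guest-to-host transitions compared with trapping every access.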
Five front-end drivers are implemented in the Linux kernel:
    • Block devices (such as disks)
    • Network devices
    • PCI devices
    • Balloon driver (dynamically manages the guest's memory usage)
    • Console driver
In the guest OS, these drivers are not loaded when no virtio device is in use; the corresponding driver is loaded only when a virtio device is used. Each front-end driver has a corresponding back-end driver in the hypervisor. The following takes virtio-net as an example to explain the principle:

(1) The principle of virtio-net:

With virtio-net:
    1. Multiple virtual machines share the host NIC eth0.
    2. QEMU uses the standard tun/tap mechanism to bridge each virtual machine's network to the host NIC.
    3. Each virtual machine appears to have a private virtio network device attached directly to the host PCI bus.
    4. The virtio driver must be installed inside the virtual machine.

(2) The process of virtio-net:

To summarize the pros and cons of virtio:
    • Pros: higher I/O performance, almost the same as on the native system.
    • Cons: the guest must have the appropriate virtio driver installed. Some old Linux systems have no driver support, and some Windows versions need a specific driver to be installed. However, drivers for newer, mainstream operating systems are available for download. Linux 2.6.24 and later support virtio by default. Use lsmod | grep virtio to check whether the driver is loaded.
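
The lsmod check can also be scripted. The sketch below is one possible way to do it, assuming a Linux guest where /proc/modules is readable; the function name loaded_virtio_modules is hypothetical. Note that drivers built directly into the kernel do not appear in this list, so an empty result does not by itself mean virtio is unsupported.

```python
def loaded_virtio_modules(proc_modules_text):
    """Return the names of loaded modules whose name contains 'virtio'.

    The input is text in the /proc/modules format, where the module
    name is the first whitespace-separated field of each line.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and "virtio" in fields[0]:
            names.append(fields[0])
    return names


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            print(loaded_virtio_modules(handle.read()))
    except OSError:
        print("cannot read /proc/modules on this system")
```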
2.3 Using a virtio device (virtio-net as an example)

Using a virtio device is fairly simple. Newer versions of Linux already include the virtio drivers, while Windows drivers must be downloaded and installed separately.

(1) Check whether the virtio NIC model is supported on the host

# kvm -net nic,model=?
qemu: Supported NIC models: ne2k_pci,i82551,i82557b,i82559er,rtl8139,e1000,pcnet,virtio

(2) Specify virtio as the NIC model and start the virtual machine

(3) Log in to the virtual machine through vncviewer and check that the kernel modules required by virtio-net are loaded

(4) View the PCI devices

Other virtio types of devices are used in a manner similar to virtio-net.

2.4 vhost-net (kernel-level virtio server)

The virtio back-end handler on the host, mentioned earlier, is normally provided by QEMU in user space. If the back-end processing of network I/O requests is done in kernel space instead, it is more efficient, which raises network throughput and reduces network latency. Newer kernels include a driver module called "vhost-net", a kernel-level back-end handler that moves virtio-net's back-end processing into kernel space and reduces the number of switches between kernel space and user space, thereby improving efficiency.

According to an article on the KVM website, vhost-net provides lower latency (10% lower than the e1000 virtual network card) and higher throughput (8 times that of normal virtio, roughly 7 to 8 gigabits/sec).

Comparison of vhost-net and virtio-net (figure not reproduced here).

Requirements for vhost-net:

    • qemu-kvm 0.13.0 or later
    • CONFIG_VHOST_NET=y set in the host kernel and CONFIG_PCI_MSI=y set in the guest operating system kernel (Red Hat Enterprise Linux 6.1 and later support this feature)
    • The virtio-net front-end driver used in the guest
    • Bridge mode used on the host, with vhost_net started

The -net tap option of the qemu-kvm command has several sub-options related to vhost-net: -net tap[,vnet_hdr=on|off][,vhost=on|off][,vhostfd=h][,vhostforce=on|off]

    • vnet_hdr=on|off: sets whether to enable the "IFF_VNET_HDR" flag on the TAP device. "vnet_hdr=off" disables the flag; "vnet_hdr=on" forces the flag on and raises an error if the flag is not supported. IFF_VNET_HDR is a tun/tap flag; enabling it allows only partial checksum verification when sending or receiving large packets, which can increase the throughput of the virtio_net driver.
    • vhost=on|off: sets whether to enable vhost-net, the kernel-space back-end driver. It is only effective for virtio guests that use MSI-X interrupts.
    • vhostforce=on|off: sets whether to force the use of vhost as the back end for virtio guests that do not use MSI-X interrupts.
    • vhostfd=h: connects to a vhost network device that is already open.

Example of using vhost-net:

(1) Make sure the vhost-net kernel module is loaded on the host

(2) Start a virtual machine, using -net to define a virtio-net NIC for the guest and -netdev to enable vhost on the host side

(3) Check that the TAP device used by the virtual machine's virtio NIC is tap0

(4) On the host, check that vhost-net is loaded and in use, and that the Linux bridge br0 connects the physical NIC eth1 and the TAP device tap0 used by the guest

In general, using vhost-net as the back-end driver improves network performance. However, for some types of network load, using vhost-net as the back end may make performance drop instead of rise. This applies in particular to UDP traffic from the host to the guest, which is prone to performance degradation if the guest processes received data more slowly than the host sends it. In this case, vhost-net makes the UDP socket's receive buffer overflow more quickly, which causes more packet loss. It is therefore better not to use vhost-net in this situation: transmission becomes slightly slower, but overall performance improves.
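
The overflow effect described above can be illustrated with a toy bounded-buffer model. This is a sketch under simplifying assumptions, not a measurement of vhost-net: packets arrive and are drained in fixed amounts per tick, the buffer has a fixed capacity, and the function name simulate_drops and all numbers are hypothetical.

```python
def simulate_drops(send_per_tick, recv_per_tick, buffer_size, ticks):
    """Count packets dropped when a sender outpaces a receiver.

    Each tick the sender adds packets to a bounded buffer; anything that
    does not fit is dropped. The receiver then drains what it can.
    """
    queued = 0
    dropped = 0
    for _ in range(ticks):
        queued += send_per_tick
        if queued > buffer_size:
            dropped += queued - buffer_size     # overflow is lost, as with UDP
            queued = buffer_size
        queued = max(0, queued - recv_per_tick)
    return dropped


# Same receiver and buffer, two sender speeds
print(simulate_drops(10, 8, 20, 10))    # slightly faster sender
print(simulate_drops(16, 8, 20, 10))    # much faster sender
```

In this model a faster sender does not deliver more data to a slow receiver; it only fills the buffer sooner and loses more packets, which matches the reasoning given for turning vhost-net off in this case.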

On the qemu-kvm command line, vhost-net is not used when "vhost=off" is given (or when no vhost option is given). With Libvirt, the network section of the guest's XML configuration file needs to be configured as follows, specifying the name of the back-end driver as "qemu" (instead of "vhost").

<interface type="network">
    ...
    <model type="virtio"/>
    <driver name="qemu"/>
    ...
</interface>
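
For setups that generate guest definitions from a script, the same fragment can be produced programmatically. The sketch below uses Python's standard xml.etree.ElementTree module; the function name interface_xml is hypothetical, and the source element with network "default" is an assumption added only so that the fragment is complete, since the article elides that part.

```python
import xml.etree.ElementTree as ET


def interface_xml(backend="qemu", network="default"):
    """Build a libvirt-style <interface> fragment for a virtio NIC.

    backend selects the back-end driver name: "qemu" (user space)
    or "vhost" (kernel space).
    """
    if backend not in ("qemu", "vhost"):
        raise ValueError("backend must be 'qemu' or 'vhost'")
    iface = ET.Element("interface", type="network")
    ET.SubElement(iface, "source", network=network)   # assumed network name
    ET.SubElement(iface, "model", type="virtio")
    ET.SubElement(iface, "driver", name=backend)
    return ET.tostring(iface, encoding="unicode")


print(interface_xml("qemu"))
```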



2.6 Virtio-balloon

Another rather special virtio device is virtio-balloon. Normally, changing the amount of host memory occupied by a guest means shutting the guest down, modifying its memory configuration, and starting it again. Memory ballooning can adjust the host memory a guest occupies dynamically while the guest is running, without shutting it down. With this technique:

    • When host memory is tight, the guest can be asked to give back part of the memory allocated to it, and the guest releases some of its free memory. If there is not enough free memory, some memory that is in use may be reclaimed and part of it may be swapped out to the swap partition.
    • When guest memory is low, the guest's memory balloon can be deflated to release some of the memory held in the balloon, so that the guest can use more memory.

Many current VMMs, including KVM, Xen and VMware, support ballooning. In KVM, ballooning is implemented through cooperation between the host and the guest. The host should run Linux kernel 2.6.27 or later (including the KVM module) and a reasonably recent qemu-kvm (such as version 0.13 or later). The guest should also run kernel 2.6.27 or later, with "CONFIG_VIRTIO_BALLOON" configured as a module or built into the kernel. Many Linux distributions already ship with "CONFIG_VIRTIO_BALLOON=m", so a reasonably recent Linux guest generally needs no additional virtio_balloon driver configuration and can use the default kernel configuration.

Principle:
    1. KVM sends a request to the VM asking it to return a certain amount of memory to KVM.
    2. The virtio_balloon driver in the VM receives the request.
    3. The driver inflates the guest's memory balloon; memory inside the balloon can no longer be used by the guest.
    4. The VM's operating system returns the memory in the balloon to the VMM.
    5. KVM can allocate the reclaimed memory wherever it is needed.
    6. KVM can also return the memory to the guest.
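
The bookkeeping behind these steps can be sketched as a toy model. It only tracks sizes in MB and is an illustration under stated assumptions: the class ToyBalloonGuest and its methods are hypothetical and do not correspond to the virtio_balloon driver interface.

```python
class ToyBalloonGuest:
    """Toy guest whose usable memory is its total memory minus the balloon."""

    def __init__(self, total_mb):
        self.total_mb = total_mb
        self.balloon_mb = 0

    @property
    def usable_mb(self):
        return self.total_mb - self.balloon_mb

    def set_target(self, target_mb):
        """Host request: leave the guest with target_mb of usable memory.

        Returns the amount handed back to the host in MB. A positive value
        means the balloon inflated; a negative value means it deflated and
        the guest got memory back.
        """
        target_mb = max(0, min(target_mb, self.total_mb))
        new_balloon = self.total_mb - target_mb
        reclaimed = new_balloon - self.balloon_mb
        self.balloon_mb = new_balloon
        return reclaimed


guest = ToyBalloonGuest(2048)
print(guest.set_target(500), guest.usable_mb)    # inflate: host reclaims memory
print(guest.set_target(1024), guest.usable_mb)   # deflate: guest gets memory back
```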

Advantages and Disadvantages:

Advantages:
  1. Ballooning can be controlled and monitored.
  2. Memory adjustment is very flexible; memory can be increased or decreased as needed.
  3. KVM can return memory to the guest, easing its memory pressure.

Disadvantages:
  1. A driver must be installed in the guest.
  2. When a large amount of memory is reclaimed, guest performance drops.
  3. There is no convenient automated mechanism for managing ballooning; it is typically done in the QEMU monitor.
  4. Dynamically growing or shrinking memory may fragment memory excessively, reducing memory performance.

The QEMU monitor provides two commands to view and set the guest's memory size:

    • (qemu) info balloon    # view the guest's memory usage (balloon information)
    • (qemu) balloon num     # set the guest's memory usage to num MB

Usage example:

(1) Start a virtual machine with 2048 MB of memory and virtio-balloon enabled

(2) Log in to the virtual machine through vncviewer and view the PCI devices

(3) Check the memory: 2 GB in total

(4) Enter the QEMU monitor and adjust the balloon memory to 500 MB

(5) Return to the virtual machine and check the memory again to see the change

2.7 RedHat multi-queue virtio

Current high-end servers have multiple processors, and the number of virtual CPUs used by guests keeps increasing. The default virtio-net cannot transmit or receive network packets in parallel, because virtio_net has only one TX queue and one RX queue. Multi-queue virtio-net lets network performance scale as the number of virtual CPUs grows by allowing virtio to use multiple virtqueues at the same time. Its advantage is most apparent in the following cases:
    1. Network traffic is very large.
    2. The virtual machine has many network connections at the same time, including connections between virtual machines, from the virtual machine to the host, and from the virtual machine to external systems.
    3. The number of virtio queues equals the number of the virtual machine's virtual CPUs, because multi-queue support allows each queue to be dedicated to a single virtual CPU.
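
The benefit of having several queues can be pictured as spreading connections across them so that each virtual CPU services its own queue. The sketch below is only an illustration of that idea: the functions pick_queue and spread are hypothetical, and the CRC32-based hash is an assumption chosen for determinism, not the queue selection actually used by virtio-net.

```python
import zlib


def pick_queue(flow, n_queues):
    """Map a flow tuple to a queue index with a stable hash."""
    key = "|".join(str(part) for part in flow).encode()
    return zlib.crc32(key) % n_queues


def spread(flows, n_queues):
    """Group flows by the queue they map to."""
    per_queue = [[] for _ in range(n_queues)]
    for flow in flows:
        per_queue[pick_queue(flow, n_queues)].append(flow)
    return per_queue


# Hypothetical connections: (source address, source port, destination port)
flows = [("10.0.0.%d" % i, 40000 + i, 80) for i in range(8)]
for index, members in enumerate(spread(flows, 4)):
    print("queue", index, "handles", len(members), "flows")
```

With a single queue every flow lands in queue 0, which corresponds to the default single TX/RX pair described above.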
Note: Multi-queue virtio-net works very well for incoming network traffic, but it occasionally degrades performance for outgoing traffic. Enabling multi-queue virtio increases throughput, which correspondingly increases the load on the CPU. Test it as needed in the actual production environment before deciding whether to use it.

In RedHat, to use multi-queue virtio-net, add the multi-queue configuration to the virtual machine's XML file and then run the command that enables it (the configuration snippet and the command are not reproduced here; the surviving constraint is 1 <= M <= N).
2.8 Virtio front-end drivers for Windows guests

The virtio front-end drivers for Windows guests must be downloaded and installed manually. An article in the RedHat Linux documentation explains how to install the virtio drivers in a Windows guest.

Reference documentation:
    • http://linux.web.cern.ch/linux/centos7/docs/rhel/Red_Hat_Enterprise_Linux-7-Virtualization_Tuning_and_Optimization_guide-en-us.pdf
    • http://toast.djw.org.uk/qemu.html
    • KVM official documentation
    • "KVM Virtualization Technology: Practice and Principle Analysis" by Ren Yongjie and Shan Haitao
    • RedHat Linux 6 official documentation
    • Some documents about KVM on http://www.slideshare.net
    • http://www.linux-kvm.org/page/Multiqueue
    • Parts of the content come from the web, such as http://smilejay.com/2012/11/use-ballooning-in-kvm/.
