How is a virtual machine implemented?


Translated from: http://www.manio.org/cn/virtual-machine-implementation/

In 1997, Stanford's Mendel Rosenblum, together with Edouard Bugnion and Scott Devine, published a paper at SOSP called "Disco: Running Commodity Operating Systems on Scalable Multiprocessors" (http://research.cs.wisc.edu/areas/os/Qual/papers/disco.pdf). After publishing it, I imagine they decided the idea was too good to pass up, so they founded a company called VMware.

The paper is called Disco because the virtual machine itself was not a new idea; it dates back to roughly the 1970s. As a tribute, or to signal that this was a retro idea, the authors named the project Disco. The paper introduces the key techniques behind virtual machines, which makes it well suited to answering this question. (Years later, another paper at OSDI, "Memory Resource Management in VMware ESX Server", described some of VMware's improvements. In recent years there have been more and more papers on the topic.)

Why did they build virtual machines? Simply put, new hardware keeps appearing, but operating systems cannot keep up. At first, they wanted to run IRIX (an operating system) on Stanford's ccNUMA machine, but IRIX would not run on it. They considered it too hard to modify the OS or write a new one (an operating system takes a long time to go from birth to maturity, with countless bugs to fix and countless new features to add... and how would the PhD students ever graduate?). So they decided to use a virtual machine.

For their projects, virtual machines have roughly the following benefits:

    1. With only minor modifications to a commodity OS, memory can be shared among multiple VMs.
    2. Flexibility. Besides the IRIX used in the paper, other operating systems can run as well.
    3. Good scalability. The system can be scaled in units of virtual machines.
    4. Fault containment. Each VM is almost fully independent; if one breaks, the others are not affected.
    5. New and old software can coexist. For example, if a new program runs only on Linux 3.15, you can use two VMs, one running Linux 2.6 and the other Linux 3.15.

This article answers the following key questions about implementing virtualization:

    1. How does the VMM (Virtual Machine Monitor, or hypervisor) control the guest OS? Or, how does the Emperor (VMM) prevent the Minister (OS) from seizing power?
    2. With so many operating systems running together, how is memory managed?
    3. How are resources shared between multiple VMs? Or, how can 1 GB of memory be used as if it were 2 GB?

Note: Unless otherwise stated, the processor in this article is MIPS, not x86.

How does the VMM (Virtual Machine Monitor, or hypervisor) control the guest OS? Or, how does the Emperor (VMM) prevent the Minister (OS) from seizing power?


To understand how a VMM works, we first need to understand how a system works without one. Without a VMM, the "programs" in a computer system can be divided into user processes (such as Vim) and the operating system. They run in different modes. A mode is a mechanism, supported by the processor, that restricts the privileges of the environment in which an instruction executes. The MIPS architecture of the time had three modes: user mode, supervisor mode, and kernel mode. As an analogy, these three modes correspond to ordinary homes, the government office, and the throne room in human society. Obviously, whoever is in the throne room has the highest authority, and those in ordinary homes have the lowest.

Without a VMM, user processes live in the homes, the government office is empty, and the OS lives in the throne room. User processes can do things that need no privilege (unprivileged instructions), such as performing a calculation, without going through the OS. However, some things require privilege and must go through the OS, for example saving the document being edited to the hard disk (an I/O operation). Accessing the hard disk is a privileged operation because the disk is a shared resource and someone has to be in control. Imagine the mess if everyone could read and write the archives at will.


Vim (a user process) cannot write directly to a file on disk. It must ask the operating system in the throne room to do it. How does it ask? Through a system call. The process goes like this: to call write(fd, buf, len, off), the process first pushes fd, buf, len, and off onto the stack, then pushes the number that corresponds to write() (the system call number), and finally executes a special instruction that puts the CPU into kernel mode (step 1 in the figure). On x86, this instruction is int (interrupt). On MIPS, it is a trap instruction (syscall). The instruction directs the CPU to execute a piece of code (the trap handler), which uses the system call number on the stack to find the corresponding function (here, the implementation of write()) and then calls it, with the arguments already on the stack. Inside this function, the operating system calls the write() implementation of the file system that holds the file, and the file system in turn uses the disk driver to carry out the actual write. Once the file operation is complete, the operating system executes an instruction that does the opposite of the trap, returning to the instructions of the Vim program (step 2 in the figure).
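The system call path described above can be sketched in a few lines of Python. This is only a toy model: the class ToyKernel, the number SYS_WRITE, and the in-memory "disk" are assumptions made for illustration, and the direct method call stands in for the real trap instruction.

```python
SYS_WRITE = 1  # hypothetical system call number, chosen for this sketch


class ToyKernel:
    """Toy model of a kernel's trap handler and system call table."""

    def __init__(self):
        self.disk = {}  # fd -> bytearray standing in for file contents
        # The system call table maps a number to the kernel function.
        self.syscall_table = {SYS_WRITE: self.sys_write}

    def sys_write(self, fd, buf, length, off):
        n = min(length, len(buf))
        data = self.disk.setdefault(fd, bytearray())
        end = off + n
        if len(data) < end:
            data.extend(b"\x00" * (end - len(data)))  # grow the file
        data[off:end] = buf[:n]
        return n

    def trap_handler(self, stack):
        # The system call number was pushed last, so it is popped first.
        number = stack.pop()
        handler = self.syscall_table[number]
        # Arguments come off in reverse push order: off, len, buf, fd.
        off, length, buf, fd = (stack.pop() for _ in range(4))
        return handler(fd, buf, length, off)


def user_write(kernel, fd, buf, length, off):
    """What the user-side wrapper does before trapping into the kernel."""
    stack = [fd, buf, length, off]   # push the arguments
    stack.append(SYS_WRITE)          # then the system call number
    return kernel.trap_handler(stack)  # stands in for the trap instruction
```

For example, user_write(ToyKernel(), 3, b"hello", 5, 0) returns 5, the number of bytes written.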

About the trap instruction: calling this instruction a "trap" is vivid and accurate. When the trap instruction executes, it is like falling into a trap set by someone with more privilege (such as the Emperor), and you are at their mercy.

That is roughly how the operating system maintains its own privileges. But how is this privilege level actually implemented? We can imagine that the CPU has a privileged state (a privilege bit). When the bit is on, the CPU has a high privilege level and can do anything, including switching to the low-privilege state. When the bit is off, what you can do is limited. How do you go from off to on? You have to execute a special instruction, and the CPU then takes you to a specific place where operating system code checks whether you are allowed into the high-privilege state. It is like being at the airport: you can easily walk from the boarding gate back to the ticket hall, but to get from the ticket hall to the boarding gate you must go through security. And why does the operating system get to hold the high privilege? Because the operating system occupies that position from boot, so the applications that run afterwards have to obey it.
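A minimal sketch of the privilege bit, assuming a made-up ToyCPU class and a hypothetical os_trap_handler that allows only one request. It is not how any real CPU is programmed; it only shows that going from "off" to "on" is possible solely through code the OS installed.

```python
class PrivilegeError(Exception):
    """Raised when unprivileged code attempts a privileged operation."""


class ToyCPU:
    def __init__(self, trap_handler):
        self.privileged = True            # the OS holds privilege from boot
        self.trap_handler = trap_handler  # installed by the OS during boot

    def drop_privilege(self):
        # Going from on to off is always allowed.
        self.privileged = False

    def privileged_op(self, name):
        if not self.privileged:
            raise PrivilegeError(name)
        return f"did {name}"

    def trap(self, request):
        # The only way from off to on: jump to the OS-chosen handler.
        self.privileged = True
        try:
            return self.trap_handler(self, request)
        finally:
            self.privileged = False       # back to user mode afterwards


def os_trap_handler(cpu, request):
    """The 'security check': the OS decides whether to honor the request."""
    if request != "write_disk":           # hypothetical policy
        return "denied"
    return cpu.privileged_op(request)
```

After cpu.drop_privilege(), calling cpu.privileged_op("write_disk") directly raises PrivilegeError, while cpu.trap("write_disk") succeeds because the handler runs with the bit on.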

Now, finally, it is time to explain how a VMM (virtual machine monitor, or hypervisor) is implemented. Consider the following scenario:

The VMM now sits in the throne room with the highest privilege. The operating system has been moved into the government office (although it does not know it). The user process still lives in an ordinary home. What happens if the user process Vim calls write() now? The user process traps into the VMM, which owns kernel mode. However, the VMM does not know how to handle write(), so it invokes the corresponding trap handler in the operating system, which runs the file system's write() handler. During execution, the file system's unprivileged instructions run directly on the real CPU. If the file system executes a privileged instruction, the CPU traps from the operating system (supervisor mode) into the VMM (kernel mode), and the VMM emulates the privileged instruction on behalf of the operating system.

There are several reasons why the VMM must do the emulation. First, the VMM does not trust the operating system. Several operating systems may be running concurrently on the VMM, and if one of them were allowed to access memory directly, it could modify the code or data of another operating system. Second, the operating system only understands its virtual environment and knows nothing about the real one. For example, the operating system may want to write to sector 10 of /dev/sda3, but /dev/sda3 may actually correspond to a file called /fake_dev/sda3.data on the VMM side. The VMM is needed to translate the operating system's instruction into an operation on the real environment.
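The trap-and-emulate idea can be sketched as follows. Everything here is a simplified assumption: the class names ToyVMM and GuestOS, the instruction names "add" and "write_sector", and the dictionary that stands in for a backing file like /fake_dev/sda3.data. Unprivileged instructions run "directly", while privileged ones are handed to the VMM, which maps each guest's virtual disk to its own backing store.

```python
PRIVILEGED = {"write_sector"}  # hypothetical privileged instruction set


class ToyVMM:
    def __init__(self):
        # (vm_id, guest device name) -> {sector: data}; each entry stands
        # in for a per-VM backing file on the real machine.
        self.backing = {}

    def attach_disk(self, vm_id, device):
        self.backing[(vm_id, device)] = {}

    def emulate(self, vm_id, instr, *operands):
        """Entered when a guest traps on a privileged instruction."""
        if instr == "write_sector":
            device, sector, data = operands
            disk = self.backing.get((vm_id, device))
            if disk is None:
                raise PermissionError(f"VM {vm_id} has no device {device}")
            disk[sector] = data  # the real operation, chosen by the VMM
            return "ok"
        raise NotImplementedError(instr)


class GuestOS:
    def __init__(self, vm_id, vmm):
        self.vm_id = vm_id
        self.vmm = vmm

    def execute(self, instr, *operands):
        if instr in PRIVILEGED:
            # On real hardware this is a trap from supervisor to kernel mode.
            return self.vmm.emulate(self.vm_id, instr, *operands)
        if instr == "add":
            # Unprivileged work runs directly, without VMM involvement.
            return operands[0] + operands[1]
        raise NotImplementedError(instr)
```

Two guests that both write sector 10 of their own /dev/sda3 end up in separate backing stores, so neither can overwrite the other's data.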

After the privileged instruction has been emulated, the CPU returns to the operating system. When the write() system call finishes, the operating system executes a privileged return-from-exception instruction (eret on MIPS) in an attempt to return to user mode. At this point the CPU traps into the VMM, and the final mode switch is performed by the VMM.

How does the VMM know where the operating system's trap handler is? The VMM is on the system first. When the operating system boots on top of the VMM, it tries to use a privileged instruction to install its trap handler. Because the most-privileged VMM is monitoring all of this, it can record the location of the trap handler.
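The bookkeeping in the last few paragraphs might look like the sketch below. The names TrapForwardingVMM, Guest, "set_trap_handler", and "return_to_user" are invented for this illustration; a real VMM would decode actual machine instructions rather than strings.

```python
class TrapForwardingVMM:
    def __init__(self):
        self.guest_trap_handler = {}  # vm_id -> handler recorded at boot
        self.mode = {}                # vm_id -> "user" or "supervisor"

    def on_privileged(self, vm_id, instr, operand=None):
        """Entered whenever a guest OS executes a privileged instruction."""
        if instr == "set_trap_handler":
            # The guest thinks it installed its handler; the VMM records it.
            self.guest_trap_handler[vm_id] = operand
        elif instr == "return_to_user":
            # The VMM performs the final mode switch for the guest.
            self.mode[vm_id] = "user"
        else:
            raise NotImplementedError(instr)

    def on_user_trap(self, vm_id, syscall_no, *args):
        """A user process trapped; forward it to the guest's own handler."""
        self.mode[vm_id] = "supervisor"
        return self.guest_trap_handler[vm_id](syscall_no, *args)


class Guest:
    def __init__(self, vm_id, vmm):
        self.vm_id, self.vmm = vm_id, vmm
        self.log = []

    def boot(self):
        self.vmm.on_privileged(self.vm_id, "set_trap_handler",
                               self.trap_handler)

    def trap_handler(self, syscall_no, *args):
        self.log.append((syscall_no, args))   # pretend to do the work
        # Returning to user mode is privileged, so it traps to the VMM.
        self.vmm.on_privileged(self.vm_id, "return_to_user")
        return len(args)
```

The design point is that the guest never learns it was deprivileged: every operation it believes it performed itself was either recorded or carried out by the VMM.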

To summarize, the VMM holds the highest privilege on the CPU, so operating systems running at lower privilege levels cannot perform harmful operations, and the VMM carries out the operating system's requests in the real environment.

With so many operating systems running together, how is memory managed?

Without a VMM, there are two kinds of memory addresses in the system: virtual addresses and physical addresses. There are two ways to convert a virtual address to a physical address. Method one: look it up in the TLB (translation lookaside buffer, implemented in hardware). Method two: look it up in the page table, then put the result into the TLB. The system tries method one first, and if the entry is not found (a TLB miss), it uses method two.
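As a rough sketch of the two lookup methods, assuming a 4096-byte page size and a made-up ToyMMU class with dictionaries standing in for the TLB and the page table:

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class ToyMMU:
    def __init__(self, page_table):
        self.page_table = page_table  # virtual page no. -> physical page no.
        self.tlb = {}                 # small cache of recent translations
        self.tlb_misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.tlb:
            # Method two: TLB miss, so walk the page table and refill
            # the TLB. A missing entry (KeyError) models a page fault.
            self.tlb_misses += 1
            self.tlb[vpn] = self.page_table[vpn]
        # Method one: the translation is served from the TLB.
        return self.tlb[vpn] * PAGE_SIZE + offset
```

The first access to a page counts as a miss; later accesses to the same page are served from the TLB without touching the page table.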



With a VMM, there are three kinds of memory addresses in the system: virtual addresses, physical addresses, and machine addresses. The machine address is the real address on the memory modules. The physical address is merely what the operating system believes to be the physical address.

When a TLB miss occurs and the operating system tries to use privileged instructions to translate a virtual address to a physical address, the VMM gets involved (it monitors all operations on privileged registers). The VMM first lets the operating system's own code translate the virtual address to a physical address, because the VMM does not know that mapping. The operating system then believes it has completed the translation and tries to update the TLB, which is a privileged operation. At this point the VMM intervenes: using a mapping table called the pmap, it finds the machine address that corresponds to the physical address, substitutes the machine address for the physical address, and updates the TLB with a virtual-to-machine mapping. All later accesses to this virtual address are translated to accesses to the corresponding machine address. (Note that MIPS uses a software-reloaded TLB, whereas x86 uses a hardware-reloaded TLB.)
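A minimal sketch of this interception, assuming a software-reloaded TLB as on MIPS. The class ShadowingVMM and its method names are hypothetical, and the guest's miss handler is reduced to a dictionary lookup in its page table.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class ShadowingVMM:
    def __init__(self, pmap):
        self.pmap = pmap  # guest "physical" page no. -> machine page no.
        self.tlb = {}     # the real TLB: virtual page no. -> machine page no.

    def on_tlb_write(self, vpn, ppn):
        """The guest tried to install vpn -> ppn (a privileged operation)."""
        # Substitute the machine page before touching the real TLB.
        self.tlb[vpn] = self.pmap[ppn]

    def access(self, vaddr, guest_page_table):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.tlb:
            # TLB miss: the guest's own miss handler finds the physical
            # page, then attempts the TLB write, which the VMM intercepts.
            ppn = guest_page_table[vpn]
            self.on_tlb_write(vpn, ppn)
        return self.tlb[vpn] * PAGE_SIZE + offset
```

The guest page table only ever contains physical page numbers, and the real TLB only ever contains machine page numbers; the pmap is the single place where the two meet.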

How are resources shared between multiple VMs? Or, how can 1 GB of memory be used as if it were 2 GB?

Every virtual machine consumes a lot of memory. How can we run more virtual machines on a single machine with limited memory? Fortunately, in-memory data in different virtual machines may be exactly the same (for example, system files cached in memory). We can save a lot of space by keeping only one copy of such data in memory. Disco uses virtual I/O devices and virtual network devices to save memory.


Virtual I/O devices: when two virtual machines read the same file from the same disk, the VMM intercepts the DMA and discovers that the two VMs are using the same data. Only one copy of this data needs to be kept in machine memory; the VMM then modifies the pmap so that the physical addresses in both VMs point to the same machine address. When either VM updates the data, the VMM gives it a new copy and leaves the original unchanged (a copy-on-write mechanism).
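The sharing and copy-on-write behavior can be sketched as below. This is an assumption-laden toy, not Disco's actual data structures: SharingVMM, its block cache keyed by (disk, block), the read_block callback, and whole-page writes are all simplifications chosen for this example.

```python
class SharingVMM:
    def __init__(self):
        self.machine = {}      # machine page no. -> page contents (bytes)
        self.refcount = {}     # machine page no. -> number of references
        self.pmap = {}         # (vm_id, physical page no.) -> machine page no.
        self.block_cache = {}  # (disk, block) -> machine page no.
        self.next_mpn = 0

    def _alloc(self, data, refs=0):
        mpn = self.next_mpn
        self.next_mpn += 1
        self.machine[mpn] = data
        self.refcount[mpn] = refs
        return mpn

    def _map(self, vm_id, ppn, mpn):
        old = self.pmap.get((vm_id, ppn))
        if old is not None:
            self.refcount[old] -= 1
        self.pmap[(vm_id, ppn)] = mpn
        self.refcount[mpn] += 1

    def dma_read(self, vm_id, ppn, disk, block, read_block):
        """Intercepted DMA: read a disk block into a guest physical page."""
        key = (disk, block)
        if key not in self.block_cache:
            # First reader: really read the disk. The cache itself holds
            # one reference, so cached pages are never written in place.
            self.block_cache[key] = self._alloc(read_block(disk, block), 1)
        self._map(vm_id, ppn, self.block_cache[key])  # share, do not copy

    def read(self, vm_id, ppn):
        return self.machine[self.pmap[(vm_id, ppn)]]

    def write(self, vm_id, ppn, data):
        """Whole-page write; a shared page triggers copy-on-write."""
        mpn = self.pmap[(vm_id, ppn)]
        if self.refcount[mpn] > 1:
            self._map(vm_id, ppn, self._alloc(data))  # private new copy
        else:
            self.machine[mpn] = data
```

When two VMs read the same block, read_block is called only once and both pmap entries point at the same machine page; a write by one VM leaves the other VM's view unchanged.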

Virtual network devices: when you use NFS to copy a file from VM1 to VM2, the file is not actually copied. The virtual network device updates the pmap of VM2 to point to the file data already in memory, so the operating system on VM2 believes it already has the file.

Later, VMware also introduced a technique that uses hashing to find identical memory pages and then share them.
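The general idea of hash-based page sharing can be sketched as follows. This is not VMware's implementation; the function name share_identical_pages, the use of SHA-1, and the dictionary layout are assumptions for illustration. Pages are grouped by a hash of their contents, a full comparison guards against hash collisions, and duplicate pages are remapped to a single copy.

```python
import hashlib


def share_identical_pages(pages, pmap):
    """Merge machine pages with identical contents.

    pages: machine page no. -> bytes
    pmap:  (vm_id, physical page no.) -> machine page no.
    Returns the number of machine pages freed.
    """
    by_hash = {}  # content digest -> machine page kept for that content
    remap = {}    # duplicate machine page -> page that replaces it
    for mpn in sorted(pages):
        digest = hashlib.sha1(pages[mpn]).digest()
        keep = by_hash.get(digest)
        # Compare full contents so a hash collision never merges pages.
        if keep is not None and pages[keep] == pages[mpn]:
            remap[mpn] = keep
        else:
            by_hash.setdefault(digest, mpn)
    for key in list(pmap):
        if pmap[key] in remap:
            pmap[key] = remap[pmap[key]]
    for mpn in remap:
        del pages[mpn]  # the duplicate copy can be reclaimed
    return len(remap)
```

Pages merged this way must afterwards be treated as copy-on-write, for the same reason as the shared disk pages described earlier.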
