Pure-Software Full Virtualization of the CPU

Source: Internet
Author: User


The previous article outlined the three broad categories of virtualization technology: full virtualization, paravirtualization, and hardware-assisted virtualization. The main things being virtualized are the hardware resources: CPU, memory, and I/O. So how does the CPU work under full virtualization? Under paravirtualization? Under hardware-assisted virtualization? Subdividing further, each resource has all three variants: full, para-, and hardware-assisted virtualization of the CPU, of memory, and of I/O devices. This series covers the three CPU variants in turn, starting in this article with full virtualization.



An x86 CPU without hardware-assisted virtualization has four privilege levels (ring 0 through ring 3). The operating system runs at the most privileged level, ring 0, and applications run at the least privileged level, ring 3.


Full CPU virtualization is hard to achieve on this architecture. Why?

1. Without virtualization, the OS runs in ring 0 and has full privileges over all hardware.

2. Under virtualization the guest OS runs in ring 1, where it lacks permission to execute some privileged instructions, so something must ensure that those instructions still get executed.

3. While one guest's privileged instructions are being carried out, the other running virtual machines must remain secure.

1. Emulation (Trap-and-Emulate)

The first approach to full CPU virtualization is trap-and-emulate. The VMM automatically traps the privileged instructions that the guest OS tries to execute, runs them on the guest's behalf, and returns the results to the guest OS. In other words, whenever the guest OS issues a privileged instruction, the VMM intercepts it, emulates its execution in software, and hands the result back to the guest.


Two kinds of special instructions matter for virtualization: privileged instructions and sensitive instructions. What is each?

Privileged instructions: Instructions that operate on and manage critical system resources. They execute correctly only at the highest privilege level. If one is run at a lower privilege level, it normally raises an exception, the processor switches to the highest privilege level, and system software handles it. Instructions can behave differently at different privilege levels, and not every privileged instruction raises an exception; some are silently ignored (on x86, for example, POPF executed without sufficient privilege leaves the interrupt flag unchanged instead of faulting). The hardware distinguishes privileged instructions from ordinary ones, much as an operating system separates system calls from ordinary user code.

Sensitive instructions: Instructions that manipulate privileged resources. These include instructions that change the operating mode of the virtual machine or the state of the underlying physical machine, instructions that read or write clocks, interrupt state, and similar registers, instructions that access the memory protection or address relocation systems, and all I/O instructions. A privileged instruction that does not raise an exception is still a sensitive instruction. In other words, the set of sensitive instructions is the larger one, and some sensitive instructions never trap; this is a consequence of how the x86 architecture was designed.

Virtualization requires lowering the privilege of the guest OS kernel from ring 0 to ring 1 or ring 3. When the guest OS executes a privileged instruction, a trap is raised, the VMM catches it, and the VMM completes the instruction on the guest's behalf. This is the essence of the approach: deprivileging plus trap-and-emulate. In a virtualized environment every sensitive instruction must be caught and completed by the VMM. On typical RISC processors such as MIPS, PowerPC, and SPARC, every sensitive instruction is privileged. x86 is the exception: most of its sensitive instructions are privileged, but some are not, and those do not trap to the VMM when executed.
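To make the gap concrete, here is a minimal sketch of deprivileging plus trap-and-emulate in Python. It is a toy model, not how any real VMM is written: `ToyCpu`, `ToyVmm`, `PrivilegeFault`, and the register values are hypothetical names invented for illustration, and the two instruction sets list only a handful of x86 instructions.

```python
from dataclasses import dataclass, field

# Illustrative, non-exhaustive classification of a few x86 instructions.
PRIVILEGED = {"LGDT", "LIDT", "HLT"}               # fault outside ring 0
SENSITIVE = PRIVILEGED | {"SGDT", "SIDT", "SLDT"}  # read or change privileged state


class PrivilegeFault(Exception):
    """Stands in for the fault the hardware raises on a privileged instruction."""


@dataclass
class ToyCpu:
    ring: int = 0
    gdtr: int = 0x1000  # physical GDTR value, owned by the VMM (made-up number)

    def execute(self, op, operand=None):
        # Only privileged instructions fault when run outside ring 0.
        if op in PRIVILEGED and self.ring != 0:
            raise PrivilegeFault(op)
        if op == "LGDT":
            self.gdtr = operand
        elif op == "SGDT":
            return self.gdtr
        return None


@dataclass
class ToyVmm:
    cpu: ToyCpu
    virtual_gdtr: int = 0                       # the guest's own view of GDTR
    trapped: list = field(default_factory=list)

    def run_guest_instruction(self, op, operand=None):
        self.cpu.ring = 1                       # guest kernel is deprivileged
        try:
            return self.cpu.execute(op, operand)
        except PrivilegeFault:
            self.cpu.ring = 0                   # trap: the VMM runs in ring 0
            self.trapped.append(op)
            return self.emulate(op, operand)

    def emulate(self, op, operand):
        if op == "LGDT":
            self.virtual_gdtr = operand         # update virtual state only
        return None


vmm = ToyVmm(ToyCpu())
vmm.run_guest_instruction("LGDT", 0x5000)       # privileged: traps and is emulated
leaked = vmm.run_guest_instruction("SGDT")      # sensitive but unprivileged: no trap
```

In this model `LGDT` traps and only the guest's virtual register changes, but `SGDT` runs straight on the toy CPU, so `leaked` holds the VMM's physical value `0x1000` rather than the `0x5000` the guest loaded. The set difference `SENSITIVE - PRIVILEGED` is exactly the group of instructions that trap-and-emulate never sees.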

2. Binary Translation

Trap-and-emulate can virtualize most of the x86 CPU, but because not every sensitive x86 instruction is privileged, it cannot handle the sensitive instructions that are not privileged, such as SGDT, SLDT, and SIDT.

Because of this inherent flaw, trap-and-emulate alone cannot fully virtualize the x86 CPU. This made x86 harder to virtualize than other CPU architectures; IBM's POWER architecture, for example, adopted virtualization early and made it practical.

This changed in 1999, when VMware achieved full virtualization of the x86 CPU architecture using binary translation.

The approach combines two techniques: ring compression and binary translation. Ring compression runs the VMM and the guest at different privilege levels. On x86, the VMM runs at the highest privilege level, ring 0, the guest OS runs in ring 1, and user applications run in ring 3. The guest OS's kernel instructions therefore cannot execute directly on the hardware; the VMM must trap and emulate them, and the hard-to-virtualize instructions are handled by binary translation.
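The idea of translating the hard-to-virtualize instructions can be sketched as a rewriting pass over guest kernel code. This is a simplified illustration under stated assumptions, not VMware's actual translator: the tuple-based instruction format, the `VMM_CALL` pseudo-instruction, and the function names are all hypothetical.

```python
# Sensitive x86 instructions that do not trap outside ring 0 (the ones the
# article names), mapped to the virtual register each one reads.
NON_TRAPPING_SENSITIVE = {"SGDT": "gdtr", "SIDT": "idtr", "SLDT": "ldtr"}


def translate_block(block):
    """Copy ordinary instructions unchanged; replace each non-trapping
    sensitive instruction with an explicit call into the VMM."""
    translated = []
    for op, operand in block:
        if op in NON_TRAPPING_SENSITIVE:
            translated.append(("VMM_CALL", op))
        else:
            translated.append((op, operand))
    return translated


def run_translated(block, virtual_cpu):
    """Run a translated block; returns (accumulator, values the guest read)."""
    acc, reads = 0, []
    for op, operand in block:
        if op == "VMM_CALL":
            # Emulated against the guest's virtual state, never the real CPU.
            reads.append(virtual_cpu[NON_TRAPPING_SENSITIVE[operand]])
        elif op == "MOV":
            acc = operand
        elif op == "ADD":
            acc += operand
        else:
            raise ValueError(f"unknown toy instruction: {op}")
    return acc, reads


guest_block = [("MOV", 1), ("SGDT", None), ("ADD", 2)]
translated = translate_block(guest_block)
acc, reads = run_translated(
    translated, {"gdtr": 0x5000, "idtr": 0x6000, "ldtr": 0}
)
```

Because the `SGDT` in `guest_block` is rewritten before it ever reaches the processor, the guest reads its own virtual value (`0x5000` in this made-up state) even though the original instruction would not have trapped.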


Privilege levels need no further explanation here, but binary translation may be less familiar. Binary translation (BT) directly translates executable binary code, so that a binary program built for one processor can run on another. It maps machine code from the source platform to the target platform, including both instruction semantics and hardware resources, so that code written for the source platform is adapted to the target platform. The translated code therefore suits the target machine better and runs more efficiently on it. A binary translation system is a software layer between the application and the computer hardware. It reduces the coupling between applications and the underlying hardware, so the two can evolve relatively independently.

Binary translation is also a form of compilation; it differs from traditional compilation in what it takes as input. A traditional compiler takes a high-level language and generates target code for some machine. A binary translator takes one machine's binary code, itself produced by traditional compilation, and generates binary code for another machine.

Depending on the implementation, binary translation falls into three main categories: interpretation, static translation, and dynamic translation.

• Interpretation

Interpretation executes each source machine instruction in real time as it is encountered. It does not save or cache interpreted instructions, needs no user intervention, and performs no optimization. An interpreter is relatively easy to develop and makes it easier to stay highly compatible with older architectures, but it is inefficient.

• Static binary translation

In static binary translation (SBT), code is translated offline before it runs: a new program is generated for the target machine's instruction set and is then executed directly. Because the offline translation adds no overhead while the program runs, the translator can apply a full range of optimizations and produce high-quality code, which greatly improves run-time efficiency.

• Dynamic binary translation

Dynamic binary translation (DBT) translates code fragments as they are executed at run time. This overcomes difficulties that static translation cannot solve, such as collecting dynamic run-time information, code discovery, self-modifying code, and precise interrupts. A dynamic translator is also completely transparent to the user and needs no user intervention. Despite these advantages, the constraints of translating during execution mean a dynamic translator cannot optimize as thoroughly as a static one, so its translated code is less efficient than a static translator's.

Comparison of three kinds of binary translation techniques


Interpretation is the easiest of the three techniques to implement, but executing instruction by instruction greatly reduces the efficiency of the translation system. Static translation offers efficient run-time performance, but it still depends on an interpreter because it cannot cover all of the code statically. Compared with these two, dynamic translation solves many problems, such as code coverage, self-modifying code, and precise interrupts, while still providing acceptable execution efficiency. VMware therefore built its x86 CPU virtualization on dynamic binary translation.


In a typical dynamic binary translation system, the code being translated is called the source machine code, and the code that runs on the host is called the target machine code. A typical dynamic binary translator consists of two modules: the translation engine and the execution engine. The translation engine translates source machine code into target machine code. The execution engine prepares the execution context for the target machine code, then looks up the target code that corresponds to the source code in the target code cache and executes it.

The basic operating procedure is as follows:

• Lookup stage

This stage checks whether the target code block already exists in the target code cache. If it does, the block's entry address is returned; if not, the system enters the translation stage.

• Context switch stage

Once a target code block has been found in the cache or produced by the translation module, the binary translation system transfers control. It hands control to the execution module to run the target code block, and when the block finishes, control returns to the execution engine. Each control transfer requires saving the program's context.

• Translation stage

This stage translates the source machine's binary code into the target machine's binary code. It has three sub-stages: decoding, intermediate code optimization, and encoding.

• Execution and linking stage

After a basic block has been translated into a target code block, the system follows the source code's control flow to establish direct and indirect links between target code blocks, and then runs the target code blocks in order.
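The stages above can be sketched as a small dispatch loop. This is a toy under stated assumptions: the source "machine" is an invented four-instruction ISA, the target is Python source compiled with `exec`, and every name (`SOURCE`, `translate`, `run`, the opcode names) is hypothetical. The optimization sub-stage and block linking are deliberately left out, so control returns to the dispatcher after every block.

```python
# Toy source program: address -> basic block. Each block is a list of
# (op, arg) pairs that ends in a control-transfer instruction.
# It computes acc = 3 + 2 + 1.
SOURCE = {
    0: [("LOADN", 3), ("JMP", 1)],
    1: [("ADDN", None), ("DECN", None), ("JNZ", (1, 2))],
    2: [("HALT", None)],
}


def translate(block):
    """Translation stage: decode each source instruction and emit host code."""
    lines = ["def target(s):"]
    for op, arg in block:
        if op == "LOADN":
            lines.append(f"    s['n'] = {arg}")
        elif op == "ADDN":
            lines.append("    s['acc'] += s['n']")
        elif op == "DECN":
            lines.append("    s['n'] -= 1")
        elif op == "JMP":
            lines.append(f"    return {arg}")
        elif op == "JNZ":
            taken, fallthrough = arg
            lines.append(f"    return {taken} if s['n'] != 0 else {fallthrough}")
        elif op == "HALT":
            lines.append("    return None")
        else:
            raise ValueError(f"unknown toy instruction: {op}")
    namespace = {}
    exec("\n".join(lines), namespace)   # "encode" the block as a host function
    return namespace["target"]


def run(source, entry=0):
    code_cache = {}                     # source address -> translated block
    stats = {"translations": 0, "executions": 0}
    state = {"acc": 0, "n": 0}          # guest context handed to each block
    pc = entry
    while pc is not None:
        target = code_cache.get(pc)     # lookup stage
        if target is None:              # cache miss: translation stage
            target = translate(source[pc])
            code_cache[pc] = target
            stats["translations"] += 1
        pc = target(state)              # switch to the block, then back here
        stats["executions"] += 1
    return state, stats


state, stats = run(SOURCE)
```

The loop body at address 1 executes three times but is translated only once, which is the saving the code cache provides over pure interpretation. A real translator would additionally chain translated blocks to one another so that most transfers bypass the dispatcher.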

3. Summary

Without hardware-assisted virtualization, the x86 CPU is virtualized in software with emulation and binary translation. Emulation has an inherent flaw and cannot fully virtualize the x86 CPU; binary translation takes a completely different approach and does achieve it. The real difficulty in x86 CPU virtualization lies in virtualizing privileged and sensitive instructions. Once that problem is solved, another one is waiting: CPU scheduling on the x86 architecture.

In a virtualized environment, what are the CPU scheduling problems on x86?

1. How do virtual CPUs map to physical CPUs?

2. How are resources allocated between virtual CPUs and physical CPUs?

3. How are priorities set among virtual CPUs?

4. How are scheduling and load balancing handled between the multi-core virtual CPU architectures (vSMP and vNUMA) and the physical multi-core architectures (SMP and NUMA)?

We will leave these questions for the fourth day. Tomorrow we will cover CPU paravirtualization on the x86 architecture, the day after that hardware-assisted CPU virtualization on the x86 architecture, and on the fourth day CPU scheduling.

To find out what happens next, stay tuned for the next installment!


This article is from the "I take fleeting chaos" blog. Please keep this source link: http://tasnrh.blog.51cto.com/4141731/1736758
