A Discussion of Virtual Machine Protection Technology

Source: Internet
Author: User

Reposted from the Kanxue forum

This article summarizes the material in Encryption and Decryption, reorganizes it, and adds my own understanding, in the hope that it helps beginners.


Encryption and Decryption, third edition, page 471
Virtual Machine Protection Technology
Virtual Machine Overview
Virtual machine protection technology translates code into a byte stream of pseudo-code that neither the machine nor a human can recognize directly. At run time, this pseudo-code is interpreted one instruction at a time, gradually restoring and executing the behavior of the original code.
The subroutine that translates and executes the pseudo-code is called the virtual machine, or VM (much like an abstract CPU). It takes the form of a function whose parameter is the memory address of the bytecode.
Three commercial products that apply virtual machines in this way are VMProtect, Themida, and ExeCryptor.
Virtual Machine Architecture
The instructions in real code are varied and can be combined in countless ways, so a virtual machine cannot provide a dedicated translation for every particular case. Instead, all possible instructions must be abstracted and decomposed into simple, small instructions, each of which is processed by a specialized subroutine (a handler).
Readers who have studied compiler principles will know three-address code. No matter how complex an assignment expression is, it can be decomposed into a sequence of three-address instructions.
(What is three-address code? Each three-address instruction performs exactly one operation, such as one binary operation, one comparison, or one branch.)
Similarly, no matter how complex an instruction is, it can be decomposed into a sequence of indivisible atomic instructions.
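To make the decomposition concrete, here is a minimal Python sketch that flattens a nested arithmetic expression into three-address instructions. The function name `to_three_address` and the temporary names `t1`, `t2`, ... are hypothetical choices for this illustration and do not come from the book.

```python
import ast

def to_three_address(expr):
    """Flatten an arithmetic expression into a list of three-address instructions."""
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}
    code = []

    def visit(node):
        # Leaves are returned as-is; they need no instruction of their own.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp):
            left = visit(node.left)
            right = visit(node.right)
            temp = f"t{len(code) + 1}"  # hypothetical temporary name
            # Each emitted line performs exactly one binary operation.
            code.append(f"{temp} = {left} {ops[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    visit(ast.parse(expr, mode="eval").body)
    return code

for line in to_three_address("a + b * (c - d)"):
    print(line)
```

Each emitted line has at most one operator, which is the same property the virtual machine relies on when it breaks a complex instruction into atomic ones.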


The architecture of a virtual machine (CPU) falls into three categories: stack-based, register-based, and three-address machines. We discuss only the stack-based architecture here. It operates on the stack constantly: the virtual registers (virtual eax, ebx, and so on) are stored on the stack, and the handler of every atomic instruction has to push and pop.
Modern CPUs have a fair number of registers, and the stack is generally used only when passing function arguments (as on the x86 series CPUs in PCs). Some CPUs, however, operate only on memory, with neither a stack nor registers. A machine using such a CPU is called a three-address machine.
A stack-based CPU or virtual machine has no concept of temporary variables or registers; everything is placed on the stack. Because its instructions do not need to specify operands, they are shorter than those of a register-based machine. The design is therefore relatively simple and is widely used in embedded systems. We choose it for code protection as well.
Take the add instruction as an example. A stack-based CPU first pops two numbers from the stack, adds them, and then pushes the sum back onto the stack, so the add instruction occupies only 1 byte. On a register-based CPU the instruction is add REG1,REG2, which requires 3 bytes. Consider what the instructions of a CPU without registers look like, and how concise they are. The price of such simple instructions, of course, is lower efficiency.
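The difference between the two styles of add can be sketched in a few lines of Python. This is only an illustration under assumed names (`stack_add`, `register_add`, and a dictionary standing in for the register file); it models the semantics, not the byte encodings.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, as a 32-bit CPU would

def stack_add(stack):
    """Stack-machine add: pop two operands and push their sum. No operands are named."""
    b = stack.pop()
    a = stack.pop()
    stack.append((a + b) & MASK32)

def register_add(regs, dst, src):
    """Register-machine add: the instruction has to name both registers."""
    regs[dst] = (regs[dst] + regs[src]) & MASK32

stack = [2, 3]
stack_add(stack)            # the stack now holds only the sum

regs = {"reg1": 2, "reg2": 3}
register_add(regs, "reg1", "reg2")
```

The stack version needs no arguments beyond the stack itself, which is why its encoding can be a single opcode byte.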
The virtual machine protection technique discussed here converts code for a register-based CPU into pseudo-code for a stack-based CPU. That pseudo-code is then interpreted by the stack-based virtual machine (CPU).
Instruction System
The key task is to design the instruction set of the stack-based virtual machine (CPU). The more concise the instruction set, the better its reusability.
Take the add instruction again. On x86 CPUs, add comes in many forms, such as add reg,imm; add reg,reg; add reg,mem; add mem,reg; and so on. The stack-based virtual machine CPU has none of these variations: there is a single add instruction, and both its operands and its result live on the stack.
For our virtual machine CPU, we need to implement an add handler like this:
Vadd:                           ; virtual add
    mov eax, [esp+4]            ; fetch the source operand
    mov ebx, [esp]              ; fetch the destination operand
    add ebx, eax                ; destination = destination + source
    add esp, 8                  ; remove both operands to balance the stack
    push ebx                    ; push the result onto the stack
    jmp Vmdispatcher            ; return to the dispatch point, as the other handlers do
Because the virtual add takes no explicit operands, the operands of the original add instruction must be translated into push instructions. Depending on what is being pushed, different handlers are required:
VPUSHREG32:                     ; push a register onto the stack; ESI points to the bytecode in memory
    mov eax, dword ptr [esi]    ; read the register's offset within the VMCONTEXT structure from the bytecode
    add esi, 4                  ; skip the operand; VMCONTEXT holds the value of every register and itself lives on the stack
    mov eax, dword ptr [edi+eax] ; fetch the register's value; EDI points to the base of the VMCONTEXT structure
    push eax                    ; push it onto the stack
    jmp Vmdispatcher            ; done; jump back to the dispatch point
VPushImm32:                     ; push an immediate value onto the stack
    mov eax, dword ptr [esi]    ; the immediate is read straight from the bytecode; no lookup is needed
    add esi, 4
    push eax                    ; push the immediate onto the stack
    jmp Vmdispatcher
Where there is a push instruction, there must also be a pop instruction:
VPOPREG32:
    mov eax, dword ptr [esi]    ; read the register's offset within the VMCONTEXT structure from the bytecode
    add esi, 4
    pop dword ptr [edi+eax]     ; pop the top of the stack into the virtual register
    jmp Vmdispatcher
The instruction set of the stack-based virtual machine is simple: single-byte operation instructions (such as add and dec) plus a handful of push, pop, and other stack instructions. There are no complicated register or memory operand forms. x86 CPU instructions must be translated into instructions for the virtual machine CPU. For example:
    add esi, eax
This is converted into the following virtual machine instructions:
    VPUSHREG32 Eax_index
    VPUSHREG32 Esi_index
    Vadd
    VPOPREG32 Esi_index         ; pop the result into esi. eax is not popped back: after Vadd only the result remains on the stack
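The translation step can be sketched as a small Python function. This is a simplified illustration that handles only a register destination with a register or immediate source; the function name `translate_add` and the lowercase `<reg>_index` operand spelling are assumptions made for the example, not the protector's actual translator.

```python
REGISTERS = ("eax", "ebx", "ecx", "edx", "esi", "edi", "ebp")

def translate_add(dest, src):
    """Translate `add dest, src` into a list of stack-VM instructions."""
    def push(operand):
        if operand in REGISTERS:
            return f"VPUSHREG32 {operand}_index"
        # Anything else is treated as an immediate (decimal or 0x-prefixed hex).
        return f"VPushImm32 {int(operand, 0)}"

    if dest not in REGISTERS:
        raise ValueError("this sketch only handles a register destination")
    # Source is pushed first, destination second, matching what Vadd expects
    # at [esp+4] and [esp]; the result is then popped into the destination.
    return [push(src), push(dest), "Vadd", f"VPOPREG32 {dest}_index"]

for line in translate_add("esi", "eax"):
    print(line)
```

Note that every form of the x86 add collapses into the same `Vadd`; only the pushes and the final pop vary with the operand types.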
Instructions involving jumps (for example jmp, call, and ret):
ESI points to the address of the current pseudo-code, so it acts as the IP register of our virtual machine CPU, pointing to the current instruction.
The flow of execution can therefore be changed by modifying ESI. For example:
VJMP:
    mov esi, dword ptr [esp]    ; the jump target is on the stack
    add esp, 4                  ; balance the stack
    jmp Vmdispatcher
The call instruction is more troublesome, because the function being called is not necessarily virtual machine pseudo-code. The call handler therefore has to leave the virtual machine and hand the call over to the real CPU. The code looks roughly like this:
Vcall:
    push all vreg               ; push all virtual registers (which are maintained on the stack)
    pop all reg                 ; pop them into the real registers (carrying over the virtual machine's results)
    push return address         ; lets execution return to the virtual machine after the call; see page 480 of the original book for details
    push function address       ; the address of the function to call
    retn
Here is VRETN:
VRETN:
    xor eax, eax
    mov ax, word ptr [esi]      ; the operand of retn is a word
    add esi, 2
    mov ebx, dword ptr [ebp]    ; fetch the return address
    add ebp, 4                  ; release its stack slot
    add ebp, eax                ; if retn has an operand, release that many bytes as well
    push ebx                    ; push the return address
    push ebp                    ; push the stack pointer
    push dword ptr [edi+0x1c]   ; push the virtual registers one by one
    push dword ptr [edi+0x18]
    push dword ptr [edi+0x14]
    push dword ptr [edi+0x10]
    push dword ptr [edi+0x0c]
    push dword ptr [edi+0x08]
    push dword ptr [edi+0x04]
    push dword ptr [edi]
    pop eax                     ; pop them into the real registers
    pop ebx
    pop ecx
    pop edx
    pop esi
    pop edi
    pop ebp
    popfd
    pop esp                     ; restore the stack pointer
    retn



Beyond this, attention must also be paid to the handling of the flags register, instructions that cannot be emulated, instruction optimization, and exception handling. These topics are not covered here.



VSTARTVM is the entry point of the virtual machine. It saves the runtime environment (the values of all registers) and initializes the stack (all variables used by the virtual machine live on the stack).
The bytecode is the pseudo-code. Vmdispatcher reads the pseudo-code one instruction at a time and dispatches each to the corresponding subroutine (handler).
The protector (packer) first translates the known x86 instructions into bytecode, places the bytecode in the PE file, and then deletes the original code, replacing it with code similar to the following, which enters the virtual machine execution loop:
    push bytecode               ; address of the pseudo-code, passed as the parameter
    jmp VSTARTVM
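As a rough illustration of what the protector writes in place of the original code, the following Python sketch assembles those two instructions by hand. It relies only on the standard 32-bit x86 encodings `68 imm32` (push imm32) and `E9 rel32` (jmp rel32); the function name `build_vm_entry_stub` and all addresses are hypothetical.

```python
import struct

def build_vm_entry_stub(stub_addr, bytecode_addr, vstartvm_addr):
    """Return the 10 machine-code bytes for `push bytecode_addr` / `jmp vstartvm_addr`."""
    push = b"\x68" + struct.pack("<I", bytecode_addr)   # push imm32
    # rel32 is measured from the address of the instruction after the 5-byte jmp.
    next_ip = stub_addr + len(push) + 5
    rel32 = (vstartvm_addr - next_ip) & 0xFFFFFFFF       # wrap negative offsets to 32 bits
    jmp = b"\xE9" + struct.pack("<I", rel32)             # jmp rel32
    return push + jmp

# Hypothetical addresses chosen only for the example.
stub = build_vm_entry_stub(0x401000, 0x405000, 0x406000)
print(stub.hex())
```

The stub is much shorter than most protected functions, so the remaining bytes of the original code can simply be erased.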
VSTARTVM is the routine that enters the virtual machine. Its code looks roughly like this:
VSTARTVM:
    push eax                    ; push the registers onto the stack; pseudo-instructions later pop them into the Vmcontext. The order can be randomized for obfuscation
    push ebx
    push ecx
    push edx
    push esi
    push edi
    push ebp
    pushfd
    mov esi, [esp+0x20]         ; [esp+0x20] is the parameter of VSTARTVM: the memory address of the pseudo-code
    mov ebp, esp                ; ebp points to the current stack top
    sub esp, 0x200              ; reserve 0x200 bytes on the stack to hold the Vmcontext
    mov edi, esp                ; edi points to the Vmcontext
    sub esp, 0x40               ; from here on is the stack the VM actually uses (not necessarily 0x40 bytes)
Vmdispatcher:
    movzx eax, byte ptr [esi]   ; fetch one byte of bytecode, zero-extended to 32 bits
    lea esi, [esi+1]            ; advance to the next byte
    jmp dword ptr [eax*4+JUMPADDR] ; jump to the handler; the table is filled in by the protector engine
    ; Each time a byte is read, execution jumps through the function table to the code that emulates it. (Stack CPU instructions are short; 1 byte is enough.)
    ; JUMPADDR is a function table (somewhat like a vtbl or a switch-case table).
Vm_end:                         ; VM end marker
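The fetch, advance, and table-jump cycle of the dispatcher can be modeled in Python. In this sketch `ip` plays the role of ESI, the `handlers` list plays the role of the JUMPADDR table, and a dictionary keyed by offset stands in for the Vmcontext. The opcode numbering, the `encode` helper, and the `V_END` opcode are assumptions made for the example; a real protector assigns its own opcodes.

```python
import struct

MASK32 = 0xFFFFFFFF
# Hypothetical opcode numbering used only in this sketch.
V_PUSH_REG32, V_PUSH_IMM32, V_POP_REG32, V_ADD, V_JMP, V_END = range(6)

def encode(opcode, operand=None):
    """Encode one VM instruction: a 1-byte opcode plus an optional 32-bit operand."""
    body = bytes([opcode])
    return body if operand is None else body + struct.pack("<I", operand)

def run(bytecode, context):
    """Interpret bytecode; `context` maps register offsets to 32-bit values."""
    stack, ip = [], 0

    def fetch_dword():
        nonlocal ip
        (value,) = struct.unpack_from("<I", bytecode, ip)
        ip += 4
        return value

    def v_push_reg32():
        stack.append(context[fetch_dword()])

    def v_push_imm32():
        stack.append(fetch_dword())

    def v_pop_reg32():
        context[fetch_dword()] = stack.pop()

    def v_add():
        dest, src = stack.pop(), stack.pop()
        stack.append((dest + src) & MASK32)

    def v_jmp():
        nonlocal ip
        ip = stack.pop()            # the jump target was pushed beforehand

    handlers = [v_push_reg32, v_push_imm32, v_pop_reg32, v_add, v_jmp]

    while True:
        opcode = bytecode[ip]       # fetch one byte of bytecode
        ip += 1                     # advance the virtual instruction pointer
        if opcode == V_END:
            return context
        handlers[opcode]()          # indexed jump through the handler table

EAX, ESI = 0x00, 0x10               # offsets of the virtual registers
program = b"".join([
    encode(V_PUSH_REG32, EAX),
    encode(V_PUSH_REG32, ESI),
    encode(V_ADD),
    encode(V_POP_REG32, ESI),
    encode(V_END),
])
result = run(program, {EAX: 5, ESI: 7})
print(result[ESI])
```

The example program is the bytecode form of `add esi, eax`, so the virtual ESI ends up holding the sum of the two starting values.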
After VSTARTVM has finished initializing, the stack looks like this:


Low addresses
...
+------------------------------+
| VM working stack             |   0x40 bytes (not necessarily this size)
+------------------------------+ <- EDI points here (Vmcontext)
| Vmcontext                    |   0x200 bytes
+------------------------------+ <- EBP points here
| EFLAGS                       |   saved flags register
| EBP                          |   \
| EDI                          |    |
| ESI                          |    |
| EDX                          |    |  saved registers
| ECX                          |    |
| EBX                          |    |
| EAX                          |   /
+------------------------------+
| pseudo-code start address    |   virtual machine parameter; ESI is loaded with this value
+------------------------------+
...
High addresses

EDI points to the Vmcontext, ESI points to the pseudo-code, and EBP points to the top of the real stack. These three registers keep these roles for as long as the VM is running.
Vmcontext is the virtual environment structure used by the virtual machine:
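struct Vmcontext
{
    DWORD v_eax;    // offset 0x00
    DWORD v_ebx;    // offset 0x04
    DWORD v_ecx;    // offset 0x08
    DWORD v_edx;    // offset 0x0c
    DWORD v_esi;    // offset 0x10
    DWORD v_edi;    // offset 0x14
    DWORD v_ebp;    // offset 0x18
    DWORD v_efl;    // offset 0x1c
};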
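The layout of this structure determines the offsets that appear as operands in the bytecode. The following Python sketch models it with a byte buffer and the standard `struct` module, assuming eight consecutive little-endian 32-bit fields in the order listed; the lowercase field names and the helpers `read_reg` and `write_reg` are hypothetical.

```python
import struct

# Field order follows the Vmcontext structure; each field is one 32-bit DWORD.
FIELDS = ["v_eax", "v_ebx", "v_ecx", "v_edx", "v_esi", "v_edi", "v_ebp", "v_efl"]
OFFSETS = {name: 4 * index for index, name in enumerate(FIELDS)}

def read_reg(context, offset):
    """Read a virtual register, like `mov eax, dword ptr [edi+eax]`."""
    return struct.unpack_from("<I", context, offset)[0]

def write_reg(context, offset, value):
    """Write a virtual register, like `pop dword ptr [edi+eax]`."""
    struct.pack_into("<I", context, offset, value & 0xFFFFFFFF)

context = bytearray(0x200)   # the article reserves 0x200 bytes for the Vmcontext
write_reg(context, OFFSETS["v_esi"], 0x1234)
print(hex(OFFSETS["v_efl"]), hex(read_reg(context, OFFSETS["v_esi"])))
```

Only the first 0x20 bytes of the reserved area are used by these eight fields; the article does not say what the remainder holds.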
The VM keeps its register structure on the stack for the sake of multithreading compatibility: each thread has its own stack and therefore its own Vmcontext.
Anyone who has unpacked protected programs knows the principle of stack balance. Similarly, the virtual machine must not arbitrarily change the original stack addresses while executing the translated code. It also has to check frequently that the VMCONTEXT structure on the stack has not been overwritten.


VCHECKESP:
    lea eax, dword ptr [edi+0x100] ; eax = Vmcontext base + 0x100
    cmp eax, ebp                ; compare against ebp to measure the distance between ebp and edi (Vmcontext)
    jl Vmdispatcher             ; if it is smaller, continue normally
    mov edx, edi
    mov ecx, esp
    sub ecx, edx                ; ecx = esp - edi
    push esi                    ; save esi
    mov esi, esp
    sub esp, 0x60
    mov edi, esp                ; edi = esp after lowering it by 0x60
    push edi                    ; save the new edi address
    sub esp, 0x40
    cld
    rep movsb                   ; copy ecx bytes from [esi] to [edi]
    pop edi
    pop esi
    jmp Vmdispatcher

http://blog.csdn.net/chence19871/article/details/49798701
