Discussion on software protection using virtual machine technology


http://www.paper.edu.cn
Zhang Lu
School of Computer Science and Technology, Beijing University of Posts and Telecommunications, Beijing (100876)
E-mail: dwingg@gmail.com
Abstract: This paper presents a new method of software protection based on virtual machine principles. It describes the complete implementation steps of the method and, at each step, discusses in detail the practical techniques that can be used in real development. Used comprehensively and flexibly, virtual machine principles and obfuscation techniques can raise the strength of software protection to a new level.
Keywords: software protection, virtual machine, encoding, decoding, obfuscation
CLC number: TP309




1. Introduction
With the widespread adoption of the Internet, software piracy has become increasingly serious, and the interests of commercial software and shareware authors are being seriously infringed. Although released software generally does not include source code, software reverse engineering techniques keep improving and the commonly used IA-32 architecture has been studied extensively. Even a binary machine code program can still be analyzed to some extent, which can render the software's registration and restriction measures ineffective.
To prevent reverse engineering of software, some protection measures already exist [1], mainly encrypting the application as a whole and detecting debuggers. However, the program must be decrypted at run time in order to run correctly, and the debugger check is usually performed only before the program's original entry point, so these methods are unreliable. This paper presents a new protection mechanism that uses virtual machine technology to address this problem effectively.
A virtual machine is defined relative to a physically existing computer: it is a virtual computer that has all the working capabilities of a real one [2]. It uses software to simulate one instruction set so that code for it can run on a hardware device that uses another instruction set; apart from slower execution, there is no functional difference. Virtual machines are mainly used to unify the platform on which programs run, to analyze and debug programs, and to simulate software and hardware. Although software simulation is much slower than direct hardware execution, hardware keeps getting faster and special needs persist, so the technology is still widely used in research, business and personal applications. The instruction set that the software simulates can belong to a real hardware device, or it can be purely virtual. Because analyzing software instructions requires mastering the instruction set first, software that contains code in an uncommon or undocumented instruction set, even a short piece, is very difficult to analyze. Combined with the anti-analysis techniques proposed later in this paper, this makes for very effective software protection.
2. Virtual Machine Software Protection process

Virtual machine software protection can be divided into two parts. The first is the protection part, in which the code blocks to be protected are converted into virtual machine code. The second is the execution part: after the protected software has been released and delivered to the user, the virtual machine code must be executed correctly.
The overall process of protecting the original code is shown in Figure 1:

2.1  Extract the code block to be protected
Because conversion to virtual machine code greatly increases code size and greatly reduces execution speed, it is generally unnecessary to convert the entire program. It is enough to select the code blocks that handle software registration, or that must be kept confidential and should resist reverse analysis. In general, these parts are few and do not require high run-time efficiency.
Because of the special nature of virtual machine instructions, a protected block should be entered at only one place; if it has other entry points, it is very difficult to determine which virtual machine instruction each of them corresponds to. The usual practice is to take a function or procedure body as the unit, which ensures that the generated code block has a single entry.
Since most software is developed in high-level languages, locating the blocks to be protected can be simplified by adding special markers, which do not affect program operation, before and after the high-level language code to be protected. A tool can then locate and extract the blocks automatically.
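A minimal sketch of marker-based extraction is given here for illustration. The marker byte strings and the function name `find_protected_blocks` are assumptions made for this example; the paper does not specify what the markers look like, and a real tool would choose patterns that the compiler emits unchanged and that cannot occur in ordinary code.

```python
from typing import List, Tuple

BEGIN_MARK = b"<VMP_BEGIN>"   # hypothetical marker, not taken from the paper
END_MARK = b"<VMP_END>"       # hypothetical marker, not taken from the paper


def find_protected_blocks(image: bytes) -> List[Tuple[int, bytes]]:
    """Return (offset, code) for every block enclosed by the two markers."""
    blocks = []
    pos = 0
    while True:
        start = image.find(BEGIN_MARK, pos)
        if start == -1:
            break
        body = start + len(BEGIN_MARK)
        end = image.find(END_MARK, body)
        if end == -1:
            raise ValueError("begin marker at %d has no matching end marker" % start)
        blocks.append((body, image[body:end]))
        pos = end + len(END_MARK)
    return blocks


# Two filler bytes, one marked block of three bytes, one trailing byte.
image = b"\x90\x90" + BEGIN_MARK + b"\x55\x8b\xec" + END_MARK + b"\xc3"
blocks = find_protected_blocks(image)
```

Each returned offset points at the first byte after the begin marker, so the later write-back step knows where the block sits in the file.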
2.2  Machine instruction decoding
The code block extracted in the previous step is the raw machine instruction stream. To reduce the complexity of converting it to virtual machine instructions, and to handle machine instructions from different platforms uniformly, the original machine instructions are first transformed into intermediate instructions that are easy to understand and analyze.
These intermediate instructions are widely applicable and can represent instructions from various platforms. All intermediate instructions use the uniform format (operator, first operand, second operand, ...). The operator is the core of the instruction and expresses its action, such as transfer, arithmetic or jump. Each operand consists of (access width, addressing mode, address parameters). The access width is generally 8, 16, 32 or 64 bits. Even the complex addressing of current CISC architectures can be reduced to immediate addressing, register addressing, and memory addressing through registers in its various forms. In this way the instruction streams of various platforms are converted into a unified intermediate instruction sequence and passed to the next transformation.
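The uniform format can be pictured with a small data model. The sketch below is only an illustration: the class names `Mode`, `Operand` and `Instr` and the function `decode_mov_eax_imm32` are assumptions, and the decoder handles a single IA-32 encoding (`B8 imm32`, that is, `mov eax, imm32`) rather than a full instruction set.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Mode(Enum):
    IMMEDIATE = "imm"   # the operand is the constant itself
    REGISTER = "reg"    # the operand is a register
    MEMORY = "mem"      # the operand is memory addressed through registers


@dataclass(frozen=True)
class Operand:
    width: int          # access width in bits: 8, 16, 32 or 64
    mode: Mode          # addressing mode
    params: Tuple       # address parameters; their layout depends on the mode


@dataclass(frozen=True)
class Instr:
    op: str                         # operator, e.g. "mov", "add", "jmp"
    operands: Tuple[Operand, ...]   # first operand, second operand, ...


def decode_mov_eax_imm32(code: bytes) -> Instr:
    """Decode the IA-32 form `B8 imm32` (mov eax, imm32) into the uniform format."""
    if len(code) != 5 or code[0] != 0xB8:
        raise ValueError("not a 'mov eax, imm32' encoding")
    value = int.from_bytes(code[1:5], "little")
    return Instr("mov", (Operand(32, Mode.REGISTER, ("eax",)),
                         Operand(32, Mode.IMMEDIATE, (value,))))


ir = decode_mov_eax_imm32(bytes.fromhex("b878563412"))
```

A decoder for another platform would emit the same `Instr` objects, which is what lets the later steps ignore the source architecture.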
2.3  Instruction transformation
Once the intermediate code is in a uniform format, it can be converted to virtual machine instructions. This is the most important step in the entire protection process. A virtual machine instruction set differs from a hardware instruction set: it is not constrained by hardware design, so it can be designed as far as possible in the direction that favors software protection. To this end, the following points are considered:
(1) Choose a reduced instruction set. A RISC-style instruction set makes instruction sequences harder to analyze, and it also simplifies code conversion and the design and implementation of the virtual machine interpreter.
(2) Expand the number of registers and number them uniformly, including all special registers such as the instruction pointer and the status register.
(3) Eliminate jump instructions. Jumps can be implemented by reading and writing the instruction pointer directly; without explicit jump instructions, analysis of the program flow becomes more confusing.
As can be seen, these measures increase the number of converted virtual machine instructions, but this is no drawback for software protection.
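Points (2) and (3) can be illustrated with a toy machine in which the instruction pointer is simply register 0, so that a jump is an ordinary register write. The three operators, the `(op, dst, src)` triple format and the function `run` are assumptions made for this sketch, not the instruction set of the paper.

```python
IP = 0  # assumed convention: register 0 is the instruction pointer


def run(program, nregs=8, max_steps=1000):
    """Execute (op, dst, src) triples and return the final register file."""
    regs = [0] * nregs
    for _ in range(max_steps):
        if regs[IP] >= len(program):
            return regs
        op, dst, src = program[regs[IP]]
        regs[IP] += 1            # advance first, so a write to r0 overrides it
        if op == "set":          # dst <- constant
            regs[dst] = src
        elif op == "mov":        # dst <- register
            regs[dst] = regs[src]
        elif op == "add":        # dst <- dst + register
            regs[dst] += regs[src]
        else:
            raise ValueError("unknown operator: %r" % (op,))
    raise RuntimeError("step limit exceeded")


program = [
    ("set", 1, 5),     # 0: r1 = 5
    ("set", IP, 3),    # 1: a "jump" to index 3, written as a plain register store
    ("set", 1, 99),    # 2: never executed
    ("add", 1, 1),     # 3: r1 = r1 + r1
]
regs = run(program)
```

Nothing in the instruction stream is marked as a branch; an analyst has to track every write to register 0 to recover the control flow.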
2.4 Obfuscation
If the above methods are not enough to hinder a reverse analyst, the converted virtual machine code can additionally be subjected to random scrambling of controllable strength, which is known as "code obfuscation" [3]. The main methods are as follows:
2.4.1 Indirect use of constants
Whether in the original machine code or in directly converted virtual machine code, the constants in instructions are unchanged, which gives the reverse analyst clues, especially the address offset constants of indirect jump instructions. To eliminate these clues, every single instruction that uses a constant can be converted into several instructions that generate the constant through a series of operations. The generation algorithm can be complex and varied, and can be chosen at random according to the required encryption strength.
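One possible generation algorithm is sketched here, assuming 32-bit constants and only three reversible operators (add, subtract, exclusive-or). The function names and the `steps` parameter, which stands in for the encryption strength, are assumptions of this example.

```python
import random

MASK = 0xFFFFFFFF  # assumption: constants are 32 bits wide


def split_constant(value, steps, rng):
    """Return [("set", k0), (op1, k1), ...], a sequence that evaluates to `value`."""
    ops = []
    acc = value & MASK
    # Work backwards: choose the last operation at random, then compute what
    # the accumulator must have held before that operation was applied.
    for _ in range(steps):
        k = rng.getrandbits(32)
        op = rng.choice(["add", "sub", "xor"])
        if op == "add":
            acc = (acc - k) & MASK
        elif op == "sub":
            acc = (acc + k) & MASK
        else:
            acc ^= k
        ops.append((op, k))
    ops.append(("set", acc))
    ops.reverse()
    return ops


def evaluate(ops):
    """Run the sequence forwards, as the virtual machine would."""
    acc = 0
    for op, k in ops:
        if op == "set":
            acc = k
        elif op == "add":
            acc = (acc + k) & MASK
        elif op == "sub":
            acc = (acc - k) & MASK
        else:
            acc ^= k
    return acc


seq = split_constant(0x00401000, steps=3, rng=random.Random(1))
```

The original constant no longer appears literally in the sequence unless by coincidence, and a different random seed yields a different sequence for the same constant.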
2.4.2 Meaningless instruction padding
Adding some trivial random instructions to the code has no effect on the correct execution of the program, but it undoubtedly adds a great deal of interference to the reverse analyst's work. These meaningless instructions can be interspersed at random among the valid instructions, in a quantity that depends on the encryption strength.
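The idea can be sketched on a toy `(op, dst, src)` instruction format. Everything here is an assumption made for illustration: the format, the `density` parameter standing in for the encryption strength, and the simplification that junk only writes scratch registers the real code never touches. The sketch also assumes straight-line code; with jumps present, a real tool would have to fix up the targets after insertion.

```python
import random


def run(program, nregs=8):
    """Execute a straight-line list of (op, dst, src) triples."""
    regs = [0] * nregs
    for op, dst, src in program:
        if op == "set":        # dst <- constant
            regs[dst] = src
        elif op == "mov":      # dst <- register
            regs[dst] = regs[src]
        elif op == "add":      # dst <- dst + register
            regs[dst] += regs[src]
        else:
            raise ValueError("unknown operator: %r" % (op,))
    return regs


def insert_junk(program, scratch_regs, density, rng):
    """Interleave junk writes to registers that the real code never uses."""
    if not 0 <= density < 1:
        raise ValueError("density must be in [0, 1)")
    out = []
    for instr in program:
        # Emit a random number of junk instructions before each real one.
        while rng.random() < density:
            out.append(("set", rng.choice(scratch_regs), rng.getrandbits(16)))
        out.append(instr)
    return out


program = [("set", 1, 3), ("set", 2, 4), ("add", 1, 2)]
padded = insert_junk(program, scratch_regs=[6, 7], density=0.5, rng=random.Random(7))
```

Registers 1 to 5 end with the same values in both versions; only the scratch registers differ.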
2.4.3 Invalid instruction padding
In places that the virtual machine code logic never executes, incomplete or undefined instruction codes can be added, which interferes to some extent with static reverse analysis. Because the instruction set designed above has no explicit absolute jump instruction, a few bytes of invalid instructions can generally be added after an implicit absolute jump.
2.4.4 Instruction reordering
Two instructions are interchangeable if they do not access the same register or the same memory and do not affect the status register or the jump behavior. Following this principle, adjacent instructions can be swapped wherever possible. This not only introduces small-scale confusion into the program flow, it also blends meaningless instructions with valid ones and strengthens the effect of the meaningless instructions. Figure 2 gives an example of reordering an instruction fragment:
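Independently of the figure, the interchange rule can be sketched on a toy `(op, dst, src)` format. The format and all function names are assumptions of this example, and the toy machine has no memory or status register, so only the register and jump conditions are checked.

```python
import random

IP = 0  # assumed convention: register 0 is the instruction pointer


def reads_writes(instr):
    """Return (registers read, registers written) for one (op, dst, src) triple."""
    op, dst, src = instr
    reads = set()
    if op in ("mov", "add"):
        reads.add(src)
    if op == "add":
        reads.add(dst)
    return reads, {dst}


def interchangeable(a, b):
    ra, wa = reads_writes(a)
    rb, wb = reads_writes(b)
    if IP in (ra | wa | rb | wb):      # touching the IP means a jump: keep in place
        return False
    # Neither instruction may write a register the other one reads or writes.
    return not (wa & (rb | wb)) and not (wb & ra)


def reorder(program, passes, rng):
    """Randomly swap adjacent interchangeable instructions."""
    out = list(program)
    for _ in range(passes):
        for i in range(len(out) - 1):
            if interchangeable(out[i], out[i + 1]) and rng.random() < 0.5:
                out[i], out[i + 1] = out[i + 1], out[i]
    return out


def run(program, nregs=8):
    """Execute a straight-line list of triples and return the registers."""
    regs = [0] * nregs
    for op, dst, src in program:
        if op == "set":
            regs[dst] = src
        elif op == "mov":
            regs[dst] = regs[src]
        else:                          # "add"
            regs[dst] += regs[src]
    return regs


program = [("set", 1, 3), ("set", 2, 4), ("mov", 3, 1),
           ("add", 3, 2), ("set", 4, 7), ("add", 4, 1)]
shuffled = reorder(program, passes=4, rng=random.Random(2))
```

Because every swap is checked against the dependency rule, the shuffled sequence computes the same register values as the original one.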

2.4.5 Code flow twist transformation
In a normal program, instructions execute one after another in order unless there is a jump. If this order is disrupted, the number of instructions executed consecutively in sequence is reduced to some extent, which makes reverse analysis harder and also raises the proportion of invalid instruction padding. This transformation can be called the "twist transformation" [4]; an example is shown in Figure 3:

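A simplified version of the twist transformation can be sketched as follows: the code is cut into fixed-size blocks, the blocks are stored in random order, and each block ends with a write to the instruction pointer that leads to its logical successor. The toy `(op, dst, src)` format, the convention that register 0 is the instruction pointer, and the restriction to jump-free input are assumptions of this example and are not taken from [4].

```python
import random

IP = 0  # assumed convention: register 0 is the instruction pointer


def twist(program, block_size, rng):
    """Shuffle fixed-size blocks and chain them with writes to the IP."""
    blocks = [program[i:i + block_size] for i in range(0, len(program), block_size)]
    order = list(range(len(blocks)))
    rng.shuffle(order)
    # Layout: slot 0 holds the entry jump; each block is followed by one jump slot.
    start, addr = {}, 1
    for b in order:
        start[b] = addr
        addr += len(blocks[b]) + 1
    end = addr
    out = [("set", IP, start[0] if blocks else end)]
    for b in order:
        out.extend(blocks[b])
        out.append(("set", IP, start[b + 1] if b + 1 < len(blocks) else end))
    return out


def run(program, nregs=8, max_steps=1000):
    """Execute triples; a write to register 0 acts as a jump."""
    regs = [0] * nregs
    for _ in range(max_steps):
        if regs[IP] >= len(program):
            return regs
        op, dst, src = program[regs[IP]]
        regs[IP] += 1
        if op == "set":
            regs[dst] = src
        elif op == "mov":
            regs[dst] = regs[src]
        else:                          # "add"
            regs[dst] += regs[src]
    raise RuntimeError("step limit exceeded")


program = [("set", 1, 1), ("set", 2, 2), ("add", 1, 2), ("mov", 3, 1), ("add", 3, 3)]
twisted = twist(program, block_size=2, rng=random.Random(3))
```

The gaps that follow each chaining jump are never reached by execution, which is where invalid padding could be placed.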
2.5 Encoding
This step translates the virtual machine instructions into binary code. Instruction encodings are normally required to be uniform and regular so that they are easy to encode and decode, which is particularly important in hardware implementations. For software protection, however, the encoding should be made harder to reverse, so a special encoding strategy is needed.
The simplest and most effective basic method is randomized encoding: each time protection is applied, the binary codes for the instruction operators and register numbers are chosen at random. The encoding can be one-to-many, that is, several binary codes can correspond to one instruction, and redundant flag bits filled with random values can also be added. These methods have to work together with the virtual machine interpreter, which must decode with the same tables.
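A one-to-many randomized operator encoding might look like the following sketch. The operator names, the one-byte opcode width and the function names are assumptions made for illustration; register numbers could be permuted in the same manner.

```python
import random

OPS = ["set", "mov", "add"]  # hypothetical operator names


def make_tables(rng):
    """Randomly distribute all 256 byte values among the operators."""
    codes = list(range(256))
    rng.shuffle(codes)
    # Give every operator at least one code, then deal out the rest at random.
    decode = {code: OPS[i] for i, code in enumerate(codes[:len(OPS)])}
    for code in codes[len(OPS):]:
        decode[code] = rng.choice(OPS)
    encode = {op: [c for c, o in decode.items() if o == op] for op in OPS}
    return encode, decode


def encode_op(op, encode, rng):
    """Pick one of the many byte values that stand for `op`."""
    return rng.choice(encode[op])


rng = random.Random(5)
encode, decode = make_tables(rng)
samples = [(op, encode_op(op, encode, rng)) for op in OPS for _ in range(20)]
```

The `decode` table is what gets embedded in the interpreter generated for this particular build, so two protected copies of the same program need not share any opcode values.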
2.6 Write back the original program file
The resulting virtual machine binary code is written back into the original program file, and the virtual machine interpreter is written in as well. After the series of transformations and obfuscation steps, the virtual machine code may be much larger than the original machine code, so it is generally necessary to enlarge the original program and attach the virtual machine code and the interpreter to a free area or to the end of the program. The original protected code area can be cleared, or used to store part of the virtual machine code or the interpreter (as shown in Figure 4). At the entry of each protected code block, an instruction that jumps into the virtual machine interpreter is written, carrying the entry address of the corresponding virtual machine code as a parameter.


The virtual machine code can additionally be compressed and encrypted, and then decrypted and decompressed by the interpreter as it executes. This step matters little for software protection, however, because an analyst can easily trace the interpreter and fully recover the compressed and encrypted virtual machine code.
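The write-back step can be sketched on a raw byte image. This is a strongly simplified illustration: it assumes a flat image in which file offsets equal run-time addresses, an interpreter already present at `interp_off`, and an IA-32 entry stub made of `push imm32` (opcode 0x68) followed by `jmp rel32` (opcode 0xE9). The stub layout and the function name `write_back` are assumptions; a real executable format would also need its headers and section table updated.

```python
import struct

STUB_LEN = 10  # push imm32 (5 bytes) + jmp rel32 (5 bytes)


def write_back(image, block_off, block_len, interp_off, vm_code):
    """Append vm_code, blank the original block, and install an entry stub."""
    if block_len < STUB_LEN:
        raise ValueError("protected block is too small for the entry stub")
    out = bytearray(image)
    vm_off = len(out)                 # the virtual machine code goes at the tail
    out += vm_code
    # rel32 is measured from the address of the byte that follows the jmp.
    rel = interp_off - (block_off + STUB_LEN)
    stub = (b"\x68" + struct.pack("<I", vm_off) +    # push <vm code entry>
            b"\xE9" + struct.pack("<i", rel))        # jmp  <interpreter>
    # Fill the rest of the original block with int3 (0xCC) bytes.
    out[block_off:block_off + block_len] = stub + b"\xCC" * (block_len - STUB_LEN)
    return bytes(out), vm_off


image = bytes(64)                     # stand-in for the original program file
patched, vm_off = write_back(image, block_off=16, block_len=12,
                             interp_off=40, vm_code=b"\x01\x02\x03")
```

Here the pushed value plays the role of the entry address parameter: the interpreter pops it to learn which piece of virtual machine code to run.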

2.7 Virtual Machine Interpreter
The six steps above constitute the protection of the software. The interpreter, by contrast, is an independent component attached to the distributable program file. Its primary role is to interpret the virtual machine instruction set designed and dynamically generated above. Because that instruction set is intended mainly for software protection and was designed from the start as a RISC-style set, the virtual machine interpreter can be written quite simply; the complexity of the virtual machine code comes mainly from the degree of code obfuscation. The interpreter is also responsible for saving and restoring the hardware registers when entering and leaving the interpreter and before and after calling external program segments. The interpreter itself should avoid using the system stack as far as possible, since otherwise it would conflict with the stack usage of the virtual machine instructions.
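The core of such an interpreter is a fetch, decode and dispatch loop that works on a private copy of the machine state. The sketch below assumes three-byte instructions `(opcode, dst, src)`, a per-build `decode` table that maps opcode bytes to operator names, and a plain list standing in for the hardware registers. All of these are assumptions for illustration, and Python cannot show the point about avoiding the system stack.

```python
def interpret(code, decode, native_regs):
    """Run encoded (opcode, dst, src) byte triples on a private register context."""
    regs = list(native_regs)          # entry: save the caller's registers into the context
    ip = 0

    def op_set(dst, src):             # dst <- constant
        regs[dst] = src

    def op_mov(dst, src):             # dst <- register
        regs[dst] = regs[src]

    def op_add(dst, src):             # dst <- dst + register, 32-bit wraparound
        regs[dst] = (regs[dst] + regs[src]) & 0xFFFFFFFF

    handlers = {"set": op_set, "mov": op_mov, "add": op_add}

    while ip + 3 <= len(code):
        opcode, dst, src = code[ip:ip + 3]      # fetch
        ip += 3
        handlers[decode[opcode]](dst, src)      # decode and dispatch
    return regs                       # exit: values to restore into the real registers


# Two different bytes (0x10 and 0x9A) both mean "set": a one-to-many encoding.
decode = {0x10: "set", 0x9A: "set", 0x22: "add"}
code = bytes([0x10, 1, 5,  0x9A, 2, 7,  0x22, 1, 2])
native = [0, 0, 0, 0]
result = interpret(code, decode, native)
```

The caller's register list is left untouched until the interpreter returns, which mirrors the save and restore duty described above.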
3. Summary
In theory, as long as the software runs on the user's platform, there is no way to completely prevent it from being modified by some means so that its protection is lost [5]. Delaying the time needed to break the protection, however, is achievable to a certain extent. By combining virtual machine technology with the various related anti-analysis techniques, software protection can reach a new level.
4. Acknowledgments
Thanks to my advisor, Professor Lacquer Tao, for the revisions suggested for this paper.

References
[1] Xiongli, Dong Hengqing. Development status and prospect of software protection technology [J]. Journal of Software, 2005, (19): 41-44.
[2] gross speed. Application of virtual machines [J]. Ningxia Engineering Technology, 2003, 2(2): 154-156.
[3] Laurent, Kanjianchen, Zeng. Code obfuscation techniques for software protection [J]. Computer Engineering, 2006, 32(11): 177-179.
[4] Liu Tao Tao. Twist transform encryption [EB/OL]. http://liutaotao.com/nqby.txt, 2006-7-7.
[5] Shi Lijuan. Research on software protection scheme [J]. Agricultural Network Information, 2006, (6): 124, 125, 129.

Software Protection Using Virtual Machine
Zhang Lu
Department of Computer Science and Technology, Beijing University of Posts and Telecommunications, Beijing (100876)
Abstract
This paper provides a new approach to software protection using virtual machine theory and describes the whole process of implementing it. It then discusses in detail some practical techniques for every step of the development. Using virtual machine theory and obfuscation techniques in an integrated and flexible way can raise software protection strength to a new level.
Keywords: software protection, virtual machine, encode, decode, obfuscation





Author profile: Zhang Lu, male, born in 1982, master's degree. His main research interests are data compression and encryption, and he also has a deep understanding of and research experience in software protection technology.

