Overview of existing software protection technologies
Traditional software protection technology can be divided, according to the tool it defends against, into anti-static debugging and anti-dynamic debugging. Anti-static debugging mainly targets the disassembler. A disassembler uses a platform-specific disassembly engine (on the PC platform, an x86 disassembly engine) to turn the binary files generated by the compiler back into assembly code, from which experienced reverse engineers can recover core mechanisms such as algorithms. Anti-static debugging protects this core information by encrypting specific segments of the binary and restoring them only at runtime, through decryption or similar algorithms, so that the disassembler cannot statically turn the binary file back into assembly code.
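The idea of encrypting a segment and restoring it only at runtime can be sketched as follows. This is a minimal illustration, not how any particular protection product works: the XOR cipher, the key, and the names CORE_SEGMENT and xor_bytes are assumptions made for the example, and real protectors use stronger schemes and decrypt into executable memory.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key (placeholder cipher)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical stand-in for a protected code segment:
# the x86 encoding of "mov eax, 3" followed by "ret".
CORE_SEGMENT = bytes([0xB8, 0x03, 0x00, 0x00, 0x00, 0xC3])
KEY = b"\x5a\xa5"  # assumed key, chosen only for illustration

# What is stored in the file: a disassembler sees these bytes, not the code.
encrypted = xor_bytes(CORE_SEGMENT, KEY)

# What the program rebuilds in memory just before the segment runs.
decrypted = xor_bytes(encrypted, KEY)
```

Because XOR is its own inverse, the same function serves for both directions. The bytes on disk no longer decode to the original instructions, while the bytes in memory at runtime do.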
Anti-dynamic debugging mainly targets the debugger. Because statically encrypted binary code must be decrypted before it executes, it can still be inspected with a dynamic debugger such as OllyDbg. Anti-dynamic debugging therefore detects the debugger and blocks the debugging port, which prevents reverse engineers from tracking the running state of the software process, so that the running software remains a black box.
Passive software protection concepts
The two protection schemes above both adopt an active strategy, with the intention of "keeping the enemy outside the gates". The central idea is the single word "block": block reverse engineers from getting a glimpse of the internal mechanism of the software. But the contest between shield and spear never ends, and no active software protection technique can completely block reverse engineering. For this reason another, "people-oriented" approach, passive software protection technology, has begun to move to the center of the stage.
Passive software protection methods start from the assumption that reverse engineers have already broken through the active protection measures by one method or another and can observe whatever they wish. At that point they can no longer be blocked; the only remaining strategy is to "hide": make the core code harder to find, and so increase the time reverse engineers must spend reading the disassembled code, which indirectly protects the software. Passive software protection is implemented mainly through disorder and obfuscation.
Disorder adds jump instructions, such as jmp, to the program's execution stream and uses them to split a code block that originally executed from top to bottom into several code segments whose order in the file no longer matches the order of execution. Disorder makes the disassembled code harder to read to a certain extent. However, unconditional jumps that are simply inserted are easy to recognize and remove, so industrial-grade protection products often use conditional jumps to make recognition harder.
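The effect of disorder can be sketched with a small simulation. The functions scramble and run are hypothetical names introduced for this example; each "block" is just a label, and the jump at the end of a block is modelled as the index of the block to execute next. Only the unconditional-jump form is shown.

```python
import random


def scramble(blocks, seed=0):
    """Place blocks in shuffled order; each one ends with a jump to its successor.

    Returns (layout, entry): layout is a list of (block, next_index) pairs in
    their on-disk order, and entry is the index of the original first block.
    """
    if not blocks:
        return [], None
    order = list(range(len(blocks)))
    random.Random(seed).shuffle(order)
    # position[i] = where original block i ended up in the shuffled layout
    position = {orig: pos for pos, orig in enumerate(order)}
    layout = []
    for orig in order:
        nxt = position[orig + 1] if orig + 1 < len(blocks) else None
        layout.append((blocks[orig], nxt))
    return layout, position[0]


def run(layout, entry):
    """Follow the jumps from the entry point and record the execution order."""
    trace, pc = [], entry
    while pc is not None:
        block, pc = layout[pc]
        trace.append(block)
    return trace


blocks = ["A", "B", "C", "D"]
layout, entry = scramble(blocks, seed=1)
executed = run(layout, entry)  # same order as the original blocks
```

Reading layout from top to bottom gives the scrambled order a disassembler would show, while following the jumps reproduces the original sequence. It also shows why plain unconditional jumps are weak: anyone can follow them mechanically, exactly as run does.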
Obfuscation increases the difficulty of understanding the disassembled code by adding meaningless code (also known as flower instructions or junk code) or meaningful code. As the saying goes, the best place to hide a tree is in a forest. Obfuscation increases the total amount of disassembled code through code expansion, constructing a code "forest" in which the core code is hidden. The obfuscation code added in the early days was meaningless code with the effect of a nop, such as a matched pair of push and pop instructions. Because this kind of code has no practical significance, removing it does not affect the operation of the software. Experienced reverse engineers therefore often strip the meaningless obfuscation code before reading the disassembled code, which renders the obfuscation ineffective. To ensure that obfuscation code cannot simply be removed, industrial-grade protection products prefer meaningful obfuscation code. The core idea is equivalent substitution: several instructions are used to achieve the effect of one instruction in the core code. For example, the simplest assignment instruction, mov eax, 3, can be replaced with xor eax, eax; inc eax; inc eax; inc eax. The resulting value in eax is the same, but the number of instructions has quadrupled.
Whether passive software protection technology plays a substantial role in software security remains controversial in the industry. The main objection is that passive software protection only increases the difficulty and quantity of the disassembled code to be read, "dazzling" the reader without any substantial effect. This article argues that software security should not be understood simply as making software absolutely secure. It is a repeated game between attacker and defender, and between investment and return, and the attacker's labor is naturally one of the costs.
The amount of code a person can read per unit of time is roughly fixed. Increasing the difficulty and quantity of the disassembled code therefore increases the time needed to read it, which raises the attacker's labor cost and has a practical effect on software security.
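The equivalent-substitution idea from the obfuscation discussion above can be checked with a tiny register simulator. This is a sketch under stated assumptions: the execute function and the tuple encoding of instructions are invented for the example, only mov, xor and inc are modelled, and CPU flags are ignored (on real x86, xor and inc change flags while mov does not). The obfuscated sequence uses three inc instructions so that eax ends up holding 3.

```python
MASK32 = 0xFFFFFFFF


def execute(program):
    """Run a list of (op, *operands) tuples on a one-register toy machine."""
    regs = {"eax": 0xDEADBEEF}  # arbitrary garbage left in the register
    for op, *args in program:
        if op == "mov":    # mov reg, imm
            regs[args[0]] = args[1] & MASK32
        elif op == "xor":  # xor reg, reg
            regs[args[0]] = regs[args[0]] ^ regs[args[1]]
        elif op == "inc":  # inc reg
            regs[args[0]] = (regs[args[0]] + 1) & MASK32
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs


original = [("mov", "eax", 3)]
obfuscated = [
    ("xor", "eax", "eax"),  # eax = 0
    ("inc", "eax"),         # eax = 1
    ("inc", "eax"),         # eax = 2
    ("inc", "eax"),         # eax = 3
]

same_result = execute(original) == execute(obfuscated)
```

Both programs leave the same value in eax, yet the second is four times as long. Unlike a push/pop pair, none of its instructions can be deleted without changing the result.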
Virtual Machine Software Protection Technology
Idea of Virtual Machine Software Protection
Virtual machine software protection technology is a branch of passive software protection technology. Specifically, it is a variant of adding meaningful obfuscation code.
Virtual machine technology is widely used in the software field. According to the level at which it is applied, it can be divided into hardware abstraction layer virtual machines, operating system layer virtual machines, and software application layer virtual machines. The virtual machine used to protect software belongs to the software application layer. The same layer also includes high-level language virtual machines, such as the JVM, the runtime environment of the Java language, and the CLR, the runtime environment of the .NET languages. High-level languages use virtual machines for portability: the compiler does not directly generate native code that the machine can execute, but instead generates an intermediate code, byte-code. A virtual machine of the corresponding version is then installed in each machine environment to interpret and execute the byte-code, which achieves cross-platform execution.
Virtual machines used to protect software follow a similar procedure. The virtual machine protection software first "compiles" the core code of the protected target program. Note that what is compiled here is not the source file but the binary file. It generates byte-code with equivalent behavior and then adds a virtual machine interpretation engine to the software. When the user finally runs the software, the virtual machine interpretation engine reads the byte-code and interprets and executes it, so that the user sees the same behavior as before.
Realization of Virtual Machine Software Protection
Compile and generate byte-code
To design virtual machine protection software, you must first design a set of virtual machine instructions, that is, the byte-code instruction set table. Generating byte-code is then the process of converting the instruction stream of the original machine into the VM instruction stream.
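A minimal sketch of such a conversion is shown here. Everything in it is hypothetical: the four VM opcodes, their numeric values, the register numbering, and the translate function are assumptions made for illustration, and real protectors decode actual machine code rather than tuples. The target is a stack-based VM, so one original instruction becomes several VM instructions.

```python
# Hypothetical byte-code instruction set table (values chosen arbitrarily).
VM_PUSH_IMM = 0x01  # push the following 8-bit immediate
VM_PUSH_REG = 0x02  # push the register whose index follows
VM_ADD = 0x03       # pop two values, push their sum
VM_POP_REG = 0x04   # pop into the register whose index follows

REG_INDEX = {"eax": 0, "ebx": 1, "ecx": 2, "edx": 3}


def push_operand(src):
    """Emit byte-code that pushes a register or a small immediate."""
    if isinstance(src, int):
        if not 0 <= src <= 0xFF:
            raise ValueError("this sketch only supports 8-bit immediates")
        return [VM_PUSH_IMM, src]
    return [VM_PUSH_REG, REG_INDEX[src]]


def translate(native):
    """Convert (op, dst, src) tuples into a stack-machine byte-code stream."""
    bytecode = []
    for op, dst, src in native:
        if op == "mov":    # dst = src
            bytecode += push_operand(src)
            bytecode += [VM_POP_REG, REG_INDEX[dst]]
        elif op == "add":  # dst = dst + src
            bytecode += [VM_PUSH_REG, REG_INDEX[dst]]
            bytecode += push_operand(src)
            bytecode += [VM_ADD, VM_POP_REG, REG_INDEX[dst]]
        else:
            raise NotImplementedError(op)
    return bytes(bytecode)


bytecode = translate([("mov", "eax", 3), ("add", "eax", "ebx")])
```

Two original instructions turn into six VM instructions (eleven bytes), and no VM opcode corresponds directly to mov or to the register form of add.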
The virtual machine instruction set table should meet the following two design principles:
The first design principle is that the less the virtual machine instruction set table resembles the original machine instruction set table (the more orthogonal the two are), the better, and the higher the security factor. The worst case is a one-to-one correspondence between the VM instruction set table and the original machine instruction set table. The security factor of a VM protection program that uses such an instruction set approaches zero, because reverse engineers need only a simple conversion to restore the original code.
The other design principle is to come as close to Turing completeness as possible, so that the VM instruction set can express everything the original machine instructions can express. The more complete the coverage, the wider the range of code the virtual machine protection engine can protect and the higher its robustness. Ideally, the VM instruction set would replace the original machine instruction set completely and equivalently, and would be fully Turing complete. In practice, complete replacement costs too much or cannot realistically be implemented. Instructions such as FCLEX and FPTAN in the x86 instruction set are difficult to simulate, and the likelihood that the core code uses them is very small. Weighing cost against benefit, virtual machine instruction sets generally do not cover these "uncommon" instructions. For instructions that cannot be simulated, the program can exit the VM, execute them natively, and re-enter the VM with the result.
Execute byte-code
When the software runs, the byte-code generated by the compilation step is interpreted and executed by the virtual machine interpretation engine embedded in the software's executable file, in a read-and-dispatch loop.
The virtual machine interpretation engine is divided into two parts: the dispatcher and the handlers.
The dispatcher, as its name suggests, does the dispatching; it is the CPU of the virtual machine interpretation engine. It reads the byte-code and assigns each instruction to the corresponding handler for interpretation. A handler, literally "something that handles", implements one VM instruction in the platform's native code (on the PC platform, x86 instructions). The number of handlers is the same as the number of instructions in the VM instruction set.
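The dispatcher and handler split can be sketched as a small interpreter. The opcodes, the four-register layout and the class name VM are assumptions made for this example, and the handlers are written in Python here rather than in native x86 code. There is exactly one handler per VM instruction, and the dispatcher does nothing except read an opcode and call the matching handler.

```python
class VM:
    """Toy stack-based interpreter: one dispatcher loop, one handler per opcode."""

    def __init__(self, bytecode, regs=None):
        self.code = bytes(bytecode)
        self.pc = 0
        self.stack = []
        self.regs = list(regs) if regs is not None else [0, 0, 0, 0]
        self.running = True
        # Hypothetical instruction set table: opcode -> handler.
        self.handlers = {
            0x00: self.h_halt,
            0x01: self.h_push_imm,
            0x02: self.h_push_reg,
            0x03: self.h_add,
            0x04: self.h_pop_reg,
        }

    def fetch(self):
        """Read the next byte of byte-code and advance the program counter."""
        value = self.code[self.pc]
        self.pc += 1
        return value

    def h_halt(self):
        self.running = False

    def h_push_imm(self):
        self.stack.append(self.fetch())

    def h_push_reg(self):
        self.stack.append(self.regs[self.fetch()])

    def h_add(self):
        b = self.stack.pop()
        a = self.stack.pop()
        self.stack.append((a + b) & 0xFFFFFFFF)

    def h_pop_reg(self):
        self.regs[self.fetch()] = self.stack.pop()

    def dispatch(self):
        """The dispatcher: read an opcode, hand it to its handler, repeat."""
        while self.running and self.pc < len(self.code):
            opcode = self.fetch()
            self.handlers[opcode]()  # an unknown opcode raises KeyError
        return self.regs


# Byte-code for "reg0 = 3; reg0 = reg0 + reg1; halt".
program = bytes([0x01, 3, 0x04, 0, 0x02, 0, 0x02, 1, 0x03, 0x04, 0, 0x00])
vm = VM(program, regs=[0, 4, 0, 0])
result = vm.dispatch()
```

With reg1 preset to 4, the run leaves 7 in reg0. A reverse engineer looking at the protected binary sees only the dispatcher loop and the handlers; the program logic lives in the byte-code.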
Entry and exit of virtual machines
A software protection virtual machine is not exactly the same as a high-level language virtual machine. The main difference is that a program for a high-level language virtual machine executes entirely inside the virtual machine environment, whereas a software protection virtual machine must switch between the native environment and the virtual machine environment. To keep the execution results consistent, the virtual machine environment must be able to correctly capture and restore the execution context of the native environment. A convenient method is the stack machine model, in which the virtual machine performs its data operations on a stack. Before entering the virtual machine, the native context is pushed onto the stack; the virtual machine then executes its instruction stream directly against those stack addresses; on leaving the virtual machine, the values are popped off the stack one by one. This ensures seamless context switching between the two execution environments.
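The entry and exit sequence can be sketched as follows. The register list, the push order, and the function names vm_enter, vm_exit and run_protected are assumptions made for the example; a real implementation would save the full native context (all general registers and flags) with native push and pop instructions.

```python
REG_ORDER = ["eax", "ebx", "ecx", "edx"]  # assumed push order


def vm_enter(native_regs):
    """Entering the VM: push the native context onto the stack."""
    stack = []
    for name in REG_ORDER:
        stack.append(native_regs[name])
    return stack


def vm_exit(stack):
    """Leaving the VM: pop the context back in reverse order."""
    regs = {}
    for name in reversed(REG_ORDER):
        regs[name] = stack.pop()
    return regs


def run_protected(native_regs, vm_body):
    """Switch into the VM, run the protected code, switch back out."""
    stack = vm_enter(native_regs)
    vm_body(stack)  # the VM works directly on the saved stack slots
    return vm_exit(stack)


def add_ebx_to_eax(stack):
    """Stand-in for interpreted byte-code: eax = eax + ebx."""
    eax_slot, ebx_slot = 0, 1  # slots follow REG_ORDER
    stack[eax_slot] = (stack[eax_slot] + stack[ebx_slot]) & 0xFFFFFFFF


before = {"eax": 3, "ebx": 4, "ecx": 9, "edx": 1}
after = run_protected(before, add_ebx_to_eax)
```

The native code that runs after the VM exits sees eax updated and every other register exactly as it was before entry, which is what makes the switch between the two environments seamless.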
Conclusion
Virtual machine-based software protection technology can greatly increase the difficulty of restoring code through reverse engineering. A well-designed software protection virtual machine significantly increases the time required for code restoration, which raises the cost of reverse engineering and so protects the software. However, using a virtual machine is not all benefit and no cost. Like high-level language virtual machines, software protection virtual machines also face the problem of lower execution efficiency. Security and efficiency always trade off against each other, and where to put the emphasis can only be weighed against the requirements of the actual product.
From: http://www.weixianmanbu.com/article/125.html
Address: http://www.linuxprobe.com/software-protection-virtual.html