Static code obfuscation under Windows x86

Source: Internet
Author: User
Tags windows x86

0x00 Preface

IDA Pro is without doubt the king of static disassembly. It greatly lowers the barrier to disassembly, and its excellent Hex-Rays decompiler (the "F5 plug-in") can turn assembly into C-like pseudocode, which greatly improves readability. Personally, though, I think the F5 plug-in should only be an aid: the best approach is to combine dynamic debugging with static analysis, understand the overall flow of the function, and only then use F5 to read the C-like code. This article looks at how to write junk code ("flower instructions") that interferes with IDA's static analysis and with the F5 plug-in.

0x01 Disassembly engine

The disassembly engine is the tool that translates binary machine code into assembly. There are two main disassembly algorithms: linear sweep and recursive descent.

The linear sweep algorithm treats the end of one instruction as the start of the next. Starting at the first byte, it scans the whole code section linearly and disassembles each instruction in turn until the section is exhausted. Its main advantage is that it covers every byte of the code section. Its weakness is that it takes no account of data that may be mixed into the code, so it is error-prone.

The recursive descent algorithm follows the program's control flow: whether an instruction is disassembled depends on whether another instruction references it. At a conditional jump, for example, the disassembler has to pick one of the two branches (true or false) to disassemble first. For ordinary code the order makes no difference to the output, but with hand-crafted junk code the two branches often produce different disassemblies of the same bytes. When such a conflict occurs, the disassembler trusts the branch it processed first, and most control-flow-oriented disassemblers process the false (fall-through) branch first.
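The difference between the two algorithms can be sketched with a toy decoder. This is only an illustration under stated assumptions: it knows a handful of real 32-bit x86 opcodes, treats everything else as an unknown byte, does not follow call targets, and the byte strings `junk_jz` and `junk_jmp` are hypothetical samples, not taken from any real binary.

```python
import struct

# Instruction lengths for the few real 32-bit x86 opcodes this toy knows.
LENGTHS = {0x33: 2, 0x74: 2, 0xEB: 2, 0xB8: 5, 0xE8: 5, 0xC3: 1}

def decode(code, off):
    """Return (size, text, jump_target, falls_through); text is None for unknown bytes."""
    op = code[off]
    size = LENGTHS.get(op, 0)
    if size == 0 or off + size > len(code) or (op == 0x33 and code[off + 1] != 0xC0):
        return 1, None, None, False
    if op == 0x33:                                   # 33 C0
        return size, "xor eax, eax", None, True
    if op in (0x74, 0xEB):                           # jz rel8 / jmp short rel8
        target = off + size + struct.unpack_from("b", code, off + 1)[0]
        name = "jz" if op == 0x74 else "jmp"
        return size, f"{name} {target:#x}", target, op == 0x74
    if op == 0xB8:                                   # mov eax, imm32
        imm = struct.unpack_from("<I", code, off + 1)[0]
        return size, f"mov eax, {imm:#x}", None, True
    if op == 0xE8:                                   # call rel32 (target not followed)
        target = off + size + struct.unpack_from("<i", code, off + 1)[0]
        return size, f"call {target:#x}", None, True
    return size, "ret", None, False                  # C3

def linear_sweep(code):
    """Decode from byte 0 to the end, one instruction after another."""
    out, off = [], 0
    while off < len(code):
        size, text, _, _ = decode(code, off)
        out.append((off, text or f"db {code[off]:#04x}"))
        off += size
    return out

def recursive_descent(code, entry=0):
    """Follow control flow; the fall-through path is decoded before jump targets."""
    claimed, listing, work = set(), {}, [entry]
    while work:
        off = work.pop()
        while 0 <= off < len(code) and off not in claimed:
            size, text, target, falls_through = decode(code, off)
            if text is None:                         # unknown byte: give up on this path
                break
            listing[off] = text
            claimed.update(range(off, off + size))   # first decoding of a byte wins
            if target is not None:
                work.append(target)                  # jump target is visited later
            if not falls_through:
                break
            off += size
    return sorted(listing.items())

# xor eax,eax / jz +1 / junk byte 0xE8 / mov eax,1 / ret
junk_jz = bytes.fromhex("33c0 7401 e8 b801000000 c3")
# Same bytes, but with an unconditional short jmp instead of jz.
junk_jmp = bytes.fromhex("33c0 eb01 e8 b801000000 c3")

for title, rows in [("linear sweep (jz)", linear_sweep(junk_jz)),
                    ("recursive descent (jz)", recursive_descent(junk_jz)),
                    ("recursive descent (jmp)", recursive_descent(junk_jmp))]:
    print(title)
    for off, text in rows:
        print(f"  {off:02x}: {text}")
```

In this toy, both algorithms decode the junk byte as a call when it follows a conditional jump, because the fall-through bytes are claimed before the jump target is visited. With an unconditional jmp there is no fall-through path, so the recursive descent version skips the junk byte and recovers the real instructions.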

For more detail on disassembly engines, see this article on the Kanxue forum (bbs.pediy.com):

"A comparison of various open-source disassembly engines": http://bbs.pediy.com/showthread.php?p=1401094#post1401094

0x02 Deceiving the Hex-Rays "F5" plug-in

A simple "push + ret" combination (which behaves like a jmp) does not fool IDA at all, and F5 easily restores it to C. The extra memset() call in the output is how F5 renders the code that the compiler emits automatically to set up the stack space.
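As a rough illustration of why the two forms are equivalent, the sketch below builds the machine code for each. The helper names `jmp_rel32` and `push_ret` and the addresses are hypothetical, chosen only for this example.

```python
import struct

def jmp_rel32(src, dst):
    """Encode `jmp dst` located at address src: E9 followed by a rel32
    displacement measured from the end of the 5-byte instruction."""
    return b"\xE9" + struct.pack("<i", dst - (src + 5))

def push_ret(dst):
    """Encode `push dst` (68 imm32) followed by `ret` (C3).
    ret pops the pushed value into EIP, so control ends up at dst."""
    return b"\x68" + struct.pack("<I", dst) + b"\xC3"

src, dst = 0x401000, 0x401020          # hypothetical addresses
print(jmp_rel32(src, dst).hex())       # e91b000000
print(push_ret(dst).hex())             # 6820104000c3
```

The jmp stores a relative displacement while push + ret stores the absolute destination, but both leave EIP at the same address, which is why a decompiler can fold the pair into a plain jump.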

We keep going and replace the "push + ret" combination with another form:

The result is a little disappointing: F5 still restores it directly.

So we modify it again: jump down, then jump back up. Will IDA think it is a loop?

This seems to work and fools IDA's F5. But a glance at the assembly makes the looping jump easy to spot.

Still, the examples above let us summarize some characteristics of the IDA F5 plug-in:

(1) jmp instructions are analyzed without any problem.

(2) A "push + ret" combination is treated as a jmp.

(3) Jumping down and then back up is treated as a loop.

(4) Analysis of values placed on the stack manually through registers is weak, although simple cases can still be analyzed directly.

So far all the jumps have stayed inside one function. We can also change the kind of jump and make long jumps between functions.

Stepping into Saveregs shows:

This seems to have fooled the F5 plug-in successfully. But reading the assembly reveals:

Double-clicking into _next still exposes our real code.

0x03 Targeting the disassembly engine

The jump instructions above only confuse the F5 plug-in. Can we make what actually runs completely different from the assembly that IDA shows? For that we need to insert machine code that confuses the disassembly engine itself. A common approach is to insert data bytes in the middle of the code so that IDA cannot reliably tell data from code. The most common choice is 0xE8, because it is the first byte of the call instruction.

When we press F5, a dialog box appears saying that decompilation failed.

The disassembly shows that 0xE8 is interpreted as the first byte of a call instruction, so the bytes that follow are decoded as a call. IDA marks the location in red, however, and an experienced analyst will see at a glance that junk code has been inserted.

The key is the xor eax, eax followed by jz, which together always jump. Why not simply write a jmp? Because of the control-flow strategy of the disassembly engine mentioned earlier: with a plain jmp the disassembler would follow the jump directly and recognize 0xE8 as data. The hand-written "conditional" jump is there to confuse it. There are many ways to insert machine code, and some bytes can serve both as the end of the previous instruction and as the start of the next one. Junk bytes such as the 0xE8 above should not lie on a path that the CPU actually executes, otherwise the program can easily crash. Let us look at a slightly more complex insertion:
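The byte-level effect can be sketched as follows. The byte string is a hypothetical example of the pattern described above (xor eax, eax / jz +1 / 0xE8 / real code), not the original listing.

```python
import struct

# Hypothetical bytes: xor eax,eax / jz +1 / junk 0xE8 / mov eax,1 / ret
code = bytes.fromhex("33c0 7401 e8 b801000000 c3")

JZ_OFF, JUNK_OFF = 2, 4

# What the CPU does: xor eax, eax sets ZF, so the jz is always taken.
rel8 = struct.unpack_from("b", code, JZ_OFF + 1)[0]
jz_target = JZ_OFF + 2 + rel8

# What a fall-through-first disassembler does: it decodes 0xE8 as call rel32,
# which claims the junk byte plus the four bytes that follow it.
call_bytes = range(JUNK_OFF, JUNK_OFF + 5)

print(f"jz lands at offset {jz_target}")
print(f"fake call covers offsets {call_bytes.start}..{call_bytes.stop - 1}")
print("real entry hidden inside fake call:", jz_target in call_bytes)
```

The instruction that really executes next starts inside the bytes the disassembler has already assigned to the fake call, so it never appears in the listing.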

IDA's analysis is somewhat disappointing here, and F5 is of no use after that.

We can use more machine code to fool IDA.

The disassembly:

Here IDA takes 0x66 0xB8 as the first two bytes of a mov instruction, which makes the subsequent decoding wrong. Many other junk instructions can also be added alongside the 0xE8.
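The original listing is not reproduced here, so the bytes below are an assumption: a commonly seen sequence of the same shape, in which a short jmp is hidden inside the immediate of a 0x66 0xB8 mov. The helper `short_jump_target` is hypothetical, and the trailing 0x90 (nop) stands in for the real code.

```python
import struct

# Assumed layout (offsets in brackets):
# [0] 66 B8 EB 05   mov ax, 0x05EB   (the immediate hides "EB 05" = jmp short +5)
# [4] 31 C0         xor eax, eax     (sets ZF)
# [6] 74 FA         jz -6            (jumps back into the mov immediate)
# [8] E8            junk byte, never executed
# [9] 90            stand-in for the real code
code = bytes.fromhex("66b8eb05 31c0 74fa e8 90")

def short_jump_target(code, off):
    """Target of a 2-byte short jump (opcode + signed rel8) located at off."""
    rel8 = struct.unpack_from("b", code, off + 1)[0]
    return off + 2 + rel8

jz_target = short_jump_target(code, 6)           # lands inside the mov
jmp_target = short_jump_target(code, jz_target)  # the hidden jmp skips the 0xE8

print(f"jz  -> offset {jz_target} (byte {code[jz_target]:#04x})")
print(f"jmp -> offset {jmp_target} (byte {code[jmp_target]:#04x})")
```

In this sketch the same four bytes are executed twice with different meanings: first as a mov, then, from their middle, as a jmp. A disassembler that shows each byte in only one instruction cannot display both.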

0x04 SEH

SEH is structured exception handling. It is not covered in detail here; see the previous article, "Windows x86 SEH Learning". Note that the compiler's implementation of SEH is not the same as the plain one.

0x05 Breaking stack frame analysis

IDA tries to analyze each function to determine its stack frame layout, in particular when it meets a ret/retn that ends the function, so it is easy to forge a stack frame that defeats static analysis. Add a ret 0xff to the previous code:

IDA's F5 output then treats the function as having 63 parameters.
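One plausible reading of the number 63, assuming the decompiler divides the byte count that ret removes from the stack into 4-byte argument slots:

```python
import struct

ret_imm = 0xFF                                    # operand of `ret 0xff`
encoding = b"\xC2" + struct.pack("<H", ret_imm)   # ret imm16 is encoded as C2 iw
arg_slots = ret_imm // 4                          # assumed: one 4-byte slot per argument

print(encoding.hex(), arg_slots)                  # c2ff00 63
```

255 bytes do not divide evenly into 4-byte slots, and the integer part of the division is 63, which matches the parameter count shown by F5.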

The value of ESP can also be changed inside the function. For example, after the "cmp esp, 0x1000" here, the "add esp, 0x102" is never executed. ESP can be changed at this point, and many other obfuscation tricks can be applied as well.
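A minimal model of why this upsets stack analysis, assuming the analyzer keeps a running ESP delta along every path it believes reachable. The instruction list, the deltas, and the helper `esp_at_ret` are hypothetical.

```python
# (text, esp_delta, only_on_never_taken_branch)
body = [
    ("push ebp",        -4,     False),
    ("cmp esp, 0x1000",  0,     False),
    ("add esp, 0x102",  +0x102, True),   # guarded by a branch that never fires at run time
    ("pop ebp",         +4,     False),
]

def esp_at_ret(instructions, take_bogus_branch):
    """Sum the ESP adjustments along one path through the function."""
    return sum(delta for _, delta, bogus in instructions
               if take_bogus_branch or not bogus)

runtime = esp_at_ret(body, take_bogus_branch=False)  # what the CPU sees
static = esp_at_ret(body, take_bogus_branch=True)    # a path the analyzer also walks

print(runtime, static)   # 0 258
```

At run time the stack is balanced, but a tool that also walks the dead branch sees ESP off by 0x102 bytes at the return, so it cannot settle on a single frame layout.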

Finally, F5 reports that the stack frame is broken. Does that look familiar? It is the same result as the earlier function that used pop eax: in both cases F5 reports a broken stack frame.

0x06 Summary

This article listed some of the most basic obfuscation techniques against IDA and the F5 plug-in. Basic as they are, a great many of them can be stacked to form a huge block of junk code whose purpose is to raise the cost of analysis. The most important thing when writing obfuscated code is to keep the stack balanced. To study code obfuscation and protection further, you can look at the open-source wprotect project on GitHub, which uses virtual machine technology to protect code. As for analyzing assembly, I still feel that one should not rely too heavily on IDA and the Hex-Rays plug-in: walk through the flow yourself first, and once you are familiar with the overall structure, use IDA and the F5 plug-in to improve efficiency.

Resources:

"The IDA Pro Authoritative Guide"

"Malicious Code Analysis in Action"

"Encryption and Decryption"
