Honest people
I haven't written anything for a long time. Since I have some free time, today I will summarize some of my experience with reverse analysis. (This article does not contain any new techniques. I just want to sum up knowledge from various sources along with my own experience, so feel free to skip it.)
I think the essence of software security is the struggle between analysis and anti-analysis. No matter how advanced a protection technique is, once analysts grasp the intent of the code, there is no security left. To analyze programs faster, you must not only understand the various software protection measures but also have a clear understanding of reverse analysis techniques. The following is a brief introduction to my experience in reverse engineering.
In my opinion, reverse analysis can be divided into three aspects: code structure, data structure, and operations. The following is only an outline (space is limited; details can be found on the Internet).
Note: The processing logic of Debug and Release builds is the same, so I do not distinguish between them below. The generated code, however, is quite different. I am still exploring how many constructs are handled in Release builds, so I will not write about them here.
I. Code structure.
The code structure determines the program's execution flow and data flow. Understanding the code structure first, outlining the framework of the whole program, and then working through each part can effectively speed up reversing. Code structure analysis includes the following points:
1. Comparison operations.
A. Comparison of signed numbers.
B. Comparison of unsigned numbers.
C. Condition codes (I cannot remember every combination of condition codes. You can put them in a table for lookup. They are usually not needed much, unless you have to restore the code precisely).
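To make the signed/unsigned distinction concrete, here is a minimal sketch in C. The function names are hypothetical, and the instructions named in the comments are what x86 compilers typically emit, not a guarantee for every compiler or optimization level.

```c
#include <stdint.h>

/* Signed comparison: typically compiled to cmp followed by a signed
   condition (jl/jge, setl, ...), which tests SF and OF. */
int less_signed(int32_t a, int32_t b)
{
    return a < b;
}

/* Unsigned comparison: typically compiled to cmp followed by an unsigned
   condition (jb/jae, setb, ...), which tests CF. */
int less_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}
```

The same bit pattern gives opposite answers: 0xFFFFFFFF is -1 when treated as signed and the largest 32-bit value when treated as unsigned, so the jump type after the cmp tells you which type the source code used.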
2. Conditional branches.
A. Single-branch condition (if).
B. Double-branch condition (if-else).
C. Switch statement.
D. Compound conditions.
E. Logical branches implemented with pure arithmetic operations (for how condition codes are evaluated after a calculation, see the third edition of Encryption and Decryption).
F. Conditional set instruction (SETcc).
G. Conditional move instruction (CMOVcc).
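Items E, F, and G all describe code where the source has a condition but the binary has no jump. A small sketch in C follows. The function names are hypothetical, and the instruction patterns in the comments are only what optimizing x86 compilers commonly produce.

```c
#include <stdint.h>

/* Absolute value without a jump. Optimizing compilers often emit the
   equivalent cdq / xor / sub sequence for abs().
   Assumption: >> on a negative int32_t is an arithmetic shift, which is
   implementation-defined in C but holds on mainstream x86 compilers.
   INT32_MIN is excluded because its absolute value does not fit. */
int32_t abs_branchless(int32_t x)
{
    int32_t mask = x >> 31;        /* 0 for x >= 0, -1 for x < 0 */
    return (x ^ mask) - mask;
}

/* A simple selection like this is commonly compiled to cmp + CMOVcc. */
int32_t max_select(int32_t a, int32_t b)
{
    return a > b ? a : b;
}

/* Turning a condition into 0/1 is commonly compiled to test + SETcc. */
int is_zero(int32_t x)
{
    return x == 0;
}
```

When restoring such code, the missing jump does not mean the source had no if statement. It usually only means the optimizer found a cheaper form.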
3. Loops.
A. Execute first, then test (do-while).
B. Test first, then execute (for/while).
C. Loop control with break/continue.
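The two loop shapes are closely related in compiled code. As a sketch (hypothetical function names), a while loop is often emitted as one guard test followed by a do-while with the test at the bottom, so both functions below can end up with the same disassembly shape.

```c
/* Source form: test first, then execute. */
int sum_while(int n)
{
    int s = 0;
    int i = 0;
    while (i < n) {
        s += i;
        i++;
    }
    return s;
}

/* The shape often seen after optimization: a single guard test,
   then a do-while whose condition sits at the bottom of the body. */
int sum_rotated(int n)
{
    int s = 0;
    int i = 0;
    if (i < n) {
        do {
            s += i;
            i++;
        } while (i < n);
    }
    return s;
}
```

A backward conditional jump at the end of a block is the usual sign of a loop. Whether a guard test sits in front of it is what separates for/while from a plain do-while.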
4. Functions.
A. Quickly identifying functions and function calls.
B. Calling conventions.
C. Parameter passing methods.
D. Stack balancing.
E. Exported and imported functions.
5. Code optimization.
A. Optimization for speed.
B. Optimization for size.
II. Data structure.
After the code structure is identified and the overall framework is outlined, the next step is to identify the data structures in the program. If the program is a big tree, we now have the branches, and what remains is to add the leaves. Data structure analysis is divided into the following points.
1. Stack (stack usage differs considerably between Debug and Release builds).
2. Global variables.
3. Local variables.
A. Recognizing local variables on the stack.
B. Register variables.
4. Imported and exported variables.
5. Constants.
6. Arrays.
A. Simple array.
B. Multi-dimensional array.
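For multi-dimensional arrays, what shows up in the disassembly is flat address arithmetic. The sketch below uses a hypothetical 3x4 table named sample and shows that a[i][j] is the same as reading an int at base + (i * COLS + j) * sizeof(int).

```c
#include <stddef.h>
#include <string.h>

enum { ROWS = 3, COLS = 4 };

/* Hypothetical sample data, only for illustration. */
static const int sample[ROWS][COLS] = {
    {  0,  1,  2,  3 },
    { 10, 11, 12, 13 },
    { 20, 21, 22, 23 },
};

/* Source-level view: two subscripts. */
int get_2d(const int (*a)[COLS], size_t i, size_t j)
{
    return a[i][j];
}

/* Disassembly-level view: one base address plus a computed byte offset. */
int get_flat(const void *base, size_t i, size_t j)
{
    const unsigned char *p =
        (const unsigned char *)base + (i * COLS + j) * sizeof(int);
    int v;
    memcpy(&v, p, sizeof v);   /* read the int stored at that address */
    return v;
}
```

Both get_2d(sample, 2, 1) and get_flat(sample, 2, 1) read the same element. The constant multiplier applied to the first index (the row size in bytes) is what reveals the column count when you restore the declaration.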
7. Structs and unions.
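In compiled code a struct member is just a fixed offset from a base pointer, and all members of a union share offset 0. A minimal sketch with hypothetical types follows. The exact offsets depend on the compiler's alignment rules, so the code asks for them with offsetof instead of assuming values.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout. Padding after tag is decided by the compiler. */
struct record {
    uint8_t  tag;
    uint32_t id;
    uint16_t len;
};

/* All members of a union start at the same address. */
union value {
    uint32_t u;
    float    f;
    uint8_t  bytes[4];
};

/* What member access looks like after compilation: base + constant offset. */
uint32_t read_id_by_offset(const void *p)
{
    uint32_t id;
    memcpy(&id, (const unsigned char *)p + offsetof(struct record, id),
           sizeof id);
    return id;
}
```

When several accesses through the same register use different constant offsets, that is a strong hint that the register holds a struct pointer. Accesses of different types at the same offset suggest a union.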
8. Linked lists.
A. Singly linked list.
B. Doubly linked list.
C. Circular linked list.
D. Binary tree.
E. Higher-level linked structures such as graphs. I have never reversed these, so they are listed here only as a concept.
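The linked structures above share one recognizable pattern: a loop that keeps reloading a pointer from a fixed offset inside the object it currently points to, until the pointer is NULL. A minimal singly linked list sketch (hypothetical names):

```c
#include <stddef.h>

struct node {
    int          value;
    struct node *next;   /* loaded again on every iteration */
};

/* Walk the list and count nodes. In disassembly this is typically a loop
   of the form "load pointer from [reg + offset]; test; jump if not zero". */
size_t list_length(const struct node *head)
{
    size_t n = 0;
    for (const struct node *p = head; p != NULL; p = p->next) {
        n++;
    }
    return n;
}
```

A doubly linked list adds a second pointer member, and a binary tree has two child pointers that are usually followed recursively. The number of self-referencing pointers per node is the main clue to which structure you are looking at.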
9. Classes.
I am not very familiar with restoring classes in Release builds. I am still exploring how to analyze class members that are unused in Release builds.
A. Member variables.
B. Ordinary member functions (easily confused with the compiler's default constructor).
C. Virtual function tables, inheritance, and derivation.
D. Identifying the relationships between classes through their constructors.
E. Destructors.
F. Class scope.
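To show what a virtual function table looks like at the machine level, here is a hand-written emulation in C. It is not compiler output, and all names are hypothetical. It assumes the common layout where the object's first member is a pointer to a table of function pointers, which is what mainstream C++ compilers typically generate for simple single inheritance.

```c
#include <stddef.h>

struct shape;

/* The "virtual function table": one slot per virtual function. */
struct shape_vtbl {
    int (*area)(const struct shape *self);
};

/* The object: table pointer first, then the member variables. */
struct shape {
    const struct shape_vtbl *vptr;
    int w;
    int h;
};

static int rect_area(const struct shape *s) { return s->w * s->h; }
static int tri_area(const struct shape *s)  { return s->w * s->h / 2; }

static const struct shape_vtbl rect_vtbl = { rect_area };
static const struct shape_vtbl tri_vtbl  = { tri_area };

/* "Constructors": the store of a table address into the object is the
   pattern that identifies a constructor and the class it belongs to. */
void rect_init(struct shape *s, int w, int h)
{
    s->vptr = &rect_vtbl;
    s->w = w;
    s->h = h;
}

void tri_init(struct shape *s, int w, int h)
{
    s->vptr = &tri_vtbl;
    s->w = w;
    s->h = h;
}

/* A virtual call: load the table pointer, load the slot, call indirectly. */
int shape_area(const struct shape *s)
{
    return s->vptr->area(s);
}
```

In a real C++ binary, a derived-class constructor usually calls the base constructor first and then overwrites the table pointer with its own table. Following that sequence of writes is one way to recover the inheritance chain.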
III. Operations.
After the data structures are identified, the tree has its branches and leaves, but it is not yet alive. Operations tie the data structures and the code structure together like blood. Once the operations are understood, the tree becomes a living one that can carry out its life activities. I divide the analysis of operations into the following points.
1. Logical operations.
2. Data type conversion.
A. Zero extension.
B. Sign extension.
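The two kinds of extension correspond directly to the signedness of the source type. A minimal sketch with hypothetical function names follows. The instruction names in the comments are what x86 compilers typically use for these conversions.

```c
#include <stdint.h>

/* Unsigned 8-bit to 32-bit: the upper bits are filled with zeros
   (typically movzx). */
uint32_t zero_extend8(uint8_t b)
{
    return b;
}

/* Signed 8-bit to 32-bit: the upper bits are filled with copies of the
   sign bit (typically movsx). */
int32_t sign_extend8(int8_t b)
{
    return b;
}
```

So the byte 0xF0 becomes 0x000000F0 when zero-extended and 0xFFFFFFF0 when sign-extended. Seeing movzx or movsx on a load is often the quickest way to decide whether a variable was declared unsigned or signed.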
3. Floating-point operations.
A. Data format.
B. FPU registers.
C. Floating-point arithmetic.
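For the data format, it helps to be able to take a float constant apart by hand. The sketch below assumes that float is IEEE 754 single precision (1 sign bit, 8 exponent bits, 23 fraction bits), which is the case on x86. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bytes of a float as a 32-bit integer. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Bit 31: sign. */
unsigned float_sign(float f)
{
    return float_bits(f) >> 31;
}

/* Bits 30..23: exponent, stored with a bias of 127. */
unsigned float_exponent(float f)
{
    return (float_bits(f) >> 23) & 0xFFu;
}

/* Bits 22..0: fraction (the leading 1 is implicit for normal numbers). */
uint32_t float_fraction(float f)
{
    return float_bits(f) & 0x7FFFFFu;
}
```

With this you can recognize constants in a data section at a glance: 0x3F800000 is 1.0f and 0xC0000000 is -2.0f.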
4. Integer operations.
A. Addition and subtraction.
B. Multiplication and division.
C. Modulo operation.
D. 16-bit and 32-bit operations.
E. Big-number arithmetic.
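Division and modulo are the items here that look least like their source once optimized, because compilers usually replace division by a constant with a multiplication and a shift. A sketch for unsigned 32-bit division by 10 follows. The function names are hypothetical. The constant 0xCCCCCCCD is ceil(2^35 / 10), the value commonly seen for this divisor.

```c
#include <stdint.h>

/* x / 10 without a div instruction: multiply by ceil(2^35 / 10) in
   64-bit precision, then shift right by 35. */
uint32_t div10_magic(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* x % 10 is then rebuilt as x - (x / 10) * 10. */
uint32_t mod10_magic(uint32_t x)
{
    return x - div10_magic(x) * 10u;
}
```

When you meet a mul by a large odd-looking constant followed by a right shift, compute 2^shift divided by the constant and round: the result is the original divisor.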
5. Flags.
A. Carry and overflow flags (CF/OF).
B. Zero flag (ZF).
C. Sign flag (SF).
D. Parity flag (PF).
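As a way to check your understanding of these flags, the sketch below computes them in C for an 8-bit addition, following the x86 definitions. The struct and function names are hypothetical.

```c
#include <stdint.h>

struct flags {
    int cf;   /* carry: unsigned result did not fit in 8 bits */
    int of;   /* overflow: signed result did not fit in 8 bits */
    int zf;   /* zero: result is 0 */
    int sf;   /* sign: top bit of the result */
    int pf;   /* parity: low byte has an even number of 1 bits */
};

struct flags add8_flags(uint8_t a, uint8_t b)
{
    unsigned wide = (unsigned)a + (unsigned)b;
    uint8_t r = (uint8_t)wide;
    uint8_t p = r;
    struct flags f;

    f.cf = wide > 0xFFu;
    /* Signed overflow: both operands have a sign different from the result. */
    f.of = ((a ^ r) & (b ^ r) & 0x80) != 0;
    f.zf = (r == 0);
    f.sf = (r & 0x80) != 0;

    /* Fold the byte so that bit 0 holds the XOR of all eight bits. */
    p ^= (uint8_t)(p >> 4);
    p ^= (uint8_t)(p >> 2);
    p ^= (uint8_t)(p >> 1);
    f.pf = !(p & 1);

    return f;
}
```

For example, 0x7F + 0x01 sets OF and SF but not CF, while 0xFF + 0x01 sets CF and ZF but not OF. This is exactly why signed and unsigned conditional jumps test different flags.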
6. Code optimization and recognizing optimized operations.
Of course, real reverse analysis never rigidly follows the order above. These points are applied flexibly. The above is purely the foundation. Once all of it is mastered, reversing speed improves significantly. This knowledge can be organized in your head like a book's table of contents, so that you can quickly locate the part you need (everyone thinks differently, and I don't know what approach lets others work faster).
The framework above also reflects my own learning process, which resembles learning program design. I studied, bit by bit, how each construct in C and C++ programs looks when reversed. After going through all of them, a simple framework formed naturally. I think reverse analysis is a foundation. Once you are familiar with it, you can learn other areas faster, such as vulnerability analysis and packer (shell) analysis. As the saying goes, sharpening the axe does not delay the cutting of firewood. This article is meant to start a discussion, and I hope some experts can share their experience with restoring classes in Release builds.