Python reverse engineering tutorial

Source: Internet
Author: User

1. Development Environment

We begin our Python reverse engineering journey on Windows 10 by setting up a development environment. The Python interpreter is version 3.6.1, the latest at the time of writing, and the IDE is PyCharm Community Edition 2017.1.3. Download both from the links below and double-click each installation package to install it. Then set PyCharm's Project Interpreter to the Python interpreter you just installed.

Python: https://www.python.org/downloads/
PyCharm: http://www.jetbrains.com/pycharm/download/#section=windows

2. ctypes

ctypes is a foreign function library for Python. It provides data types compatible with the C language, lets you call functions in dynamic-link libraries or shared libraries, and can be used to wrap those libraries. Each ctypes data type corresponds to a C type and a Python type. For example, c_int maps to a C int and a Python int, and c_char_p maps to a C char * string and Python bytes.
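As a small illustration of the type mapping described above, the sketch below creates a few ctypes values and reads them back as Python objects. The variable names are made up for this example; the classes themselves (c_int, c_double, c_char_p, c_ubyte) are part of the standard ctypes module.

```python
import ctypes

# Each ctypes type wraps a C value; .value converts it back to a Python object.
number = ctypes.c_int(42)         # C int     <-> Python int
ratio = ctypes.c_double(1.5)      # C double  <-> Python float
text = ctypes.c_char_p(b"Hello")  # C char *  <-> Python bytes

# ctypes does no overflow checking: an unsigned byte wraps around modulo 256.
wrapped = ctypes.c_ubyte(256)

print(number.value, ratio.value, text.value, wrapped.value)
print(ctypes.sizeof(ctypes.c_ubyte))  # size of the C type in bytes
```

Note that the integer types silently wrap on overflow, just as the corresponding C types would.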

The data types in ctypes are all implemented as classes. Loading a C library from Python involves the following classes.

  1. class ctypes.CDLL: loads a shared library whose functions use the standard C calling convention, cdecl. Functions are assumed to return int.
  2. class ctypes.OleDLL: Windows only. Loads a shared library whose functions use the stdcall calling convention and are assumed to return HRESULT.
  3. class ctypes.WinDLL: Windows only. Loads a shared library whose functions use the stdcall calling convention and are assumed to return int.
  4. class ctypes.PyDLL: similar to CDLL, but unlike the previous three, the GIL (Global Interpreter Lock) is not released during function calls.
  5. class ctypes.LibraryLoader(dlltype): dlltype is one of CDLL, OleDLL, WinDLL, or PyDLL. The class has a LoadLibrary method that loads a shared library.

A simpler way to load the C library is to use the following pre-created class instances.

ctypes.cdll
ctypes.oledll
ctypes.windll
ctypes.pydll
ctypes.pythonapi
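The sketch below shows one way the loader classes and the pre-created instances might be used together. The helper name load_c_runtime is hypothetical, and the non-Windows branch relies on an assumption: that CDLL(None) exposes the symbols already loaded into the process, including the C library.

```python
import ctypes
import os

def load_c_runtime():
    """Return a handle to the C runtime, called with the cdecl convention."""
    if os.name == "nt":
        # Pre-created loader instance: attribute access loads msvcrt.dll.
        return ctypes.cdll.msvcrt
    # Assumption: on POSIX systems, CDLL(None) exposes the symbols already
    # loaded into the process, which include the C library.
    return ctypes.CDLL(None)

libc = load_c_runtime()

# Declare the signature so ctypes converts arguments and results correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"Hello World")
print(length)
```

Setting argtypes and restype is optional for simple int-returning functions, but it avoids surprises when a function returns something other than int.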

The calling conventions cdecl and stdcall were mentioned above. With cdecl, function arguments are pushed onto the stack from right to left, and the caller is responsible for cleaning up the stack after the function returns. On the x86 architecture the return value is stored in the EAX register. At the assembly level, the caller pushes the arguments from right to left, calls the function, and finally moves the stack pointer ESP back to its original position. With stdcall, arguments are also pushed from right to left and the return value is also stored in EAX, but the stack is cleaned up by the called function itself rather than by the caller. In other words, the caller does not adjust ESP after the call as it does with cdecl.
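In ctypes the calling convention is chosen by the loader class or by the function prototype factory. The sketch below uses ctypes.CFUNCTYPE, which builds cdecl prototypes, to bind the C function abs. As an assumption, it loads the C runtime with CDLL(None) on non-Windows systems. The stdcall counterpart, ctypes.WINFUNCTYPE, exists only on Windows and is therefore not used here.

```python
import ctypes
import os

# Load the C runtime (assumption: CDLL(None) exposes libc on POSIX systems).
libc = ctypes.cdll.msvcrt if os.name == "nt" else ctypes.CDLL(None)

# CFUNCTYPE builds a prototype that uses the cdecl convention:
# the return type comes first, followed by the argument types.
abs_prototype = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)

# Bind the prototype to the exported function "abs" in the loaded library.
c_abs = abs_prototype(("abs", libc))

result = c_abs(-7)
print(result)
```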

The following example calls C's printf function from Python. On Windows, printf lives in "C:\Windows\System32\msvcrt.dll"; the Linux counterpart is "libc.so".

    from ctypes import *
    msvcrt = cdll.msvcrt
    message = b"Hello World\n"
    msvcrt.printf(b"Message is %s", message)

The code above prints "Message is Hello World". ctypes also lets you define structures, unions, and other advanced constructs in Python. For details, see https://docs.python.org/3.6/library/ctypes.html.
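A minimal sketch of the structure and union support mentioned above. The names Point and Number and their fields are invented for this illustration; only Structure, Union, and the _fields_ attribute come from ctypes.

```python
import ctypes

class Point(ctypes.Structure):
    # Equivalent to: struct Point { int x; int y; };
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]

class Number(ctypes.Union):
    # Equivalent to: union Number { int as_int; unsigned char as_bytes[4]; };
    _fields_ = [("as_int", ctypes.c_int), ("as_bytes", ctypes.c_ubyte * 4)]

point = Point(3, 4)

number = Number()
number.as_int = 0
number.as_bytes[0] = 0xCC  # writing one member changes the shared storage

print(point.x, point.y)
print(ctypes.sizeof(Point), ctypes.sizeof(Number))
```

Because all members of a union share the same memory, as_int is no longer zero after the byte write. Its exact value depends on the byte order of the machine.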

3. Debugging Principles

A debugger lets you trace and analyze a program dynamically, which is especially useful for exploit development, fuzzing, and virus analysis. If the source code is available, debugging is easier; this is transparent, white-box testing. Without the source code you are doing black-box testing, and getting the results you want requires solid reverse engineering skills and tools. Black-box testing covers two scenarios, user mode and kernel mode, which run with different privileges.

CPU registers give fast access to small amounts of data. The x86 instruction set has eight general-purpose registers: EAX, EDX, ECX, ESI, EDI, EBP, ESP, and EBX. These registers and the instruction pointer EIP are described one by one below.

EAX: The accumulator register. It is used both to store function return values and to perform calculations. Many optimized x86 instructions are designed specifically to read, write, and compute with EAX.
EDX: The data register. It is essentially an extension of EAX and helps EAX carry out more complex calculations.
ECX: The count register, used for loop operations. It counts downward rather than upward, from a large value to a small one.
ESI: The source index. It points to the source operand and holds the location of the input data stream, for efficient reads when processing data in a loop.
EDI: The destination index. It points to the destination operand and holds the location where results are stored, for efficient writes when processing data in a loop.
ESP: The stack pointer. It is used for function calls and stack operations. When a function is called, its arguments and then the return address are pushed onto the stack, and ESP points to the top of the stack, that is, to the return address.
EBP: The base pointer. It is also used for function calls and stack operations. While a function runs, EBP points to the bottom of the stack for that call.
EBX: The only register with no special purpose; it serves as extra data storage.
EIP: The instruction pointer. It always points to the instruction that is about to be executed.

Anyone familiar with debuggers knows that a breakpoint is really just one kind of debugging event; other events include the classic segmentation fault. There are three kinds of breakpoint, namely software breakpoints, hardware breakpoints, and memory breakpoints, and all of them are used to pause a running program.

Software breakpoint: A single-byte instruction that transfers control to the debugger's breakpoint handler. Assembly instructions are a higher-level representation of the opcodes the CPU executes. For example, the assembly instruction MOV EAX, EBX tells the CPU to put the contents of the EBX register into the EAX register. The CPU does not understand the assembly text, however; it must be converted to the opcode 8BC3, which the CPU can recognize. Assume this instruction is located at address 0x44332211. To set a breakpoint at this address and pause the CPU, the first byte of the two-byte opcode 8BC3 has to be replaced with a single-byte opcode. That single-byte opcode is interrupt 3, INT3, an instruction that halts the CPU; its opcode is 0xCC, as shown in the following code snippet.

When the debugger is told to set a breakpoint at a target address, it first reads the first byte of the opcode at that address and saves it, stores the address in its internal breakpoint list, and then writes the INT3 opcode 0xCC to the address. When the CPU executes the replaced opcode, it halts and raises an INT3 event, which the debugger catches. The debugger then uses EIP to determine whether the interrupt address is a breakpoint that it set. If so, it writes the saved opcode byte back so that the program can resume normally. Software breakpoints are either one-shot or persistent. A one-shot breakpoint takes effect once and is then removed from the breakpoint list; a persistent breakpoint stays in effect.

Note that changing the memory of the program being debugged also changes the CRC (cyclic redundancy check) checksum of the running software. A CRC is a mechanism for verifying whether data has changed. It is widely used for files, memory, text, network packets, and anywhere else that data needs to be monitored. It computes a hash over a range of data and compares the result with an earlier hash value to determine whether the data has changed. To debug a program under these circumstances, use the hardware breakpoints described below.

Address       Opcode   Assembly instruction
0x44332211:   8BC3     MOV EAX, EBX
0x44332211:   CCC3     MOV EAX, EBX
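The save, patch, and restore cycle described above can be simulated in pure Python. The sketch below is only an illustration: a bytearray stands in for the memory of the debugged process, and the class and method names are hypothetical. A real debugger would read and write the target process through operating system APIs instead.

```python
INT3 = 0xCC  # single-byte opcode of the INT3 instruction

class SoftBreakpoints:
    """Simulates software breakpoints on a bytearray that stands in for
    the memory of the debugged process."""

    def __init__(self, memory, base_address):
        self.memory = memory
        self.base = base_address
        self.saved = {}  # breakpoint address -> original first opcode byte

    def set_breakpoint(self, address):
        offset = address - self.base
        self.saved[address] = self.memory[offset]  # remember the original byte
        self.memory[offset] = INT3                 # overwrite it with INT3

    def restore_original(self, address):
        offset = address - self.base
        self.memory[offset] = self.saved.pop(address)  # write the byte back

# MOV EAX, EBX is encoded as 8B C3; pretend it lives at 0x44332211.
memory = bytearray.fromhex("8BC3")
breakpoints = SoftBreakpoints(memory, 0x44332211)

breakpoints.set_breakpoint(0x44332211)
patched = memory.hex().upper()

breakpoints.restore_original(0x44332211)
restored = memory.hex().upper()

print(patched, restored)
```

Only the first byte is replaced, which is why the second byte C3 is still visible after the breakpoint has been set.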

Hardware breakpoint: Used for small regions. Hardware breakpoints are set at the CPU level, using eight special debug registers, DR0 through DR7, that exist specifically to manage them. DR0 through DR3 hold the breakpoint addresses, which means at most four hardware breakpoints can exist at the same time. DR4 and DR5 are reserved. DR6 is the status register, which indicates the type of debugging event that the breakpoint triggered. DR7 is the control register: it switches the breakpoints on and off and also stores the type of each one, which is either break on instruction execution, break on data write, or break on data read or write but not execution. Hardware breakpoints are delivered through interrupt 1 (INT1), which handles hardware breakpoints and single-step events. Their limitations are that only four can be set at a time and that each one covers at most four bytes. To track a larger block of memory, use the memory breakpoints described below.
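To make the role of DR7 more concrete, the sketch below computes the bits that would enable one hardware breakpoint. The bit positions are not given in this article; they follow the standard x86 debug register layout (a local-enable bit at position 2 × slot, a two-bit condition field at 16 + 4 × slot, and a two-bit length field at 18 + 4 × slot). The function and constant names are hypothetical, and the code only calculates a value; it does not touch any real register.

```python
# Breakpoint conditions as encoded in DR7 (two bits per slot).
HW_EXECUTE = 0x0  # break on instruction execution
HW_WRITE = 0x1    # break on data write
HW_ACCESS = 0x3   # break on data read or write, but not execution

def dr7_bits(slot, condition, length):
    """Return the DR7 bits that enable a breakpoint held in DR0..DR3."""
    if slot not in range(4):
        raise ValueError("only DR0 through DR3 hold breakpoint addresses")
    if length not in (1, 2, 4):
        raise ValueError("a hardware breakpoint covers 1, 2, or 4 bytes")
    value = 1 << (slot * 2)                   # local-enable bit for the slot
    value |= condition << (16 + slot * 4)     # condition field
    value |= (length - 1) << (18 + slot * 4)  # length field stores length - 1
    return value

print(hex(dr7_bits(0, HW_WRITE, 4)))    # 4-byte write breakpoint in DR0
print(hex(dr7_bits(1, HW_EXECUTE, 1)))  # execution breakpoint in DR1
```

A debugger would OR this value into the DR7 field of the thread context and store the breakpoint address in the matching DR0 to DR3 register.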

Memory breakpoint: Used for large regions. It is not really a breakpoint; instead, it changes the permissions of a region or page of memory. A memory page is the smallest unit of memory that the operating system handles. Once a page has been allocated, it has a set of permissions, such as executable, readable, and writable, that determine how the memory can be accessed. Any access to the protected page raises an exception, after which the page reverts to its original permissions.
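Because permissions apply to whole pages, a debugger first has to work out which pages cover the region it wants to watch. The helper below is a minimal sketch of that calculation. The function name is hypothetical, and the page size defaults to mmap.PAGESIZE for the current machine. The permission change itself is specific to the operating system and is left out.

```python
import mmap

def pages_spanned(address, size, page_size=mmap.PAGESIZE):
    """Return the base address of every page touched by the byte range
    [address, address + size)."""
    if size <= 0:
        raise ValueError("size must be positive")
    first = address - (address % page_size)               # round down
    last = (address + size - 1) // page_size * page_size  # page of last byte
    return list(range(first, last + page_size, page_size))

# A 2-byte region that straddles a page boundary needs two pages protected.
print([hex(base) for base in pages_spanned(0x1FFF, 2, page_size=4096)])
```

Even a very small watched region can therefore make accesses to unrelated data on the same page raise exceptions, which the debugger has to filter out.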

That is all for this article. I hope it helps with your learning.
