Use the Win32 Debug API to Create Your Own Debugger

Source: Internet
Author: User

Many of us dream of having a debugger of our own. Today we will build one. At its core, a debugger has to do two things:
Start the target program.
Monitor the target program as it runs and respond to what it does.
To build our own debugger we only need to implement these two functions. We do not have to reinvent the wheel to do so; first, let us look at the infrastructure the operating system already provides.
Because we work on the Windows platform, we naturally turn to Microsoft's documentation, MSDN. Open MSDN and locate "Debugging and Error Handling", where the basic Windows debugging information is documented. Compared with other topics, this section is noticeably thin: the lower-level and more powerful a technology is, the less Microsoft seems willing to document it.
After a quick look through it, we can identify the debug APIs that matter most for our debugger:
CreateProcess -- creates the process to be debugged
WaitForDebugEvent -- the main function of the debug loop
ContinueDebugEvent -- resumes the target after an event; the other half of the debug loop
GetThreadContext -- reads the registers of a thread in the debugged process
SetThreadContext -- sets the registers of a thread in the debugged process
ReadProcessMemory -- reads the memory of the debugged process
WriteProcessMemory -- writes the memory of the debugged process
The most important data structures are as follows:
CONTEXT -- register structure
STARTUPINFO -- startup information
PROCESS_INFORMATION -- process-related information
DEBUG_EVENT -- debug event structure
Our debugger does its work by combining these API functions with these data structures. Next, let's look at what each API and data structure means.

 

Debug API Analysis
Here we look at each of the debug APIs listed above and what it is used for. How they are applied in practice is covered in detail in the sections that follow.
1. CreateProcess.
Function prototype: BOOL CreateProcess(
    LPCTSTR lpApplicationName, // name of the module to execute
    LPTSTR lpCommandLine, // command line string
    LPSECURITY_ATTRIBUTES lpProcessAttributes, // process security attributes
    LPSECURITY_ATTRIBUTES lpThreadAttributes, // thread security attributes
    BOOL bInheritHandles, // handle inheritance option
    DWORD dwCreationFlags, // process creation flags
    LPVOID lpEnvironment, // pointer to the environment block
    LPCTSTR lpCurrentDirectory, // current directory name
    LPSTARTUPINFO lpStartupInfo, // startup information
    LPPROCESS_INFORMATION lpProcessInformation // process information
);
Function analysis: this is the basic process creation function on the Windows platform. Whenever we double-click an EXE file, the Windows shell ends up calling this function to create the corresponding process. Three parameters matter most. The first is the module name, which specifies the program to start. The second is the creation flags, which specify how the target process is created; for a debugger the usual value is DEBUG_PROCESS | DEBUG_ONLY_THIS_PROCESS. The last is the process information: after CreateProcess returns, Windows has filled the PROCESS_INFORMATION block with the handles and IDs of the new process and its main thread. In the debug loop we use this data to interact with the target process, monitoring and controlling what it does.
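To make the meaning of the two creation flags concrete, here is a small portable sketch. It does not call CreateProcess; it only models how the flag bits combine. The numeric values are those of the Windows SDK, while the k-prefixed names and the helper functions are our own and exist only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Same numeric values as DEBUG_PROCESS and DEBUG_ONLY_THIS_PROCESS in the
// Windows SDK; renamed so the sketch builds without <windows.h>.
constexpr std::uint32_t kDebugProcess         = 0x00000001;
constexpr std::uint32_t kDebugOnlyThisProcess = 0x00000002;

// True if the new process is started under the calling debugger.
constexpr bool starts_debug_session(std::uint32_t flags) {
    return (flags & (kDebugProcess | kDebugOnlyThisProcess)) != 0;
}

// True if processes spawned by the target are debugged as well.
// DEBUG_ONLY_THIS_PROCESS turns that off even when DEBUG_PROCESS is set.
constexpr bool also_debugs_children(std::uint32_t flags) {
    return (flags & kDebugProcess) != 0 && (flags & kDebugOnlyThisProcess) == 0;
}

// Hypothetical helper: the flag combination recommended in this article.
inline std::uint32_t article_creation_flags() {
    const std::uint32_t flags = kDebugProcess | kDebugOnlyThisProcess;
    assert(starts_debug_session(flags) && !also_debugs_children(flags));
    return flags;
}
```

With the combination used in this article, our debugger receives events from the target only and never from programs the target launches, which keeps the debug loop simple.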
2. WaitForDebugEvent.
Function prototype: BOOL WaitForDebugEvent(
    LPDEBUG_EVENT lpDebugEvent, // pointer to the structure that receives the debug event
    DWORD dwMilliseconds // timeout in milliseconds
);
Function analysis: this function forms the body of the debug loop. After a debugger creates the target process, it typically calls this function over and over to wait for debug events from the target; this repeated call to WaitForDebugEvent is what we call the debug loop. The debug loop is the heart of every debugger, and almost all of a debugger's monitoring and control is done inside it. The timeout is usually set to INFINITE (0xFFFFFFFF, that is, -1), meaning wait forever. The call blocks until a debug event arrives, and while it is waiting it consumes almost no system resources.
3. ContinueDebugEvent.
Function prototype: BOOL ContinueDebugEvent(
    DWORD dwProcessId, // target process ID
    DWORD dwThreadId, // target thread ID
    DWORD dwContinueStatus // how the thread should continue
);
Function analysis: the debugger calls this function after it has processed a debug event in the debug loop, to tell the target to continue running. The process ID and thread ID identify the thread to resume; they must belong to the thread that reported the event, so the safest source is the dwProcessId and dwThreadId members of the DEBUG_EVENT just received. For a single-threaded target these are the same values that CreateProcess stored in PROCESS_INFORMATION. The way the thread continues is selected by dwContinueStatus, which has two main values. DBG_CONTINUE means the debugger has handled the event and the target can carry on as usual. DBG_EXCEPTION_NOT_HANDLED means the debugger did not handle the event; for an exception event, Windows then passes the exception down the target's exception handler chain until something handles it. If nothing does, Windows falls back on its last resort: it reports that the application caused an exception and terminates it.
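The choice of continue status can be written down as a small decision function. The sketch below is portable and uses stand-in constants with the same numeric values as the Windows SDK. The policy itself (breakpoints and single steps belong to the debugger, everything else is left to the target) is only one reasonable assumption, not a rule imposed by Windows.

```cpp
#include <cassert>
#include <cstdint>

// Numeric values match the Windows SDK; the k-prefixed names are local to this sketch.
constexpr std::uint32_t kDbgContinue            = 0x00010002; // DBG_CONTINUE
constexpr std::uint32_t kDbgExceptionNotHandled = 0x80010001; // DBG_EXCEPTION_NOT_HANDLED
constexpr std::uint32_t kExceptionDebugEvent    = 1;          // EXCEPTION_DEBUG_EVENT
constexpr std::uint32_t kExceptionBreakpoint    = 0x80000003; // EXCEPTION_BREAKPOINT
constexpr std::uint32_t kExceptionSingleStep    = 0x80000004; // EXCEPTION_SINGLE_STEP

// Decide what to pass as dwContinueStatus for one debug event.
// exception_code is only meaningful when event_code is kExceptionDebugEvent.
std::uint32_t choose_continue_status(std::uint32_t event_code,
                                     std::uint32_t exception_code) {
    assert(event_code >= 1 && event_code <= 9);  // range of documented event codes
    if (event_code != kExceptionDebugEvent) {
        return kDbgContinue;                     // non-exception events simply resume
    }
    if (exception_code == kExceptionBreakpoint ||
        exception_code == kExceptionSingleStep) {
        return kDbgContinue;                     // raised for the debugger's benefit
    }
    return kDbgExceptionNotHandled;              // let the target's own handlers run
}
```

For events other than exceptions both status values simply resume the thread, so the distinction only matters for EXCEPTION_DEBUG_EVENT.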
4. GetThreadContext & SetThreadContext.
Function prototype: BOOL GetThreadContext(
    HANDLE hThread, // target thread handle
    LPCONTEXT lpContext // CONTEXT structure that receives the registers
);
BOOL SetThreadContext(
    HANDLE hThread, // target thread handle
    const CONTEXT *lpContext // CONTEXT structure holding the new register values
);
Function analysis: these two functions read and write the registers of the target thread. Note that in Windows the unit of scheduling is the thread, not the process. Speaking of "the registers of a process" is therefore generally wrong, because one process may have many threads, and each has its own register values. Whenever we deal with registers we must say which thread we mean. Here the thread is given by the hThread parameter, a thread handle; for the main thread it is the hThread member of PROCESS_INFORMATION filled in by CreateProcess. The CONTEXT structure is defined according to the hardware platform. Windows has different CONTEXT definitions for Intel, MIPS, Alpha, and PowerPC, and each one mirrors the register set of its CPU. Although the contents of CONTEXT differ from CPU to CPU, the two functions are called in the same way everywhere, so debugger code that does not depend on the details of a specific CPU can still be used across platforms.
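The get, modify, set pattern behind these two functions can be illustrated without Windows at all. In the sketch below, MockContext and MockThreads are invented stand-ins for CONTEXT and for the threads of a target process. The example sets the x86 trap flag (bit 8 of EFLAGS), which is how single-stepping is normally requested on that CPU; the article itself does not show this step, so treat it as an illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Stand-in for the x86 CONTEXT structure: only two registers are modeled.
struct MockContext {
    std::uint32_t Eip = 0;
    std::uint32_t EFlags = 0;
};

constexpr std::uint32_t kTrapFlag = 0x100;  // x86 EFLAGS bit 8 (TF)

// Stand-in for a process: every thread ID has its own register set.
using MockThreads = std::map<std::uint32_t, MockContext>;

// Ask one thread to single-step. Returns false if the thread is unknown,
// much as the real API fails when given a bad thread handle.
bool request_single_step(MockThreads& threads, std::uint32_t thread_id) {
    auto it = threads.find(thread_id);
    if (it == threads.end()) return false;
    MockContext ctx = it->second;   // plays the role of GetThreadContext
    ctx.EFlags |= kTrapFlag;        // modify only what we need
    it->second = ctx;               // plays the role of SetThreadContext
    return true;
}
```

Because the registers live in the thread, changing thread 2 leaves thread 1 untouched, which is exactly why the real functions take a thread handle rather than a process handle.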
5. ReadProcessMemory & WriteProcessMemory.
Function prototype: BOOL ReadProcessMemory(
    HANDLE hProcess, // process handle
    LPCVOID lpBaseAddress, // base address of the memory to read
    LPVOID lpBuffer, // buffer that receives the data
    SIZE_T nSize, // number of bytes to read
    SIZE_T *lpNumberOfBytesRead // number of bytes actually read
);
BOOL WriteProcessMemory(
    HANDLE hProcess, // process handle
    LPVOID lpBaseAddress, // base address of the memory to write
    LPCVOID lpBuffer, // buffer holding the data to write
    SIZE_T nSize, // number of bytes to write
    SIZE_T *lpNumberOfBytesWritten // number of bytes actually written
);
Function analysis: these two functions read and write the address space of the target process. Unlike registers, memory is allocated by Windows per process. From the Intel 386 onward, x86 CPUs support protected mode, and in protected mode 32-bit Windows gives every application, that is, every process, its own 4 GB virtual address space. All threads of the process share this 4 GB space. So, unlike the register functions above, the memory functions take a process handle, which again comes from the PROCESS_INFORMATION structure filled in by CreateProcess. Besides the process handle we need a base address and a length to specify the memory range the debugger wants to read or write. The base address must be an address in the target's flat 4 GB virtual address space, not a segment:offset pair or a physical address.
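How a base address and a length select a range can be shown with a mock address space. MockRegion, mock_read and mock_write below are invented for this sketch; they imitate the shape of ReadProcessMemory and WriteProcessMemory on an in-memory buffer and refuse any range that is not entirely inside the region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for one committed region of the target's address space.
struct MockRegion {
    std::uintptr_t base;              // virtual address of bytes[0]
    std::vector<std::uint8_t> bytes;  // contents of the region
};

// True if [address, address + size) lies completely inside the region.
bool range_ok(const MockRegion& r, std::uintptr_t address, std::size_t size) {
    if (address < r.base) return false;
    const std::uintptr_t offset = address - r.base;
    return offset <= r.bytes.size() && size <= r.bytes.size() - offset;
}

// Copy `size` bytes from the region into `buffer`; all or nothing.
bool mock_read(const MockRegion& r, std::uintptr_t address,
               void* buffer, std::size_t size) {
    if (!range_ok(r, address, size)) return false;
    if (size != 0) std::memcpy(buffer, r.bytes.data() + (address - r.base), size);
    return true;
}

// Copy `size` bytes from `buffer` into the region; all or nothing.
bool mock_write(MockRegion& r, std::uintptr_t address,
                const void* buffer, std::size_t size) {
    if (!range_ok(r, address, size)) return false;
    if (size != 0) std::memcpy(r.bytes.data() + (address - r.base), buffer, size);
    return true;
}
```

A region based at 0x00400000 that holds four bytes can be read at 0x00400000 for two bytes, but a two-byte read at 0x00400003 runs off the end and fails.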

DEBUG_EVENT in Detail
Throughout the debug loop, all interaction between the debugger and the target process goes through the DEBUG_EVENT structure passed to WaitForDebugEvent. At first glance its definition looks simple:
typedef struct _DEBUG_EVENT {
    DWORD dwDebugEventCode;
    DWORD dwProcessId;
    DWORD dwThreadId;
    union {
        EXCEPTION_DEBUG_INFO Exception;
        CREATE_THREAD_DEBUG_INFO CreateThread;
        CREATE_PROCESS_DEBUG_INFO CreateProcessInfo;
        EXIT_THREAD_DEBUG_INFO ExitThread;
        EXIT_PROCESS_DEBUG_INFO ExitProcess;
        LOAD_DLL_DEBUG_INFO LoadDll;
        UNLOAD_DLL_DEBUG_INFO UnloadDll;
        OUTPUT_DEBUG_STRING_INFO DebugString;
        RIP_INFO RipInfo;
    } u;
} DEBUG_EVENT, *LPDEBUG_EVENT;
However, with a union added to the three common fields, this structure does a lot more than its size suggests. Think about it for a moment. As a debugger, we should at least be notified when:
The target process starts.
The target process raises an exception.
The target process exits.
These are the three most basic debug events, and each carries a different kind of information. For example:
When the target process starts, we need its module name and permission settings.
When the target process raises an exception, we need the address of the exception, its cause (of which there are many kinds), and its severity.
When the target process exits, we need its exit code to tell whether it exited normally.
This list covers only the bare essentials. The full data interaction layer of the Windows debug API is far more complicated, and it is hard to express all of that information in a single structure.
Microsoft chose to identify the kind of event through dwDebugEventCode and to pack the event-specific data into the union. The advantage is that WaitForDebugEvent needs only one uniform structure parameter. The disadvantage is that the inside of DEBUG_EVENT is quite intricate, which makes programming against it more work.
When using DEBUG_EVENT, we first read dwDebugEventCode to find out which member of the union is valid. The relationship between the two is shown in the following table:

Value of dwDebugEventCode     Type of union member         Debugging information
EXCEPTION_DEBUG_EVENT         EXCEPTION_DEBUG_INFO         Application exception
CREATE_THREAD_DEBUG_EVENT     CREATE_THREAD_DEBUG_INFO     Thread created
CREATE_PROCESS_DEBUG_EVENT    CREATE_PROCESS_DEBUG_INFO    Process created
EXIT_THREAD_DEBUG_EVENT       EXIT_THREAD_DEBUG_INFO       Thread exited
EXIT_PROCESS_DEBUG_EVENT      EXIT_PROCESS_DEBUG_INFO      Process exited
LOAD_DLL_DEBUG_EVENT          LOAD_DLL_DEBUG_INFO          DLL loaded
UNLOAD_DLL_DEBUG_EVENT        UNLOAD_DLL_DEBUG_INFO        DLL unloaded
OUTPUT_DEBUG_STRING_EVENT     OUTPUT_DEBUG_STRING_INFO     Debug string output
RIP_EVENT                     RIP_INFO                     System debugging error
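The "read the code first, then pick the union member" rule from the table can be sketched with a reduced mock of DEBUG_EVENT. Only two union members are modeled, all Mock-prefixed types and the describe function are invented for this example, and the event code values are the ones defined in the Windows SDK.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Event codes with the same numeric values as the Windows SDK constants.
enum : std::uint32_t {
    kExceptionEvent = 1, kCreateThread = 2, kCreateProcess = 3,
    kExitThread = 4, kExitProcess = 5, kLoadDll = 6,
    kUnloadDll = 7, kOutputDebugString = 8, kRip = 9
};

struct MockExceptionInfo   { std::uint32_t code; std::uint32_t first_chance; };
struct MockExitProcessInfo { std::uint32_t exit_code; };

// Reduced stand-in for DEBUG_EVENT: a tag plus a union selected by the tag.
struct MockDebugEvent {
    std::uint32_t event_code;
    std::uint32_t process_id;
    std::uint32_t thread_id;
    union {
        MockExceptionInfo   exception;     // valid when event_code == kExceptionEvent
        MockExitProcessInfo exit_process;  // valid when event_code == kExitProcess
    } u;
};

// Read the tag first, then touch only the union member that the tag selects.
std::string describe(const MockDebugEvent& ev) {
    switch (ev.event_code) {
    case kExceptionEvent:
        return "exception code=" + std::to_string(ev.u.exception.code);
    case kExitProcess:
        return "process exited with " + std::to_string(ev.u.exit_process.exit_code);
    default:
        return "event " + std::to_string(ev.event_code);
    }
}
```

Reading a union member that does not match the event code would interpret unrelated bytes, which is why every real debug loop starts with a switch on dwDebugEventCode.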
The table above is only the first level of parsing the DEBUG_EVENT structure. The second-level structures selected by each event code are not much simpler than the first. Most of the time, however, we only need to parse DEBUG_EVENT down to the second level.
In general, the event code we care about most is EXCEPTION_DEBUG_EVENT. Unfortunately, the corresponding EXCEPTION_DEBUG_INFO is also the most complex of all the union members, so it is worth describing this second-level structure further:
typedef struct _EXCEPTION_DEBUG_INFO {
    EXCEPTION_RECORD ExceptionRecord;
    DWORD dwFirstChance;
} EXCEPTION_DEBUG_INFO, *LPEXCEPTION_DEBUG_INFO;
dwFirstChance: if it is nonzero, this is the first time the debugger has seen this exception, that is, our debugger is at the head of the Windows exception handling chain and the target's own handlers have not run yet. If it is zero, the debugger has already seen the exception once and no handler in the target dealt with it.
ExceptionRecord: this structure is where the actual exception information of EXCEPTION_DEBUG_INFO is stored, so let's analyze it as well.
typedef struct _EXCEPTION_RECORD {
    DWORD ExceptionCode;
    DWORD ExceptionFlags;
    struct _EXCEPTION_RECORD *ExceptionRecord;
    PVOID ExceptionAddress;
    DWORD NumberParameters;
    ULONG_PTR ExceptionInformation[EXCEPTION_MAXIMUM_PARAMETERS];
} EXCEPTION_RECORD, *PEXCEPTION_RECORD;
First, note that this structure contains a member of type EXCEPTION_RECORD *. This is the classic C way of chaining data: each record points to the next through a pointer field, forming a linked list (a "chain"). Nested exceptions are recorded this way.
Second, the ExceptionCode member identifies the kind of exception this record describes. Windows defines the following 20 exception codes:
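Walking such a chain is an ordinary linked-list traversal. The sketch below uses an invented MockExceptionRecord with just a code and a link. In a real debugger the link would be an address inside the debugged process, so it would most likely have to be fetched with ReadProcessMemory rather than dereferenced directly; the sketch ignores that and keeps everything in local memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for EXCEPTION_RECORD: only the code and the chain pointer.
struct MockExceptionRecord {
    std::uint32_t code;
    const MockExceptionRecord* next;  // plays the role of the nested ExceptionRecord pointer
};

// Number of records in the chain, starting at `rec`.
std::size_t chain_length(const MockExceptionRecord* rec) {
    std::size_t n = 0;
    for (; rec != nullptr; rec = rec->next) ++n;
    return n;
}

// Code of the last record in the chain, i.e. the oldest nested exception.
std::uint32_t innermost_code(const MockExceptionRecord* rec) {
    assert(rec != nullptr);
    while (rec->next != nullptr) rec = rec->next;
    return rec->code;
}
```

A chain with more than one record means an exception was raised while another one was still being processed.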

Value                                 Meaning
EXCEPTION_ACCESS_VIOLATION            Read or write to a virtual address the thread has no access to
EXCEPTION_ARRAY_BOUNDS_EXCEEDED       Out-of-bounds array access detected by hardware bounds checking
EXCEPTION_BREAKPOINT                  A breakpoint was hit
EXCEPTION_DATATYPE_MISALIGNMENT       Access to misaligned data
EXCEPTION_FLT_DENORMAL_OPERAND        Floating-point operand is denormal (too small to be represented as a standard floating-point value)
EXCEPTION_FLT_DIVIDE_BY_ZERO          Floating-point division by zero
EXCEPTION_FLT_INEXACT_RESULT          Floating-point result cannot be represented exactly as a decimal fraction
EXCEPTION_FLT_INVALID_OPERATION       Any other floating-point error
EXCEPTION_FLT_OVERFLOW                Floating-point result is too large (exponent overflow)
EXCEPTION_FLT_STACK_CHECK             Floating-point stack overflow or underflow
EXCEPTION_FLT_UNDERFLOW               Floating-point result is too small (exponent underflow)
EXCEPTION_ILLEGAL_INSTRUCTION         Attempt to execute an invalid instruction
EXCEPTION_IN_PAGE_ERROR               Access to a page that is not present and could not be loaded
EXCEPTION_INT_DIVIDE_BY_ZERO          Integer division by zero
EXCEPTION_INT_OVERFLOW                Integer operation overflowed
EXCEPTION_INVALID_DISPOSITION         An exception handler returned an invalid disposition
EXCEPTION_NONCONTINUABLE_EXCEPTION    Attempt to continue after an exception that cannot be continued
EXCEPTION_PRIV_INSTRUCTION            Instruction is not allowed in the current processor mode
EXCEPTION_SINGLE_STEP                 Single-step trap: one instruction has been executed
EXCEPTION_STACK_OVERFLOW              The thread's stack overflowed
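The severity mentioned earlier is encoded in the exception code itself: the top two bits of the 32-bit code give the severity class. The sketch below decodes it and maps a few codes to their names. The numeric values are those of the Windows SDK; the k-prefixed constants, the Severity enum and both functions are names made up for this example, and only four of the twenty codes are covered.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr std::uint32_t kBreakpoint      = 0x80000003; // EXCEPTION_BREAKPOINT
constexpr std::uint32_t kSingleStep      = 0x80000004; // EXCEPTION_SINGLE_STEP
constexpr std::uint32_t kAccessViolation = 0xC0000005; // EXCEPTION_ACCESS_VIOLATION
constexpr std::uint32_t kIntDivideByZero = 0xC0000094; // EXCEPTION_INT_DIVIDE_BY_ZERO

// Bits 31..30 of the code: 0 success, 1 informational, 2 warning, 3 error.
enum class Severity { Success = 0, Informational = 1, Warning = 2, Error = 3 };

constexpr Severity severity_of(std::uint32_t code) {
    return static_cast<Severity>(code >> 30);
}

// Name lookup for the few codes modeled here; everything else is "unknown".
const char* name_of(std::uint32_t code) {
    switch (code) {
    case kBreakpoint:      return "EXCEPTION_BREAKPOINT";
    case kSingleStep:      return "EXCEPTION_SINGLE_STEP";
    case kAccessViolation: return "EXCEPTION_ACCESS_VIOLATION";
    case kIntDivideByZero: return "EXCEPTION_INT_DIVIDE_BY_ZERO";
    default:               return "unknown";
    }
}
```

Note that the two exceptions a debugger relies on, breakpoint and single step, are only warnings, while faults such as an access violation are errors.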
The exceptions our debugger handles most often are EXCEPTION_BREAKPOINT and EXCEPTION_SINGLE_STEP. For example, suppose we want to do something to the target process when it reaches address 0x00400000. All we need is to arrange for the target to raise an exception when execution reaches 0x00400000. Our debugger receives that exception as a debug event and can then use Get/SetThreadContext and Read/WriteProcessMemory, described earlier, to control the target process.
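On x86, the usual way to make the target raise an exception at a chosen address is to overwrite the first byte of the instruction there with 0xCC (INT3) and remember the original byte. The article does not show this step, so the sketch below is our own illustration: a std::vector stands in for the target's code, and Breakpoint, arm and disarm are invented names. A real debugger would do the same byte swap through ReadProcessMemory and WriteProcessMemory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kInt3 = 0xCC;  // x86 one-byte breakpoint instruction

// Bookkeeping for one software breakpoint.
struct Breakpoint {
    std::size_t  offset;  // where the breakpoint was planted
    std::uint8_t saved;   // original byte that INT3 replaced
    bool         armed;   // true while the INT3 is in place
};

// Plant a breakpoint: save the original byte, then write INT3 over it.
Breakpoint arm(std::vector<std::uint8_t>& code, std::size_t offset) {
    assert(offset < code.size());
    Breakpoint bp{offset, code[offset], true};
    code[offset] = kInt3;
    return bp;
}

// Remove a breakpoint: put the original byte back.
void disarm(std::vector<std::uint8_t>& code, Breakpoint& bp) {
    if (!bp.armed) return;
    code[bp.offset] = bp.saved;
    bp.armed = false;
}
```

When the target executes the INT3, the debugger receives EXCEPTION_BREAKPOINT. Before resuming it typically restores the original byte and moves the instruction pointer back by one with SetThreadContext, so that the original instruction runs.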

A Preliminary Debugger Framework
What we have covered so far is enough to write a very basic debugger. Its job is simple: create the target process with DEBUG_PROCESS | DEBUG_ONLY_THIS_PROCESS as the creation flags and let it run as usual.
The first code that comes to mind is as follows:
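#include <windows.h>
#include <tchar.h>
int main()
{
    STARTUPINFO sif = { sizeof(sif) };   // cb must hold the structure size
    PROCESS_INFORMATION pi = {};
    DEBUG_EVENT dbEvent = {};
    DWORD dwState;
    bool stop = false;
    ::CreateProcess(_T("msg.exe"), NULL, NULL, NULL, FALSE,
                    DEBUG_PROCESS | DEBUG_ONLY_THIS_PROCESS, NULL, NULL, &sif, &pi);
    // Below is the well-known debug loop!
    do
    {
        ::WaitForDebugEvent(&dbEvent, INFINITE);
        dwState = DBG_EXCEPTION_NOT_HANDLED;
        switch (dbEvent.dwDebugEventCode)
        {
        case EXIT_PROCESS_DEBUG_EVENT:
            {
                stop = true;
                break;
            }
        }
        if (!stop)
        {
            // Resume the thread that reported the event.
            ::ContinueDebugEvent(dbEvent.dwProcessId, dbEvent.dwThreadId, dwState);
        }
    } while (!stop);
    return 0;
}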
Compile it and run it. Unfortunately, Windows reports an error.
Why does this happen? Microsoft's documentation does not describe in detail how a debugger is supposed to behave. By tracing the program, we find that every time the target application (msg.exe) starts, Windows sends an EXCEPTION_BREAKPOINT to our debugger, and our debug loop does not handle it. So we now add handling for this exception, so that the debugger returns DBG_CONTINUE for it instead of DBG_EXCEPTION_NOT_HANDLED. In the code, find the line:
case EXIT_PROCESS_DEBUG_EVENT:
and add the following before it:
case EXCEPTION_DEBUG_EVENT:
    {
        switch (dbEvent.u.Exception.ExceptionRecord.ExceptionCode)
        {
        case EXCEPTION_BREAKPOINT:
            {
                dwState = DBG_CONTINUE;
                break;
            }
        }
        break;
    }
Compile, link, and run the test. Everything works: msg.exe pops up its MessageBox as usual.
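The difference between the first attempt and the fixed version can be replayed without Windows. The sketch below is a simulation, not a debugger: a vector of SimEvent values stands in for the events WaitForDebugEvent would deliver, and the simulated target is assumed to have no exception handler of its own, so any exception the debugger leaves unhandled ends it. All names are invented for the example; the numeric constants match the Windows SDK.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kExceptionEvent         = 1;          // EXCEPTION_DEBUG_EVENT
constexpr std::uint32_t kCreateProcessEvent     = 3;          // CREATE_PROCESS_DEBUG_EVENT
constexpr std::uint32_t kExitProcessEvent       = 5;          // EXIT_PROCESS_DEBUG_EVENT
constexpr std::uint32_t kExceptionBreakpoint    = 0x80000003; // EXCEPTION_BREAKPOINT
constexpr std::uint32_t kDbgContinue            = 0x00010002; // DBG_CONTINUE
constexpr std::uint32_t kDbgExceptionNotHandled = 0x80010001; // DBG_EXCEPTION_NOT_HANDLED

struct SimEvent { std::uint32_t event_code; std::uint32_t exception_code; };
enum class Outcome { RanToCompletion, KilledByException };
using Policy = std::uint32_t (*)(const SimEvent&);

// First attempt from the article: every event is answered with "not handled".
std::uint32_t first_attempt(const SimEvent&) { return kDbgExceptionNotHandled; }

// Fixed version: the breakpoint exception is answered with DBG_CONTINUE.
std::uint32_t fixed_version(const SimEvent& ev) {
    const bool is_breakpoint = ev.event_code == kExceptionEvent &&
                               ev.exception_code == kExceptionBreakpoint;
    return is_breakpoint ? kDbgContinue : kDbgExceptionNotHandled;
}

// Replays the events through a policy and reports what happens to the target.
Outcome run_simulated_loop(const std::vector<SimEvent>& events, Policy policy) {
    for (const SimEvent& ev : events) {            // stands in for WaitForDebugEvent
        if (ev.event_code == kExitProcessEvent) return Outcome::RanToCompletion;
        const std::uint32_t status = policy(ev);   // stands in for ContinueDebugEvent
        // Assumption: the simulated target has no handler of its own.
        if (ev.event_code == kExceptionEvent && status == kDbgExceptionNotHandled)
            return Outcome::KilledByException;
    }
    return Outcome::RanToCompletion;
}

// Events of a simple target: process start, initial breakpoint, exit.
std::vector<SimEvent> startup_sequence() {
    return { {kCreateProcessEvent, 0},
             {kExceptionEvent, kExceptionBreakpoint},
             {kExitProcessEvent, 0} };
}
```

Running startup_sequence() through first_attempt ends with the target killed at the initial breakpoint, while fixed_version lets it reach the exit event.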

From C to C++
From the simple framework code above it is easy to see that every time we handle one more kind of exception we must add more branches to an increasingly complex switch-case, and the code quickly becomes bloated and hard to maintain. Yet the skeleton of the code never really changes; the only thing that varies is the handler needed for each situation. This brings to mind the Template Method design pattern: we put the code that decodes DEBUG_EVENT inside the debug loop into the framework, and have the framework call a different virtual hook function for each case. Then, to extend the debugger, we only need to derive from the existing debug_base class and override the hook methods.
The header debug_base.h contains the debug_base class, which implements the debugger template described above. The simplest use of this class is as follows:
debugger::debug_base debugger;
debugger.run_debug_loop(std::tstring(TEXT("msg.exe")));
These two lines do what the simple debugger of the previous section did: load the program to be debugged and let it run as usual.
debug_base provides the following hook functions:
Hook function name             Purpose
handle_first_exception         Called when the Windows kernel sends EXCEPTION_BREAKPOINT for the first time
handle_exception_breakpoint    Handler for EXCEPTION_BREAKPOINT
handle_single_step             Handler for EXCEPTION_SINGLE_STEP
handle_process_create          Handler for process creation
handle_process_exit            Handler for process exit
handle_thread_create           Handler for thread creation
handle_thread_exit             Handler for thread exit
handle_dll_load                Called when a DLL is loaded
handle_dll_unload              Called when a DLL is unloaded
handle_debug_wstring           Called when the debugged program calls OutputDebugString
In practice, we only need to derive a new class from debug_base and override the virtual functions we need in order to build our own debugger!
For example, suppose we want our debugger to:
Break at the program entry point and announce that the entry point has been reached.
Intercept the debug strings of the program being debugged and display them.
Then we can design our new class like this:
class mydebugger : public debugger::debug_base
{
    // Called once, when Windows sends the initial EXCEPTION_BREAKPOINT.
    virtual void handle_first_exception(const PROCESS_INFORMATION& pi)
    {
        ::MessageBox(NULL, TEXT("first interruption"), TEXT("Debugger worked"), MB_OK);
        return;
    }

    // Called with each string the target passes to OutputDebugString.
    virtual void handle_debug_wstring(std::wstring& debug_wstring)
    {
        ::MessageBox(NULL, ATL::CW2T(debug_wstring.c_str()), TEXT("Debug string"), MB_OK);
        return;
    }
};
Apart from using mydebugger in place of debug_base, the main program needs no changes. Once the debugger is running, it receives the debugging information of the program being debugged and displays it.
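The source of debug_base is not reproduced in this article, so the following is only a rough sketch of how such a Template Method class might be organized, not the actual implementation. It is portable: a vector of SimEvent values replaces the real WaitForDebugEvent calls, and sketch_debug_base, logging_debugger and the hook names used here are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::uint32_t kCreateProcessEvent     = 3;  // CREATE_PROCESS_DEBUG_EVENT
constexpr std::uint32_t kExitProcessEvent       = 5;  // EXIT_PROCESS_DEBUG_EVENT
constexpr std::uint32_t kOutputDebugStringEvent = 8;  // OUTPUT_DEBUG_STRING_EVENT

// Simulated debug event: a code plus the debug string, if any.
struct SimEvent { std::uint32_t event_code; std::string text; };

class sketch_debug_base {
public:
    virtual ~sketch_debug_base() = default;

    // The fixed skeleton: decode each event and call the matching hook.
    void run_debug_loop(const std::vector<SimEvent>& events) {
        for (const SimEvent& ev : events) {
            switch (ev.event_code) {
            case kCreateProcessEvent:     handle_process_create();      break;
            case kOutputDebugStringEvent: handle_debug_string(ev.text); break;
            case kExitProcessEvent:       handle_process_exit();        return;
            default:                                                    break;
            }
        }
    }

protected:
    // Hooks do nothing by default; derived classes override what they need.
    virtual void handle_process_create() {}
    virtual void handle_process_exit() {}
    virtual void handle_debug_string(const std::string&) {}
};

// A derived debugger that records debug strings and the exit event.
class logging_debugger : public sketch_debug_base {
public:
    std::vector<std::string> log;

protected:
    void handle_debug_string(const std::string& s) override { log.push_back(s); }
    void handle_process_exit() override { log.push_back("<exit>"); }
};
```

The loop is written once in the base class, and a new debugger only states what should happen for the events it cares about, which is the property the article relies on when it derives mydebugger.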

Conclusion
Through the debug API, Windows provides a fully functional ring 3 debugging platform. However, because documentation and sample code are scarce, the debug API is still not widely used. In fact, once you grasp the essence of dynamic debugging, building a purpose-made debugger with the debug API to debug some very unusual programs can be highly effective. It is fair to say that any unpacker or in-memory keygen is essentially a specific application of the debug API.
Of course, space is limited, and this article cannot explore the world of Windows debuggers in more detail. Our debug_base class is still very rudimentary: it cannot even add breakpoints while the program is running. But once we understand how a debugger works, we can gradually add and refine features with the help of low-level programming material from Microsoft and Intel. I hope readers will take care to accumulate, organize, and digest the fundamentals as they learn reverse engineering, and keep deepening their understanding of computer systems, so that they eventually become experts and create a SoftICE or OllyDbg made in China!

 
