Three methods for VC++ to inject code into other processes


Introduction:
On CodeProject (www.codeproject.com) we can find many password spy programs, all of which rely on Windows hooks. Is there any other way to implement such a tool? Yes! But first, let's briefly review the problem we want to solve so that it is clear what I am talking about.
To read the content of a control, whether or not it belongs to your own program, you generally send the WM_GETTEXT message to that control. This also works for edit controls, with one exception: if the edit control belongs to another process and has the ES_PASSWORD style, the approach fails. Only the process that owns the password control can use WM_GETTEXT to obtain its content. So our question is: how do we make the following code run in the address space of another process?
::SendMessage(hPwdEdit, WM_GETTEXT, nMaxChars, (LPARAM)psBuffer);

Generally, there are three possible solutions to this problem:
1. Put your code in a DLL, then map the DLL into the remote process with a Windows hook.
2. Put your code in a DLL, then map the DLL into the remote process using CreateRemoteThread and LoadLibrary.
3. Without using a DLL, copy your code directly into the remote process (using WriteProcessMemory) and execute it with CreateRemoteThread.
Each technique is described in detail below.

I. Windows Hooks

Example programs: HookSpy and HookInjEx

Windows hooks are used to monitor the message traffic of a thread. They fall into two categories:
1. Local hooks monitor the message traffic of a thread that belongs to your own process.
2. Remote hooks, which are either:
A. thread-specific, monitoring the message traffic of a thread that belongs to another process; or
B. system-wide, monitoring the message traffic of all threads currently running on the system.

If the hooked thread belongs to another process (cases 2A and 2B), your hook procedure must reside in a dynamic-link library (DLL). The system then maps the DLL containing the hook procedure into the address space of the hooked thread's process. Windows maps the entire DLL, not just the hook procedure. This is why Windows hooks can be used to inject code into the address space of another process.

I won't discuss hooks in depth here (please refer to the description of SetWindowsHookEx in MSDN). Instead, here are two tips that I could not find in the documentation:
1. After SetWindowsHookEx is called successfully, the system maps the DLL into the address space of the hooked thread automatically, but not necessarily immediately. Because Windows hooks are all about messages, the DLL is not mapped until an appropriate event occurs. For example:
If you install a hook that monitors all nonqueued messages (WH_CALLWNDPROC), the DLL is not mapped until a message is actually sent to the hooked thread (to one of its windows). In other words, if UnhookWindowsHookEx is called before a message is sent to the hooked thread, the DLL will never be mapped into the remote process (even though the call to SetWindowsHookEx succeeded). To force the mapping, send an appropriate message to that thread immediately after calling SetWindowsHookEx.

Similarly, after UnhookWindowsHookEx is called, the DLL is actually unmapped from the remote process only after a suitable event occurs.

2. Installing hooks affects system performance (especially system-wide hooks). However, this drawback is easy to avoid if you use a thread-specific hook solely as a DLL mapping mechanism and not to intercept messages. Consider the following code snippet:
BOOL APIENTRY DllMain(HANDLE hModule,
                      DWORD  ul_reason_for_call,
                      LPVOID lpReserved)
{
    if (ul_reason_for_call == DLL_PROCESS_ATTACH)
    {
        // Use LoadLibrary to increase the DLL's reference count
        char lib_name[MAX_PATH];
        ::GetModuleFileName((HMODULE)hModule, lib_name, MAX_PATH);
        ::LoadLibrary(lib_name);

        // Now the hook can be removed safely
        ::UnhookWindowsHookEx(g_hHook);
    }
    return TRUE;
}

Let's take a look. First, we map the DLL into the remote process with a hook. Then, as soon as the DLL has actually been mapped, we remove the hook. Normally the DLL would be unmapped at this point, as soon as the first message arrives at the hooked thread. We prevent that by calling LoadLibrary, which increases the DLL's reference count.

The remaining question is: how do we unmap the DLL once we are finished with it? UnhookWindowsHookEx won't do it, because we have already unhooked the thread. You can do it this way:
○ Install another hook just before you want to unmap the DLL;
○ Send a "special" message to the remote thread;
○ Intercept this message in the hook procedure of your new hook, and in response call FreeLibrary and UnhookWindowsHookEx.
Now hooks are used only while the DLL is being mapped into and unmapped from the remote process, so they do not affect the performance of the hooked thread in between. In other words, we have found a DLL mapping mechanism that works on both WinNT and Win9x (in contrast to the LoadLibrary technique discussed in section II) and does not degrade the performance of the target process.

But when should we use this trick? It is generally appropriate when the DLL has to stay in the remote process for a long time (for example, if you subclass a control that belongs to another process) and you want to interfere with the target process as little as possible. I did not use it in HookSpy, because there the DLL is injected only for a short time, just long enough to get the password. I demonstrate the method in another example, HookInjEx. HookInjEx maps a DLL into explorer.exe (and, of course, unmaps it at the end), where it subclasses the Start button. More precisely, I swapped the left and right mouse clicks of the Start button.

You can find the download links for HookSpy, HookInjEx, and their source code at the beginning of this article.

II. The CreateRemoteThread and LoadLibrary Technique
Example: LibSpy
Generally, any process can load a DLL dynamically by calling LoadLibrary. But how do we force an external process to call this function? The answer is CreateRemoteThread.
Let's first take a look at the declarations of LoadLibrary and FreeLibrary:

HINSTANCE LoadLibrary(
    LPCTSTR lpLibFileName   // address of filename of library module
);

BOOL FreeLibrary(
    HMODULE hLibModule      // handle to loaded library module
);

Now compare them with the declaration of ThreadProc, the thread procedure passed to CreateRemoteThread:
DWORD WINAPI ThreadProc(
    LPVOID lpParameter      // thread data
);

You will find that all three functions use the same calling convention, all accept one 32-bit parameter, and all return a value of the same size. In other words, we can pass a pointer to LoadLibrary or FreeLibrary to CreateRemoteThread as the thread procedure.

However, there are two problems (refer to the description of CreateRemoteThread below):

1. The lpStartAddress parameter of CreateRemoteThread must be the starting address of the thread procedure in the remote process.
2. If the lpParameter value passed to ThreadProc is interpreted as an ordinary 32-bit value (FreeLibrary interprets it as an HMODULE), there is no problem. However, if it is interpreted as a pointer (LoadLibrary interprets it as a pointer to a char string), it must point to memory in the remote process.

The first problem is effectively solved already, because LoadLibrary and FreeLibrary both reside in kernel32.dll, and kernel32.dll is guaranteed to be present in every "normal" process and loaded at the same address in each of them (see Appendix A). LoadLibrary and FreeLibrary therefore have the same address in every process, which ensures that the pointer passed to the remote process is valid.

The second problem is also easy to solve: copy the DLL file name (the parameter for LoadLibrary) into the remote process using WriteProcessMemory.

Therefore, the steps for using the CreateRemoteThread and LoadLibrary technique are as follows:
1. Get a handle to the remote process (OpenProcess).
2. Allocate memory for the DLL file name in the remote process (VirtualAllocEx).
3. Write the DLL file name, including the full path, to the allocated memory (WriteProcessMemory).
4. Map your DLL into the remote process using CreateRemoteThread and LoadLibrary.
5. Wait for the remote thread to terminate (WaitForSingleObject), that is, wait for LoadLibrary to return. In other words, the remote thread ends as soon as our DllMain (called with DLL_PROCESS_ATTACH) returns.
6. Retrieve the exit code of the remote thread (GetExitCodeThread). This is the return value of LoadLibrary, that is, the base address (HMODULE) at which our DLL was loaded.
7. Free the memory allocated in step 2 (VirtualFreeEx).
8. Unmap the DLL from the remote process using CreateRemoteThread and FreeLibrary, passing the HMODULE obtained in step 6 to FreeLibrary (through the lpParameter argument of CreateRemoteThread).
9. Wait for the thread to terminate (WaitForSingleObject).

Also, don't forget to close all the handles when you are done: the thread handles created in steps 4 and 8, and the process handle obtained in step 1.

Now let's look at parts of the LibSpy code to see how these steps are implemented. For simplicity, error handling and Unicode support are omitted.
HANDLE  hThread;
char    szLibPath[_MAX_PATH];  // file name of "LibSpy.dll"
                               // (including the full path!)
void*   pLibRemote;            // address in the remote process to which
                               // szLibPath is copied
DWORD   hLibModule;            // base address of the loaded DLL (HMODULE)
HMODULE hKernel32 = ::GetModuleHandle("Kernel32");

// Initialize szLibPath
//...

// 1. Allocate memory for szLibPath in the remote process
// 2. Write szLibPath to the allocated memory
pLibRemote = ::VirtualAllocEx(hProcess, NULL, sizeof(szLibPath),
                              MEM_COMMIT, PAGE_READWRITE);
::WriteProcessMemory(hProcess, pLibRemote, (void*)szLibPath,
                     sizeof(szLibPath), NULL);

// Load "LibSpy.dll" into the remote process
// (through CreateRemoteThread & LoadLibrary)
hThread = ::CreateRemoteThread(hProcess, NULL, 0,
              (LPTHREAD_START_ROUTINE)::GetProcAddress(hKernel32,
                                                       "LoadLibraryA"),
              pLibRemote, 0, NULL);
::WaitForSingleObject(hThread, INFINITE);

// Get the base address of the loaded DLL
::GetExitCodeThread(hThread, &hLibModule);

// Clean up
::CloseHandle(hThread);
// With MEM_RELEASE the size argument must be 0
::VirtualFreeEx(hProcess, pLibRemote, 0, MEM_RELEASE);

The code to be injected (for example, the SendMessage call) lives in DllMain and has already been executed by now (on DLL_PROCESS_ATTACH), so we can unmap the DLL from the target process.

// Unmap LibSpy.dll from the target process
// (through CreateRemoteThread & FreeLibrary)
hThread = ::CreateRemoteThread(hProcess, NULL, 0,
              (LPTHREAD_START_ROUTINE)::GetProcAddress(hKernel32,
                                                       "FreeLibrary"),
              (void*)hLibModule, 0, NULL);
::WaitForSingleObject(hThread, INFINITE);

// Clean up
::CloseHandle(hThread);

Interprocess communication
So far we have only discussed how to inject a DLL into a remote process. In most cases, however, the injected DLL needs to communicate with your application in some way (remember, the DLL is mapped into the remote process, not into your local application!). Take the password spy as an example: the DLL needs to know the handle of the control that contains the password. Obviously, this handle cannot be hardcoded at compile time. Likewise, once the DLL has obtained the password, it has to send it back to our application.

Fortunately, there are many ways to solve this problem: file mapping, WM_COPYDATA, the clipboard, and the very convenient #pragma data_seg. I won't describe them here, because they are all well documented in MSDN (see the section on Interprocess Communications) and elsewhere. LibSpy uses #pragma data_seg.

You can find the download links for LibSpy and its source code at the beginning of this article.

III. The CreateRemoteThread and WriteProcessMemory Technique
Example program: WinSpy

Another way to inject code into the address space of another process is to use the WriteProcessMemory API. Instead of writing a separate DLL, you copy your code directly into the remote process (WriteProcessMemory) and start executing it with CreateRemoteThread.

Let's take a look at the declaration of CreateRemoteThread:
HANDLE CreateRemoteThread(
    HANDLE hProcess,                          // handle to process to create thread in
    LPSECURITY_ATTRIBUTES lpThreadAttributes, // pointer to security attributes
    DWORD dwStackSize,                        // initial thread stack size, in bytes
    LPTHREAD_START_ROUTINE lpStartAddress,    // pointer to thread function
    LPVOID lpParameter,                       // argument for new thread
    DWORD dwCreationFlags,                    // creation flags
    LPDWORD lpThreadId                        // pointer to returned thread identifier
);

It differs from CreateThread in the following ways:

● There is an additional hProcess parameter: the handle of the process in which the thread is to be created.
● The lpStartAddress parameter must point to a function in the address space of the remote process. The function must exist in the remote process, so we cannot simply pass the address of a local ThreadFunc. We first have to copy the code to the remote process.
● Similarly, the data that lpParameter points to must exist in the remote process, so we have to copy it there as well.

Now, let's summarize the steps of this technique:

1. Get a handle to the remote process (OpenProcess).
2. Allocate memory in the remote process for the injected data (VirtualAllocEx).
3. Copy the initialized INJDATA structure to the allocated memory (WriteProcessMemory).
4. Allocate memory in the remote process for the injected code (VirtualAllocEx).
5. Copy ThreadFunc to the allocated memory (WriteProcessMemory).
6. Start the remote copy of ThreadFunc with CreateRemoteThread.
7. Wait for the remote thread to terminate (WaitForSingleObject).
8. Retrieve the result from the remote process (ReadProcessMemory or GetExitCodeThread).
9. Free the memory allocated in steps 2 and 4 (VirtualFreeEx).
10. Close the handles obtained in steps 1 and 6 (CloseHandle).

In addition, ThreadFunc must obey the following rules:
1. ThreadFunc must not call functions in any dynamic library other than kernel32.dll and user32.dll. Only kernel32.dll and user32.dll (if loaded) are guaranteed to have the same load address in the local process and in the target process. (Note: user32.dll is not necessarily loaded by every Win32 process!) See the appendix. If you need functions from other libraries, use LoadLibrary and GetProcAddress in the injected code to load them explicitly. If for some reason the library you need is already mapped into the target process, you can use GetModuleHandle instead of LoadLibrary. Likewise, if you want to call your own functions from ThreadFunc, copy each of them to the remote process and give ThreadFunc their addresses through INJDATA.
2. Do not use static strings. Pass all strings to ThreadFunc through INJDATA. Why? The compiler places all static strings in the ".data" section of the executable and keeps only references (pointers) to them in the code. The copy of ThreadFunc in the remote process would then refer to data that does not exist there (at least not at the same address).
3. Remove the /GZ compiler switch. It is set by default in debug builds (see Appendix B).
4. Either declare ThreadFunc and AfterThreadFunc as static, or disable incremental linking (see Appendix C).
5. The local variables in ThreadFunc must total less than 4 KB (see Appendix D). Note that in debug builds about 10 bytes of those 4 KB are already used.
6. If you have a switch statement with more than three case branches, either split it up as follows or replace it with an if-else if sequence:

switch (expression) {
    case constant1: statement1; goto END;
    case constant2: statement2; goto END;
    case constant3: statement3; goto END;
}
switch (expression) {
    case constant4: statement4; goto END;
    case constant5: statement5; goto END;
    case constant6: statement6; goto END;
}
END:
(See Appendix E)
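
Rule 6 also allows an if-else if sequence in place of the split switch. The following is a minimal sketch of that alternative, not code from WinSpy: the function name handleExpression and the values 1 to 6 are hypothetical stand-ins for constant1 to constant6, and each branch simply returns the number of the statement it would run. The presumed motivation (Appendix E holds the author's explanation) is that a compiler may turn a larger switch into a jump table of absolute addresses, which would no longer be valid once the code is copied elsewhere.

```cpp
#include <cassert>

// Hypothetical stand-in for the six-way dispatch shown above.
// Returns the number of the "statement" that would run, or 0 if
// no branch matches.
int handleExpression(int expression)
{
    if (expression == 1)      return 1;  // constant1: statement1
    else if (expression == 2) return 2;  // constant2: statement2
    else if (expression == 3) return 3;  // constant3: statement3
    else if (expression == 4) return 4;  // constant4: statement4
    else if (expression == 5) return 5;  // constant5: statement5
    else if (expression == 6) return 6;  // constant6: statement6
    return 0;                            // no branch matched
}
```

An optimizing compiler is free to transform such a chain as well, so it is still worth checking the generated code in a disassembler before copying it to another process.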

If you do not play by these rules, you will almost certainly crash the target process! Just remember: don't assume that anything in the target process is at the same address as it is in your own process (see Appendix F).

GetWindowTextRemote(A/W)

All the work needed to get the text of a remote edit control is encapsulated in this function, GetWindowTextRemote(A/W):
int GetWindowTextRemoteA(HANDLE hProcess, HWND hWnd, LPSTR lpString);
int GetWindowTextRemoteW(HANDLE hProcess, HWND hWnd, LPWSTR lpString);

Parameters:
hProcess
Handle to the process that owns the edit control
hWnd
Handle to the edit control that contains the text
lpString
Buffer that receives the text

Return value:
The number of characters copied.

Let's look at parts of its code, especially the injected data and code. For simplicity, the Unicode-related code is not shown.

INJDATA

typedef LRESULT (WINAPI *SENDMESSAGE)(HWND, UINT, WPARAM, LPARAM);

typedef struct {
    HWND        hwnd;           // handle to edit control
    SENDMESSAGE fnSendMessage;  // pointer to user32!SendMessageA

    char        psText[128];    // buffer that is to receive the password
} INJDATA;

INJDATA is the data structure that is injected into the remote process. Before injecting it, we initialize its pointer to SendMessageA in our own application. Fortunately, user32.dll is always mapped to the same address in every process (if it is mapped at all), so the address of SendMessageA is always the same. This ensures that the pointer passed to the remote process is valid.

ThreadFunc

static DWORD WINAPI ThreadFunc(INJDATA *pData)
{
    pData->fnSendMessage(pData->hwnd, WM_GETTEXT,   // get the password
                         sizeof(pData->psText),
                         (LPARAM)pData->psText);
    return 0;
}

// This function marks the memory address just after ThreadFunc:
// int cbCodeSize = (PBYTE)AfterThreadFunc - (PBYTE)ThreadFunc;
static void AfterThreadFunc(void)
{
}

ThreadFunc is the code that the remote thread actually executes.
● Note how AfterThreadFunc is used to calculate the code size of ThreadFunc. In general this is not the best approach, because the compiler is free to change the order of your functions (for example, it could place ThreadFunc after AfterThreadFunc). However, you can at least be sure that within one project, such as our WinSpy project, the order of the functions stays the same. If necessary, you can use the /ORDER linker option, or, perhaps better, determine the size of ThreadFunc with a disassembler.
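
Because the subtraction above silently yields nonsense when the functions end up in a different order, it can help to validate the result before copying anything. The following is a minimal sketch under stated assumptions, not part of WinSpy: checkedCodeSize and its maxBytes limit are hypothetical, and the addresses are passed as plain integers so that the check can be tried out in isolation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the number of bytes between the start of a function and the
// marker function that follows it, or 0 when the result is implausible:
// the marker is not located after the function, or the distance exceeds
// maxBytes (a caller-chosen upper bound).
std::size_t checkedCodeSize(std::uintptr_t funcStart,
                            std::uintptr_t markerAfter,
                            std::size_t    maxBytes)
{
    if (markerAfter <= funcStart)
        return 0;                         // functions were reordered
    const std::uintptr_t size = markerAfter - funcStart;
    if (size > maxBytes)
        return 0;                         // suspiciously large, refuse
    return static_cast<std::size_t>(size);
}
```

In the setting of this article the call might look like checkedCodeSize((uintptr_t)ThreadFunc, (uintptr_t)AfterThreadFunc, 4096), where 4096 is only an illustrative limit. A result of 0 would mean that the size should be determined in some other way, for example with a disassembler.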

How to subclass a remote control with this technique
Example: InjectEx

Let's discuss a more complicated problem: how do we subclass a control that belongs to another process?

First, to accomplish this task you have to copy two functions to the remote process:
1. ThreadFunc, which subclasses the control in the remote process by calling the SetWindowLong API; and
2. NewProc, the new window procedure of the subclassed control.

The main problem, however, is how to pass data to the remote NewProc. Because NewProc is a callback function that must conform to a specific signature, we cannot simply pass it an INJDATA pointer as a parameter. Fortunately, I have found two ways to solve this problem, but both rely on assembly language. I have always tried to avoid assembly, but this time there is no way around it.

Solution 1
Take a look at the following picture (not reproduced here):

You may have noticed that INJDATA is placed immediately before NewProc in the remote process. This way NewProc knows the memory location of INJDATA at compile time. More precisely, it knows the location of INJDATA relative to its own address, and that is all we really need. NewProc now looks like this:
static LRESULT CALLBACK NewProc(
    HWND   hwnd,    // handle to window
    UINT   uMsg,    // message identifier
    WPARAM wParam,  // first message parameter
    LPARAM lParam)  // second message parameter
{
    INJDATA* pData = (INJDATA*)NewProc;  // pData points to NewProc
    pData--;             // now pData points to INJDATA;
                         // remember, in the remote process INJDATA
                         // is located immediately before NewProc

    //-----------------------------
    // Subclassing code
    // ........
    //-----------------------------

    // Call the original window procedure;
    // fnOldProc (returned by SetWindowLong) was initialized by the
    // remote ThreadFunc and stored in the remote INJDATA
    return pData->fnCallWindowProc(pData->fnOldProc,
                                   hwnd, uMsg, wParam, lParam);
}
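
The layout that Solution 1 relies on, with INJDATA placed immediately before NewProc in one remote block, can be sketched without any Windows API. This is an illustration under stated assumptions rather than the InjectEx code: InjDataSketch is a hypothetical 20-byte stand-in for INJDATA (five 32-bit fields, of which only fnCallWindowProc and fnOldProc appear in the article), and remote addresses are modeled as 32-bit integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for INJDATA in a 32-bit process, where pointers
// and handles are 4 bytes wide. Field names other than fnCallWindowProc
// and fnOldProc are assumptions made for this sketch.
struct InjDataSketch {
    std::uint32_t fnSetWindowLong;
    std::uint32_t fnCallWindowProc;
    std::uint32_t fnNewProc;
    std::uint32_t fnOldProc;
    std::uint32_t hwnd;
};
static_assert(sizeof(InjDataSketch) == 20, "layout assumption of this sketch");

// Builds the local image [INJDATA | NewProc bytes] that would be copied
// into the remote block in one piece.
std::vector<std::uint8_t> buildImage(const InjDataSketch& data,
                                     const std::vector<std::uint8_t>& newProcCode)
{
    std::vector<std::uint8_t> image(sizeof(data) + newProcCode.size());
    std::memcpy(image.data(), &data, sizeof(data));
    if (!newProcCode.empty())
        std::memcpy(image.data() + sizeof(data),
                    newProcCode.data(), newProcCode.size());
    return image;
}

// Remote address of NewProc, given the base address of the remote block.
std::uint32_t remoteNewProcAddress(std::uint32_t remoteBase)
{
    return remoteBase + static_cast<std::uint32_t>(sizeof(InjDataSketch));
}

// What "pData--" computes inside NewProc: one INJDATA back from NewProc.
std::uint32_t remoteInjDataAddress(std::uint32_t newProcAddress)
{
    return newProcAddress - static_cast<std::uint32_t>(sizeof(InjDataSketch));
}
```

With a block based at 0x008A0000, for example, NewProc would start at 0x008A0014 and stepping back one structure from there lands on the base again, which is where INJDATA lives.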

However, there is still a problem. Look at the first line of NewProc:
INJDATA* pData = (INJDATA*)NewProc;

Here pData is hardcoded to the address of NewProc in our own process, which is wrong: NewProc is copied to the remote process, where it ends up at a different address.

There is no way to solve this problem in plain C/C++, but it can be solved with inline assembly. Take a look at the modified NewProc:

static LRESULT CALLBACK NewProc(
    HWND   hwnd,    // handle to window
    UINT   uMsg,    // message identifier
    WPARAM wParam,  // first message parameter
    LPARAM lParam)  // second message parameter
{
    // Calculate the address of INJDATA;
    // in the remote process INJDATA is located
    // immediately before NewProc
    INJDATA* pData;
    __asm {
        call    dummy
    dummy:
        pop     ecx         // <- ECX holds the current EIP
        sub     ecx, 9      // <- ECX holds the address of NewProc
        mov     pData, ecx
    }
    pData--;

    //-----------------------------
    // Subclassing code
    // ........
    //-----------------------------

    // Call the original window procedure
    return pData->fnCallWindowProc(pData->fnOldProc,
                                   hwnd, uMsg, wParam, lParam);
}

What does this mean? The processor has a special register that points to the memory location of the next instruction to be executed. On 32-bit Intel and AMD processors this instruction pointer is called EIP. Because EIP is a special-purpose register, you cannot access it the way you access a general-purpose register (such as EAX or EBX). In other words, there is no opcode with which you can address EIP and read or change its contents explicitly. However, EIP is changed implicitly by instructions such as JMP, CALL, and RET (in fact, it changes all the time). Let's see how CALL and RET work on 32-bit Intel and AMD processors:

When we call a subroutine with CALL, the address of the subroutine is loaded into EIP. But before EIP is modified, its old value is automatically pushed onto the stack (to be used later as the return instruction pointer). At the end of the subroutine, the RET instruction automatically pops this value from the stack back into EIP.

Now we know how to modify EIP through CALL and RET, but how do we read its current value?
Remember that CALL pushes the value of EIP onto the stack? So, to get the EIP value, we call a "dummy function" and then pop the top of the stack. Take a look at the compiled NewProc:

Address     Opcode/Params    Decoded instruction
--------------------------------------------------
:00401000   55               push ebp            ; entry point of
                                                 ; NewProc
:00401001   8BEC             mov ebp, esp
:00401003   51               push ecx
:00401004   E800000000       call 00401009       ; *a* call dummy
:00401009   59               pop ecx             ; *b*
:0040100A   83E909           sub ecx, 00000009   ; *c*
:0040100D   894DFC           mov [ebp-04], ecx   ; mov pData, ecx
:00401010   8B45FC           mov eax, [ebp-04]
:00401013   83E814           sub eax, 00000014   ; pData--;
.....
.....
:0040102D   8BE5             mov esp, ebp
:0040102F   5D               pop ebp
:00401030   C21000           ret 0010

A. A fake function call. It simply jumps to the next instruction and, more importantly, pushes EIP onto the stack.
B. Pops the top of the stack into ECX. ECX now holds the saved EIP value, which is exactly the address of the "pop ecx" instruction.
C. Note that the "distance" from the entry point of NewProc to the "pop ecx" instruction is 9 bytes. Subtracting 9 from ECX therefore yields the address of NewProc.

This way NewProc can always calculate its own address, no matter where it is copied! Be aware, however, that the distance from the entry point of NewProc to "pop ecx" may change with your compiler and linker options, and it also differs between release and debug builds. Still, you can always determine its exact value at compile time:
1. Compile your function first.
2. Look up the correct distance in a disassembler.
3. Finally, recompile your program with the correct distance value.
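
The address arithmetic performed by the call/pop/sub sequence can be checked with ordinary integers. The following is only a model of the three steps, not code that runs in the remote process: the function names are hypothetical, the 5-byte call length follows from the opcode E8 plus a 32-bit displacement, and the distance of 9 and the structure size of 0x14 are the values from the disassembly above.

```cpp
#include <cassert>
#include <cstdint>

// Models "call dummy": the processor pushes the address of the
// instruction that follows the call. This form of call is 5 bytes long
// (opcode E8 plus a 32-bit displacement).
std::uint32_t returnAddressPushedByCall(std::uint32_t callAddress)
{
    return callAddress + 5;
}

// Models "pop ecx" followed by "sub ecx, distance": recovers the entry
// point of NewProc from the popped instruction pointer.
std::uint32_t newProcFromEip(std::uint32_t poppedEip, std::uint32_t distance)
{
    assert(poppedEip >= distance);  // otherwise the distance is wrong
    return poppedEip - distance;
}

// Models "pData--" for an INJDATA that occupies injDataSize bytes.
std::uint32_t injDataFromNewProc(std::uint32_t newProc,
                                 std::uint32_t injDataSize)
{
    return newProc - injDataSize;
}
```

With the listing's numbers, the call at 0x00401004 pushes 0x00401009, subtracting 9 gives the entry point 0x00401000, and stepping back 0x14 bytes gives 0x00400FEC as the address of INJDATA.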

This is also the solution used in InjectEx. Like HookInjEx, InjectEx swaps the left and right mouse clicks of the Start button.

Solution 2

Placing INJDATA immediately before NewProc in the remote process is not the only solution. Look at the following variant of NewProc:
static LRESULT CALLBACK NewProc(
    HWND   hwnd,    // handle to window
    UINT   uMsg,    // message identifier
    WPARAM wParam,  // first message parameter
    LPARAM lParam)  // second message parameter
{
    INJDATA* pData = (INJDATA*)0xA0B0C0D0;  // a dummy value

    //-----------------------------
    // Subclassing code
    // ........
    //-----------------------------

    // Call the original window procedure
    return pData->fnCallWindowProc(pData->fnOldProc,
                                   hwnd, uMsg, wParam, lParam);
}

Here, 0xA0B0C0D0 is just a placeholder for the real address of INJDATA in the remote process. You cannot know this address at compile time, but you do know it after calling VirtualAllocEx to allocate the memory for INJDATA: it is the value that VirtualAllocEx returns.

Our compiled NewProc looks like this:
Address     Opcode/Params    Decoded instruction
--------------------------------------------------
:00401000   55               push ebp
:00401001   8BEC             mov ebp, esp
:00401003   C745FCD0C0B0A0   mov [ebp-04], A0B0C0D0
:0040100A   ...
....
....
:0040102D   8BE5             mov esp, ebp
:0040102F   5D               pop ebp
:00401030   C21000           ret 0010

The compiled machine code is therefore 558BECC745FCD0C0B0A0......8BE55DC21000.

Now proceed as follows:
1. Copy INJDATA, ThreadFunc, and NewProc to the target process.
2. Change the machine code of NewProc so that pData points to the real address of INJDATA.
For example, if the real address of INJDATA (the value returned by VirtualAllocEx) is 0x008A0000, change the machine code of NewProc as follows:

558BECC745FCD0C0B0A0......8BE55DC21000 <- NewProc before the modification
558BECC745FC00008A00......8BE55DC21000 <- NewProc after the modification

That is, replace the dummy value A0B0C0D0 with the real address of INJDATA (stored in little-endian byte order).
3. Start the remote ThreadFunc, which subclasses the control in the remote process.
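
Step 2 above amounts to a search-and-replace on the bytes of NewProc. The following is a minimal sketch of that patch, assuming the code has already been read into a local byte buffer: patchPlaceholder and storeLe32 are hypothetical helper names, not functions from InjectEx, and the sketch refuses to patch when the placeholder occurs more than once, since it could not tell which occurrence is the real one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Writes a 32-bit value as four little-endian bytes, the way x86 stores
// an immediate operand.
static void storeLe32(std::uint8_t* dst, std::uint32_t value)
{
    for (int i = 0; i < 4; ++i)
        dst[i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Replaces the single occurrence of `placeholder` in `code` with
// `realAddress`. Returns the byte offset that was patched, or -1 if the
// placeholder was not found or was found more than once.
long patchPlaceholder(std::vector<std::uint8_t>& code,
                      std::uint32_t placeholder,
                      std::uint32_t realAddress)
{
    std::uint8_t needle[4];
    storeLe32(needle, placeholder);

    long found = -1;
    for (std::size_t i = 0; i + 4 <= code.size(); ++i) {
        if (std::memcmp(&code[i], needle, 4) == 0) {
            if (found != -1)
                return -1;               // ambiguous: leave code untouched
            found = static_cast<long>(i);
        }
    }
    if (found == -1)
        return -1;                       // placeholder not present

    storeLe32(&code[static_cast<std::size_t>(found)], realAddress);
    return found;
}
```

Applied to the bytes from the listing with placeholder 0xA0B0C0D0 and address 0x008A0000, the patch lands at offset 6 and turns D0 C0 B0 A0 into 00 00 8A 00, matching the modified machine code shown above. The patched buffer is what would then be copied to the target process.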
