Memory Structure of a Windows Process

Source: Internet
Author: User

Anyone familiar with programming knows that high-level languages let you access data in memory through variable names. How are these variables stored in memory, and how does a program use them? This article looks at both questions in depth. Unless stated otherwise, the C code below was compiled as a release build with VC (Visual C++).
First, let's look at how C variables are laid out in memory. C has global, local, static, and register variables, and each kind is allocated differently. Consider the following code:
#include <stdio.h>

int g1 = 0, g2 = 0, g3 = 0;  /* global variables */

int main(void)
{
    static int s1 = 0, s2 = 0, s3 = 0;  /* static variables */
    int v1 = 0, v2 = 0, v3 = 0;         /* local variables */

    /* Print the memory address of each variable. The cast assumes a 32-bit
       build, as in this article; portable code would use "%p" with (void *). */
    printf("0x%08x\n", (unsigned int)&v1);  /* local variables */
    printf("0x%08x\n", (unsigned int)&v2);
    printf("0x%08x\n", (unsigned int)&v3);
    printf("0x%08x\n", (unsigned int)&g1);  /* global variables */
    printf("0x%08x\n", (unsigned int)&g2);
    printf("0x%08x\n", (unsigned int)&g3);
    printf("0x%08x\n", (unsigned int)&s1);  /* static variables */
    printf("0x%08x\n", (unsigned int)&s2);
    printf("0x%08x\n", (unsigned int)&s3);
    return 0;
}
Compiling and running it prints:
0x0012ff78
0x0012ff7c
0x0012ff80
0x004068d0
0x004068d4
0x004068d8
0x004068dc
0x004068e0
0x004068e4
The output shows the memory address of each variable: v1, v2, and v3 are local variables; g1, g2, and g3 are global variables; s1, s2, and s3 are static variables. Within each group the variables are laid out contiguously, but the local variables sit very far from the globals (0x0012ffxx versus 0x004068xx), while the global and static variables follow one another directly. This is because local variables and global/static variables are allocated in different kinds of memory areas.
The memory space of a process can be logically divided into three parts: the code zone, the static data zone, and the dynamic data zone. "Dynamic data zone" usually refers to the stack. The stack and the heap are two different kinds of dynamic data zone: the stack is a linear structure, while the heap is a linked structure. Each thread of a process has its own private stack, so although all threads may run the same code, their local variables do not interfere with one another. A stack is described by two addresses, its base address and its top. Global and static variables live in the static data zone, and local variables live in the dynamic data zone, that is, on the stack. The program accesses local variables through a stack address plus an offset.

┌────────────────────┐ low-end memory area
│       ......       │
├────────────────────┤
│ Dynamic data zone  │
├────────────────────┤
│       ......       │
├────────────────────┤
│     Code zone      │
├────────────────────┤
│  Static data zone  │
├────────────────────┤
│       ......       │
└────────────────────┘ high-end memory area

The stack is a first-in, last-out data structure. On x86 it grows toward lower addresses, so the address of the stack top is always less than or equal to the address of the stack base. Walking through a function call gives a deeper understanding of the role the stack plays in a program. Different languages use different calling conventions, which differ in the order in which parameters are pushed and in who balances (cleans up) the stack afterwards. The convention used by the Windows API differs from that of ANSI C: with the former, the called function adjusts the stack; with the latter, the caller adjusts the stack. The two are distinguished by the keywords "__stdcall" and "__cdecl". Consider the following code:
#include <stdio.h>

void __stdcall func(int param1, int param2, int param3)
{
    int var1 = param1;
    int var2 = param2;
    int var3 = param3;

    /* Print the memory address of each parameter and local variable
       (the cast assumes a 32-bit build). */
    printf("0x%08x\n", (unsigned int)&param1);
    printf("0x%08x\n", (unsigned int)&param2);
    printf("0x%08x\n", (unsigned int)&param3);
    printf("0x%08x\n", (unsigned int)&var1);
    printf("0x%08x\n", (unsigned int)&var2);
    printf("0x%08x\n", (unsigned int)&var3);
    return;
}

int main(void)
{
    func(1, 2, 3);
    return 0;
}
Compiling and running it prints:
0x0012ff78
0x0012ff7c
0x0012ff80
0x0012ff68
0x0012ff6c
0x0012ff70

┌────────────────────┐ <- stack top (ESP) during function execution; low-end memory area
│       ......       │
├────────────────────┤
│        var1        │
├────────────────────┤
│        var2        │
├────────────────────┤
│        var3        │
├────────────────────┤
│        RET         │
├────────────────────┤ <- stack top (ESP) after a "__cdecl" function returns
│    Parameter 1     │
├────────────────────┤
│    Parameter 2     │
├────────────────────┤
│    Parameter 3     │
├────────────────────┤ <- stack top (ESP) after a "__stdcall" function returns
│       ......       │
└────────────────────┘ <- stack bottom (base address, EBP); high-end memory area

This is roughly what the stack looks like during the function call. First, the three parameters are pushed onto the stack from right to left: "param3" first, then "param2", and finally "param1". Second, the return address (RET) is pushed and execution jumps to the function's address. (A note here: articles on the principle of buffer overflows under UNIX describe one more step at this point, pushing the current EBP and then copying ESP into EBP. One article about function calls on Windows says that Windows does the same, but in my own debugging I did not find this step; this is also visible from the fact that only four bytes, the return address, lie between var3 and param1. This is presumably because an optimized release build can omit the frame pointer.) Third, a number is subtracted from the stack top (ESP) to allocate space for the local variables; in the example above 12 bytes are subtracted (ESP = ESP - 3*4, since each int occupies 4 bytes). Then the local variables are initialized. Because a "__stdcall" function adjusts the stack itself, it must restore the stack before returning: it first reclaims the memory occupied by the local variables (ESP = ESP + 3*4), then pops the return address into the EIP register and reclaims the memory occupied by the previously pushed parameters (ESP = ESP + 3*4), after which the caller's code continues. See the following assembly code:
; -------------- Assembly code of the func function --------------
:00401000 83EC0C      sub esp, 0000000C              ; allocate space for the local variables
:00401003 8B442410    mov eax, dword ptr [esp+10]    ; eax = param1
:00401007 8B4C2414    mov ecx, dword ptr [esp+14]    ; ecx = param2
:0040100B 8B542418    mov edx, dword ptr [esp+18]    ; edx = param3
:0040100F 89442400    mov dword ptr [esp], eax       ; var1 = param1
:00401013 8D442410    lea eax, dword ptr [esp+10]    ; eax = address of param1
:00401017 894C2404    mov dword ptr [esp+04], ecx    ; var2 = param2
........................ (code omitted)
:00401075 83C43C      add esp, 0000003C              ; restore the stack: the local variables (0x0C) plus, presumably, the arguments left by the six printf calls (0x30)
:00401078 C20C00      ret 000C                       ; return and reclaim the 12 bytes occupied by the parameters
                                                     ; with "__cdecl" this would be a plain "ret", and the caller would restore the stack
; -------------- End of function --------------

; -------------- Code in main that calls func --------------
:00401080 6A03        push 00000003                  ; push param3
:00401082 6A02        push 00000002                  ; push param2
:00401084 6A01        push 00000001                  ; push param1
:00401086 E875FFFFFF  call 00401000                  ; call func
                                                     ; with "__cdecl" the caller would restore the stack here with "add esp, 0000000C"
Attentive readers may already see how a buffer overflow works. Consider the following code:
#include <stdio.h>
#include <string.h>

void __stdcall func(void)
{
    char lpBuff[8] = "";
    /* Deliberately unsafe: appends 11 'A's plus '\0' to an 8-byte buffer. */
    strcat(lpBuff, "AAAAAAAAAAA");
    return;
}

int main(void)
{
    func();
    return 0;
}
What happens when this code is compiled and run? Windows reports: the instruction at "0x00414141" referenced memory at "0x00000000"; the memory could not be "read". An illegal operation! "41" is the hexadecimal ASCII code of "A", so the problem obviously lies with strcat. "lpBuff" is only 8 bytes long, and that includes the terminating '\0', so strcat can safely write at most 7 "A"s. The program, however, writes 11 "A"s and one '\0'. Looking back at the stack diagram above, the four extra bytes overwrite the memory that holds RET, so the function returns to a wrong address (0x00414141: three "A"s followed by the '\0', read as a little-endian value) and executes the wrong instructions. If the string is carefully constructed in three parts, with meaningless filler data to cause the overflow first, then a value that overwrites RET, and finally a piece of shellcode, then as long as the RET value points to the first instruction of the shellcode, the function will execute the shellcode when it returns. However, different software versions and runtime environments can change where the shellcode ends up in memory, so constructing this RET value reliably is difficult. For that reason, a large number of NOP instructions are usually placed between RET and the shellcode, which makes the exploit more general.

┌────────────────────┐ <- low-end memory area
│       ......       │
├────────────────────┤ <- start of the data written by the exploit
│                    │
│       Buffer       │ <- filled with meaningless data
│                    │
├────────────────────┤
│        RET         │ <- points to the shellcode or into the NOP range
├────────────────────┤
│        NOP         │
│       ......       │ <- the NOP fill: the range RET can point into
│        NOP         │
├────────────────────┤
│                    │
│     Shellcode      │
│                    │
├────────────────────┤ <- end of the data written by the exploit
│       ......       │
└────────────────────┘ <- high-end memory area
