Memory structure of a Windows process (many APIs, and VC is the smartest)

Source: Internet
Author: User

If you do not yet know what a stack is, read the "Basic knowledge" section at the end of this article first.

Anyone who has programmed knows that high-level languages access data in memory through variable names. But how are these variables laid out in memory, and how does a program use them? This article looks at both questions in depth. Unless stated otherwise, the C code below is compiled as a release build with VC.

First, let's see how C variables are distributed in memory. C has global variables, local variables, static variables, and register variables, and each kind is allocated differently. Look at the following code:

#include <stdio.h>

int g1=0, g2=0, g3=0;

int main ()
{
    static int s1=0, s2=0, s3=0;
    int v1=0, v2=0, v3=0;

    // Print out the memory address of each variable

    printf ("0x%08x\n", &v1);      // Print the memory address of each local variable
    printf ("0x%08x\n", &v2);
    printf ("0x%08x\n\n", &v3);
    printf ("0x%08x\n", &g1);      // Print the memory address of each global variable
    printf ("0x%08x\n", &g2);
    printf ("0x%08x\n\n", &g3);
    printf ("0x%08x\n", &s1);      // Print the memory address of each static variable
    printf ("0x%08x\n", &s2);
    printf ("0x%08x\n\n", &s3);
    return 0;
}

Compiling and running it prints:

0x0012ff78
0x0012ff7c
0x0012ff80

0x004068d0
0x004068d4
0x004068d8

0x004068dc
0x004068e0
0x004068e4

The output is the memory address of each variable: v1, v2, v3 are local variables, g1, g2, g3 are global variables, and s1, s2, s3 are static variables. Within each group the variables are laid out contiguously, but the addresses of the local variables are very far from those of the global variables, while the global and static variables sit right next to each other. This is because local variables and global/static variables are allocated in different kinds of memory regions.

The memory space of a process can logically be divided into three parts: the code area, the static data area, and the dynamic data area. The dynamic data area usually means the stack and the heap, which are two different dynamic data areas: the stack is a linear structure and the heap is a linked structure. Each thread of a process has its own private stack, so although the threads run the same code, their local variables do not interfere with one another. A stack can be described by its base address and its stack-top address. Global and static variables are allocated in the static data area, and local variables are allocated in the dynamic data area, that is, on the stack. The program accesses local variables through the stack's base address plus an offset.


├───────────────────┤ low memory addresses
│        ...        │
├───────────────────┤
│ Dynamic data area │
├───────────────────┤
│        ...        │
├───────────────────┤
│     Code area     │
├───────────────────┤
│ Static data area  │
├───────────────────┤
│        ...        │
├───────────────────┤ high memory addresses


A stack is a first-in, last-out data structure, and the stack-top address is always less than or equal to the stack's base address. Walking through a function call gives a deeper understanding of the role the stack plays in a program. Different languages use different function calling conventions, which differ in the order in which parameters are pushed and in who balances the stack. The calling convention of the Windows API differs from that of ANSI C: with the former the called function adjusts the stack, and with the latter the caller adjusts it. The two are distinguished by the "__stdcall" and "__cdecl" prefixes. Look at the following code:

#include <stdio.h>

void __stdcall func (int param1,int param2,int param3)
{
    int var1=param1;
    int var2=param2;
    int var3=param3;
    printf ("0x%08x\n", &param1);    // Print out the memory address of each variable
    printf ("0x%08x\n", &param2);
    printf ("0x%08x\n\n", &param3);
    printf ("0x%08x\n", &var1);
    printf ("0x%08x\n", &var2);
    printf ("0x%08x\n\n", &var3);
    return;
}

int main ()
{
    func (1,2,3);
    return 0;
}

Compiling and running it prints:

0x0012ff78
0x0012ff7c
0x0012ff80

0x0012ff68
0x0012ff6c
0x0012ff70



├──────────────┤ <- top of the stack (ESP) while the function runs, low memory addresses
│     ...      │
├──────────────┤
│     var1     │
├──────────────┤
│     var2     │
├──────────────┤
│     var3     │
├──────────────┤
│     ret      │
├──────────────┤ <- top of the stack (ESP) after a "__cdecl" function returns
│    param1    │
├──────────────┤
│    param2    │
├──────────────┤
│    param3    │
├──────────────┤ <- top of the stack (ESP) after a "__stdcall" function returns
│     ...      │
├──────────────┤ <- bottom of the stack (base address, EBP), high memory addresses


The figure above shows what the stack looks like during the function call. First, the three parameters are pushed onto the stack from right to left: "param3" first, then "param2", then "param1". Second, the function's return address (RET) is pushed and execution jumps to the function's address. (A side note: articles describing buffer overflows under Unix say that after RET is pushed, the current EBP is pushed as well and EBP is then replaced with the current ESP. Some articles about function calls under Windows describe the same step, but I did not find it in my actual debugging, probably because the release build omits the frame pointer. This also shows in the gap between var3 and param1, which is only 4 bytes.) Third, a number is subtracted from the stack top (ESP) to allocate memory for the local variables, 12 bytes in the example above (ESP=ESP-3*4, 4 bytes per int variable), and the memory of the local variables is then initialized. Because with "__stdcall" the called function adjusts the stack, the function restores the stack before it returns: it first reclaims the memory occupied by the local variables (ESP=ESP+3*4), then pops the return address into the EIP register, and finally reclaims the memory of the parameters pushed earlier (ESP=ESP+3*4), after which the caller's code continues. See the following assembly code:

;--------------assembly code of the func function-------------------

:00401000 83EC0C        sub esp, 0000000C             ; allocate memory for the local variables
:00401003 8B442410      mov eax, dword ptr [esp+10]
:00401007 8B4C2414      mov ecx, dword ptr [esp+14]
:0040100B 8B542418      mov edx, dword ptr [esp+18]
:0040100F 89442400      mov dword ptr [esp], eax
:00401013 8D442410      lea eax, dword ptr [esp+10]
:00401017 894C2404      mov dword ptr [esp+04], ecx

........................ (some code omitted)

:00401075 83C43C        add esp, 0000003C             ; restore the stack, reclaiming the memory of the local variables
                                                      ; (0x3C rather than 0x0C, presumably because it also discards
                                                      ; the arguments pushed for the omitted printf calls)
:00401078 C20C00        ret 000C                      ; return, reclaiming the memory occupied by the parameters
                                                      ; with "__cdecl" this would be a plain "ret" and the caller would restore the stack

;-------------------end of function-------------------------


;--------------code in the main program that calls func--------------

:00401080 6A03          push 00000003                 ; push parameter param3
:00401082 6A02          push 00000002                 ; push parameter param2
:00401084 6A01          push 00000001                 ; push parameter param1
:00401086 E875FFFFFF    call 00401000                 ; call func
                                                      ; with "__cdecl" the stack would be restored here with "add esp, 0000000C"

Astute readers who have followed this far will have almost worked out how a buffer overflow works. First look at the following code:

#include <stdio.h>
#include <string.h>

void __stdcall func ()
{
    char lpbuff[8]="\0";
    strcat (lpbuff, "AAAAAAAAAAA");    // deliberate overflow: 11 'A' plus the terminator into 8 bytes
    return;
}

int main ()
{
    func ();
    return 0;
}

What happens when you compile and run it? Ha: "The instruction at 0x00414141 referenced memory at 0x00000000. The memory could not be read." An illegal operation! "41" is the hexadecimal ASCII code of "A", so the problem clearly lies in the strcat call. "lpbuff" is only 8 bytes long and must end with '\0', so strcat can safely write at most 7 "A"s, but the program actually wrote 11 "A"s plus one '\0'. Looking back at the stack figure above, the 4 extra bytes land exactly on the memory that holds RET, so the function returns to a wrong memory address and executes the wrong instructions.

Suppose the string is constructed carefully in three parts: first meaningless filler whose only purpose is to cause the overflow, then the data that overwrites RET, and finally a piece of shellcode. As long as the RET value points to the first instruction of the shellcode, the shellcode runs when the function returns. However, different software versions and different runtime environments affect where the shellcode ends up in memory, so constructing this RET value is very difficult. A large number of NOP instructions is therefore usually placed between RET and the shellcode, which makes the exploit more general.


├──────────────┤ <- low memory addresses
│     ...      │
├──────────────┤ <- start of the data written by the exploit
│              │
│    buffer    │ <- filled with meaningless data
│              │
├──────────────┤
│     ret      │ <- points to the shellcode or into the range of NOP instructions
├──────────────┤
│     nop      │
│     ...      │ <- NOP padding, the range RET may point into
│     nop      │
├──────────────┤
│              │
│  shellcode   │
│              │
├──────────────┤ <- end of the data written by the exploit
│     ...      │
├──────────────┤ <- high memory addresses


Under Windows, dynamic data can be stored in the heap as well as on the stack. Readers who know C++ know that it can allocate memory dynamically with the new keyword. Look at the following C++ code:

#include <stdio.h>
#include <windows.h>

void Func ()
{
    char *buffer=new char[128];
    char bufflocal[128];
    static char buffstatic[128];
    printf ("0x%08x\n", buffer);        // Print the memory address of a variable in the heap
    printf ("0x%08x\n", bufflocal);     // Print the memory address of a local variable
    printf ("0x%08x\n", buffstatic);    // Print the memory address of a static variable
    delete[] buffer;                    // Release the heap block again
}

int main ()
{
    Func ();
    return 0;
}

The results of the program execution are:

0x004107d0
0x0012ff04
0x004068c0

As you can see, memory allocated with the new keyword lies neither on the stack nor in the static data area. The VC compiler implements the dynamic allocation behind new on top of the Windows heap. Before discussing the heap, let's look at several heap-related API functions:

HeapAlloc          Requests memory space from a heap
HeapCreate         Creates a new heap object
HeapDestroy        Destroys a heap object
HeapFree           Releases the requested memory
HeapWalk           Enumerates all memory blocks of a heap object
GetProcessHeap     Gets the default heap object of the process
GetProcessHeaps    Gets all heap objects of the process
LocalAlloc
GlobalAlloc
When a process is initialized, the system automatically creates a default heap for it, whose default size is 1 MB. Heap objects are managed by the system and exist in memory as a linked structure. The following code requests memory dynamically from the heap:

HANDLE hHeap=GetProcessHeap ();
char *buff=HeapAlloc (hHeap,0,8);

Here hHeap is the handle of the heap object, and buff points to the allocated memory. But what exactly is hHeap? Does its value mean anything? Look at the following code:

#pragma comment (linker, "/entry:main")    // define the entry point of the program
#include <windows.h>

_CRTIMP int (__cdecl *printf) (const char *, ...);    // declare the C runtime function printf as a function pointer
/*---------------------------------------------------------------------------
While we are here, a quick review of the earlier material:
(*Note) printf is a function of the C standard library, and VC's standard library is implemented by the msvcrt.dll module.
As its declaration shows, printf takes a variable number of arguments. The function cannot know in advance how many arguments the caller pushed; it can only work that out by parsing the format string in its first argument. Because the number of arguments is dynamic, the caller must balance the stack, so the __cdecl calling convention is used. By the way, almost all Windows API functions use __stdcall; the one exception is wsprintf, which uses __cdecl, like printf, because it also takes a variable number of arguments.
---------------------------------------------------------------------------*/
void main ()
{
    HANDLE hHeap=GetProcessHeap ();
    char *buff=HeapAlloc (hHeap,0,0x10);
    char *buff2=HeapAlloc (hHeap,0,0x10);
    HMODULE hMsvcrt=LoadLibrary ("msvcrt.dll");
    printf= (void *) GetProcAddress (hMsvcrt, "printf");
    printf ("0x%08x\n", hHeap);
    printf ("0x%08x\n", buff);
    printf ("0x%08x\n\n", buff2);
}

The result of the execution is:

0x00130000
0x00133100
0x00133118

Why is the value of hHeap so close to the value of buff? In fact the handle hHeap is simply the address of the heap header. In the user-mode area of the process there is a structure called the PEB (Process Environment Block) that holds important information about the process. The ProcessHeap field, at offset 0x18 from the start of the PEB, holds the address of the process default heap, and offset 0x90 holds a pointer to the list of addresses of all heaps in the process. Many Windows APIs use the process default heap for dynamic data; under Windows 2000, for example, all ANSI versions of the API functions allocate memory in the default heap to convert ANSI strings to Unicode strings. Access to a heap is serialized: only one thread can access the data in a heap at a time, and when several threads need access at once they have to wait in a queue, which lowers the program's efficiency.

Finally, let's talk about data alignment in memory. Data alignment means that the memory address of a datum must be an integer multiple of its length: the start address of a DWORD is divisible by 4 and the start address of a WORD is divisible by 2. The x86 CPU can access aligned data directly. When it accesses unaligned data it makes a series of internal adjustments, which are transparent to the program but slow it down, so compilers try to keep data aligned when compiling a program. Let's run the same piece of code compiled with three different compilers, VC, Dev-C++, and LCC, and compare the results:

#include <stdio.h>

int main ()
{
    int a;
    char b;
    int c;
    printf ("0x%08x\n", &a);
    printf ("0x%08x\n", &b);
    printf ("0x%08x\n", &c);
    return 0;
}

Compiled with VC, the program prints:
0x0012ff7c
0x0012ff7b
0x0012ff80
Order of the variables in memory: b (1 byte), a (4 bytes), c (4 bytes).

Compiled with Dev-C++, the program prints:
0x0022ff7c
0x0022ff7b
0x0022ff74
Order of the variables in memory: c (4 bytes), a 3-byte gap, b (1 byte), a (4 bytes).

Compiled with LCC, the program prints:
0x0012ff6c
0x0012ff6b
0x0012ff64
Order of the variables in memory: same as above.

All three compilers align the data, but the latter two are evidently not as "smart" as VC: they let a single char take up 4 bytes, which wastes memory.


Basic knowledge:
A stack is a simple data structure: a linear list that only allows insertion and deletion at one end. The end where insertion and deletion are allowed is called the top of the stack and the other end is called the bottom; inserting into and deleting from the stack are called pushing and popping. A set of CPU instructions implements stack-style access to the memory of a process: the pop instruction implements popping and the push instruction implements pushing. The CPU's ESP register holds the stack-top pointer of the current thread, and the EBP register holds its stack-bottom pointer. The CPU's EIP register holds the memory address of the next instruction: when the CPU finishes the current instruction, it reads the address of the next instruction from EIP and continues executing there.



Http://www.cnblogs.com/qiubole/archive/2008/03/07/1094765.html
