[Reprint] A Programmer's Understanding of Memory

Source: Internet
Author: User

In C and C++ development, pointers and memory have always been a focus of learning. As a low- to mid-level language, C allows a great deal of direct memory manipulation. This maximizes the flexibility of programs, but it also paves the way for bugs.

Therefore, we must have a clear understanding of memory in all cases.

I. Memory Allocation

A 32-bit operating system supports a continuously addressable 4 GB memory space, but it usually divides that space into two 2 GB halves: each process can use at most 2 GB of private memory (0x00000000-0x7FFFFFFF) at runtime. In theory, a process could therefore hold an array as large as the following:

char szBuffer[2*1024*1024*1024];

Of course, in practice a program also has a code segment, temporary variable segments, and dynamic memory allocations, so such a large array is actually impossible to use.

The upper 2 GB of the address space (0x80000000-0xFFFFFFFF) is generally reserved for the operating system itself, that is, for operating system kernel code. On Windows and Linux, some dynamic link libraries (DLLs on Windows, .so files on Linux) and OCX controls also run in the upper 2 GB, because they provide cross-process services.

We can see that each process sees its own 2 GB of memory plus the system's 2 GB, but different processes cannot see each other. Of course, the operating system does a lot of work underneath, such as swapping virtual memory to disk (see the next section), dynamically mapping different memory blocks, and so on.
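
To make this layout concrete, here is a minimal sketch (my addition, not from the original article) that prints the addresses of a global variable, a local variable, and a heap allocation. The exact values depend on the platform and on address space layout randomization, but the three usually fall in clearly separate regions.

#include <stdio.h>
#include <stdlib.h>

int g_nGlobal = 0;                        // static storage area

int main(void)
{
    int nLocal = 0;                       // local variable, lives on the stack
    char* pHeap = (char*)malloc(16);      // dynamically allocated, lives on the heap

    printf("global: %p\n", (void*)&g_nGlobal);
    printf("local : %p\n", (void*)&nLocal);
    printf("heap  : %p\n", (void*)pHeap);

    free(pHeap);
    return 0;
}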

II. Virtual Memory

The basic idea of virtual memory is to use cheap but slow disk to extend fast but expensive memory. At any given point in time, only the parts of virtual memory that the program actually needs are loaded into physical memory. When data in physical memory has not been used for a while, it may be swapped out to the hard disk, and the physical memory freed up is used to load other data.

During execution, the operating system takes care of the details so that each process appears to have exclusive access to the entire address space. This illusion is achieved through "virtual memory". All processes share the machine's physical memory, and when memory runs out, data is stored on disk; as the process runs, data is moved back and forth between disk and memory. The memory management hardware translates virtual addresses into physical addresses and keeps the running process in the machine's real memory. Application programmers only ever see virtual addresses and never know that their process is being switched back and forth between disk and memory.

Potentially, all of the memory belonging to a process may be reclaimed by the system. If a process is not going to run for a while (it may have a low priority, or it may be sleeping), the operating system can temporarily take back all of the physical memory allocated to it and back up the process's state to disk.

A process can only operate on pages that are in physical memory. When a process references a page that is not in physical memory, the MMU generates a page fault. The kernel responds to the fault and decides whether the reference is valid. If it is invalid, the kernel sends a "segmentation violation" signal to the process. If it is valid, the kernel retrieves the page from disk and brings it into memory; once the page is in memory, the process is unblocked and can run again. The process itself never knows that it had to wait for a page to be swapped in.
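
As a rough illustration of paging in action, the following sketch (my addition, assuming a Linux/POSIX system where getrusage reports page fault counts) allocates a large heap buffer and touches every page; the minor fault counter grows because pages are only mapped into physical memory when they are first accessed.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

// Minor page faults are resolved without disk I/O; major faults need the disk.
static long MinorFaults(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;
}

int main(void)
{
    long nBefore = MinorFaults();

    // 64 MB of heap memory; the pages are mapped in lazily.
    size_t nSize = 64 * 1024 * 1024;
    char* pBuff = (char*)malloc(nSize);
    if (pBuff == NULL)
        return 1;
    memset(pBuff, 0, nSize);      // touching every page triggers page faults

    long nAfter = MinorFaults();
    printf("minor page faults caused by touching the buffer: %ld\n", nAfter - nBefore);

    free(pBuff);
    return 0;
}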

III. Memory Usage

For programmers, the most important thing is to understand what the private memory space of each process means. C and C++ compilers divide this private memory into three parts: the base stack, the floating stack, and the heap. Specifically:

(1) Base stack: also called the static storage area. This is memory whose use the compiler fixes at compile time, such as the program's code segment, static variables, global variables, and const constants.

(2) Floating stack: what many books simply call "the stack". Once the program starts running, as functions are executed and objects are created, the local variables of functions and the member variables of objects dynamically occupy memory. A floating stack entry has a lifecycle: when a function returns or an object is destructed, the corresponding floating stack space is released. Because this part of memory changes constantly and its usage is not fixed, it is called a floating stack.

(3) Heap: both C and C++ support dynamic memory allocation, that is, memory can be requested freely at runtime. The heap sits at the top of the 2 GB space and is allocated from top to bottom, which keeps it from getting mixed up with the floating stack and becoming difficult to manage. Both malloc and new request memory from the heap. Compared with malloc, new has additional support for objects: it automatically calls the constructor, and the member variables of an object created with new are located in the heap.

Let's look at an example:

const int n = 100;

void Func(void)
{
    char ch = 0;
    char* pBuff = (char*)malloc(10);
    //…
}

When this function runs, n is a global constant located in the base stack (static storage area); ch and pBuff are local variables located on the floating stack; and the memory allocated by malloc and pointed to by pBuff is located in the heap.
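
To illustrate the difference between malloc and new mentioned above, here is a small sketch (CPoint is a hypothetical class used only for this illustration): malloc only reserves raw bytes, while new also calls the constructor.

#include <cstdlib>
#include <iostream>

class CPoint
{
public:
    CPoint() : m_x(0), m_y(0) { std::cout << "constructor called" << std::endl; }
    int m_x;
    int m_y;
};

int main()
{
    // malloc only reserves raw bytes from the heap; the constructor is NOT
    // called, so m_x and m_y hold indeterminate values.
    CPoint* pRaw = (CPoint*)malloc(sizeof(CPoint));

    // new also allocates from the heap, but it calls the constructor, so the
    // member variables are initialized.
    CPoint* pObj = new CPoint();

    free(pRaw);     // memory from malloc is returned with free
    delete pObj;    // memory from new is returned with delete (destructor runs)
    return 0;
}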

When it comes to understanding memory, the best-known example is passing parameters when starting a thread.

When a function starts a thread, it usually needs to pass parameters to that thread. But the thread starts asynchronously: it is quite possible that the starting function has already returned before the thread function has actually begun to run. Therefore, the local variables of the starting function cannot be used to pass parameters to the thread. The reason is simple: the function's local variables live on the floating stack, and when the function returns, that floating stack space is automatically released. By the time the thread starts and dereferences the parameter pointer it was given, it is reading an invalid memory area, and the program crashes.

What should we do instead? Use malloc to allocate a block of memory for the parameters to be passed, hand that pointer to the thread, let the thread use it after receiving it, and free it when the thread exits.

Let's look at the example:

// This struct is the parameter block
typedef struct _SCListenAcceptTaskParam_
{
    LINUX_WIN_SOCKET m_nSocket;
    // other parameters...
} SCListenAcceptTaskParam;

// A habitual idiom: right after defining the struct, declare its size as a
// constant, which is convenient for the malloc call later.
const ULONG sCListenAcceptTaskParamSize = sizeof(SCListenAcceptTaskParam);

// A connection request has been received: allocate the parameter area and put
// the key information into it so the thread can work with it later.
bool CListen::ListenTaskCallback(void* pCallParam, int& nStatus)
{
    // normal function logic...

    // Assume s is the socket returned by accept and must be passed on to the
    // subsequent thread work. Prepare a parameter area by allocating it from the heap.
    SCListenAcceptTaskParam* pParam =
        (SCListenAcceptTaskParam*)malloc(sCListenAcceptTaskParamSize);

    // Fill the parameter area.
    pParam->m_nSocket = s;

    // Start the thread here and pass pParam to it...

    // normal function logic...
}

// This is the thread function that processes the socket from the accept above.
bool CListen::ListenAcceptTask(void* pCallParam, int& nStatus)
{
    // The first statement casts the pointer back to obtain the parameter area
    // passed in from outside.
    SCListenAcceptTaskParam* pParam = (SCListenAcceptTaskParam*)pCallParam;

    // normal function logic...

    // Before exiting, make sure no resources are leaked.
    close(pParam->m_nSocket);   // close the socket
    free(pCallParam);           // free the parameter area
    // ...
}
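
The class and socket types above come from the author's own codebase, so the fragment is not compilable on its own. As a self-contained sketch of the same pattern (my own example, using POSIX threads and a plain int instead of a socket), the heap-allocated parameter block outlives the starting function and is freed by the thread that consumes it:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical parameter block; a real server would carry a socket here.
typedef struct _SThreadParam_
{
    int m_nValue;
} SThreadParam;

// Thread function: cast the heap pointer back, use it, then free it.
static void* WorkerThread(void* pArg)
{
    SThreadParam* pParam = (SThreadParam*)pArg;
    printf("worker received %d\n", pParam->m_nValue);
    free(pParam);               // the thread owns the block and releases it
    return NULL;
}

int main(void)
{
    // Allocate the parameter block on the heap, never on the caller's stack.
    SThreadParam* pParam = (SThreadParam*)malloc(sizeof(SThreadParam));
    if (pParam == NULL)
        return 1;
    pParam->m_nValue = 42;

    pthread_t tid;
    if (pthread_create(&tid, NULL, WorkerThread, pParam) != 0)
    {
        free(pParam);
        return 1;
    }
    pthread_join(tid, NULL);    // wait so main does not exit before the worker
    return 0;
}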

IV. Memory Bugs

Undisciplined misuse of memory and pointers leads to large numbers of bugs. Programmers should stay highly alert to how memory is used and treat memory resources with caution.

The most common memory bugs are:

(1) Bad pointer values: dereferencing a pointer before it has been assigned, passing a bad pointer to a library function, or accessing memory through a pointer after it has been freed. A good habit is to set the pointer to NULL immediately after freeing it:

free(p);
p = NULL;

This way, if the pointer is mistakenly used again after it has been freed, the program will at least be able to dump useful information before it terminates.

(2) Overwrite errors: writing past the boundary of an array, writing past either end of a dynamically allocated block, or overwriting the heap management data structures (it is easy to corrupt the area just before a dynamically allocated block):

p = malloc(256);
p[-1] = 0;
p[256] = 0;

(3) Pointer release errors: freeing the same block twice, freeing memory that was not allocated with malloc, freeing memory that is still in use, or freeing an invalid pointer. An extremely common mistake is to iterate over a linked list with a loop such as for (p = start; p; p = p->next) and call free(p) inside the loop body. On the next iteration the program dereferences the pointer that has just been freed, with unpredictable results.

We should iterate like this instead:

struct node *p, *start, *temp;

for (p = start; p; p = temp)
{
    temp = p->next;
    free(p);
}

Summary: I have been collecting these notes while working on my recent book. Much of it may be my personal opinion; comments and criticism are welcome.
