Designing a Completion-Port Communication Server (IOCP Socket Server) (2): Memory Management (AWE)


 

Copyright 2009 GuestCode (Lu Yigui). All Rights Reserved.

QQ: 48092788 source code blog: http://blog.csdn.net/guestcode

 

Some expert once said that writing a server is all about playing with memory. Think about it: a server's memory requirements are huge, and its demands on how that memory behaves are strict. Achieving a qualitative leap in server performance through memory management is the first problem to solve in server design.

Speaking of memory, I suspect anyone who has just started designing servers will say: isn't it just allocate and free? What's the problem? In terms of the operations involved, there really is nothing more to it. When we obtain memory from the operating system through virtual memory allocation or heap allocation, we tend to assume the memory is ours for good and let the server work in peace. But it is not that simple. Under certain conditions the operating system can requisition the physical memory it has allocated to you: it copies your data out to the page-swap file, then hands the physical pages to another process. Later, when your process touches data in the virtual address range you obtained, the system finds another free physical page (perhaps requisitioned from yet another process), reads your original data back from the page-swap file into the new physical page, and maps that page into the virtual address range you applied for (for details, refer to operating system memory management). This process is quite CPU-consuming and very slow, above all the reading and writing of the swap file on the hard disk. Other causes are beyond the scope of this article; you can learn the principles of operating system memory management from books such as Windows Core Programming, Windows Internals, and general operating systems texts.

We can use the lookaside-list technique to reuse memory that has already been allocated, or call SetProcessWorkingSetSize to ask the operating system not to page our memory out, but each is one more operation. I have never measured how much CPU these operations consume, but from the standpoint of our performance requirements there is not much more to gain from them. The memory management discussed in this article uses AWE (Address Windowing Extensions) to keep the physical memory we request as non-paged memory, which the page-swap file never touches; for more information about AWE, see the books mentioned above. (Below, "memory management" refers only to the memory management module of the application itself, hereinafter the memory manager, not the operating-system facilities above.)
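The AWE machinery itself boils down to a short sequence of Win32 calls. Below is a minimal sketch, assuming Windows and a process token that already holds the SeLockMemoryPrivilege ("Lock pages in memory") right; the page count and the skeletal error handling are illustrative only, not the article's actual code.

```c
#include <windows.h>
#include <string.h>

/* Minimal AWE sketch: reserve a virtual address window, back it with
 * physical pages the pager will never swap out, and map them in.
 * Enabling SeLockMemoryPrivilege on the token is omitted for brevity. */
int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    ULONG_PTR npages = 16;                 /* request 16 physical pages */
    SIZE_T    bytes  = npages * si.dwPageSize;
    ULONG_PTR pfns[16];                    /* page frame numbers */

    /* Reserve the address window; MEM_PHYSICAL marks it for AWE. */
    void *window = VirtualAlloc(NULL, bytes,
                                MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);
    if (!window) return 1;

    /* Allocate physical pages; these never go to the page-swap file. */
    if (!AllocateUserPhysicalPages(GetCurrentProcess(), &npages, pfns))
        return 2;                          /* privilege missing? */

    /* Map the physical pages into the window; the memory is now usable. */
    if (!MapUserPhysicalPages(window, npages, pfns))
        return 3;

    memset(window, 0, bytes);              /* touch it: no paging occurs */

    /* Unmap and free when done. */
    MapUserPhysicalPages(window, npages, NULL);
    FreeUserPhysicalPages(GetCurrentProcess(), &npages, pfns);
    VirtualFree(window, 0, MEM_RELEASE);
    return 0;
}
```

Once mapped, the pages stay resident for the life of the mapping, which is exactly the exemption from page-swap traffic that this article relies on.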

There are two measures of a memory manager's performance: allocation efficiency and release efficiency, that is, the timeliness of the two operations; the shorter the time, the higher the efficiency. In the discussion below, assume the memory manager uses the page as its minimum allocation unit, and that the page size is chosen appropriately.

First, allocation efficiency (the classification below is my own informal one, not a set of academic algorithms):

1. Single linked list

All idle memory blocks (that is, idle memory fragments, hereinafter "idle fragments") are strung into one free-fragment linked list. When an allocation is requested, the list is searched from the head for a fragment of the required size, or the request is split off a larger fragment. This method is simple, but if many idle fragments are smaller than the request, a large number of loop iterations are wasted stepping over them.
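As a concrete illustration, a minimal first-fit allocator over such a free-fragment list might look like this in C (the node layout and names are my own, not from the article):

```c
#include <stddef.h>
#include <stdlib.h>

/* One idle fragment: its start address and size. */
typedef struct FreeNode {
    char            *addr;
    size_t           size;
    struct FreeNode *next;
} FreeNode;

/* First fit: walk the list from the head and take the first fragment that
 * is big enough, splitting it when it is larger than the request.
 * Returns the allocated address, or NULL if no fragment fits. */
char *first_fit_alloc(FreeNode **head, size_t want)
{
    FreeNode **pp = head;
    for (FreeNode *n = *head; n != NULL; pp = &n->next, n = n->next) {
        if (n->size < want)
            continue;               /* too small: this is the wasted loop work */
        char *result = n->addr;
        if (n->size == want) {      /* exact fit: unlink and discard the node */
            *pp = n->next;
            free(n);
        } else {                    /* split: shrink the fragment in place */
            n->addr += want;
            n->size -= want;
        }
        return result;
    }
    return NULL;                    /* every fragment was too small */
}
```

The `continue` line is exactly the cost the article warns about: every fragment smaller than the request burns one loop iteration before a usable one is found.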

A single linked list can be ordered in several ways:

A. by address (ascending);

B. by fragment size, in ascending order;

C. unordered (first in, first out).

2. Multiple linked lists

In fact, a multiple-linked-list scheme is just a set of single linked lists, each ordered as in B above, partitioned into size grades (see the books mentioned above for the relevant algorithms).

Multiple linked lists come in two forms, Type A and Type B.

Type A is laid out as in the following table:

Size class     List     Fragment nodes
0 ~ <4K        ListA    eight nodes
4K ~ <8K       ListB    4K, 6K, 7K, 7K
8K ~ <16K      ListC    (empty)
16K ~ <24K     ListD    16K, 19K, 22K, 23K
...            ...      ...

From the table we can see that the 0~<4K ListA has eight fragment nodes; the 4K~<8K ListB has four fragment nodes: 4K, 6K, 7K and 7K; the 8K~<16K ListC is empty; and the 16K~<24K ListD has four fragment nodes: 16K, 19K, 22K and 23K.

If you want to allocate 5K of memory, you can search directly in ListB (and if ListB is empty, continue to the next larger list), down to the last list, ListN. This avoids traversing ListA entirely; the more fragments ListA holds, the more time is saved.
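The saving can be made concrete with a size-to-grade lookup that decides which list the search starts from, using the class boundaries of the example table (a sketch; names are mine):

```c
#include <stddef.h>

/* Size-class upper bounds from the example table:
 * ListA: [0, 4K)  ListB: [4K, 8K)  ListC: [8K, 16K)  ListD: [16K, 24K) */
static const size_t class_limit[] = { 4096, 8192, 16384, 24576 };
enum { NUM_CLASSES = sizeof class_limit / sizeof class_limit[0] };

/* Return the index of the list whose size class contains `want`, so a 5K
 * request starts at ListB (index 1) and never touches ListA. Oversized
 * requests fall through to NUM_CLASSES, i.e. the last, open-ended list. */
int size_class(size_t want)
{
    for (int i = 0; i < NUM_CLASSES; i++)
        if (want < class_limit[i])
            return i;
    return NUM_CLASSES;
}
```

An allocator then scans only from `size_class(want)` upward, skipping all the small-fragment lists below it.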

 

 

Type B places a mapping table in front of the size-class lists.

The right-hand side needs little explanation; it is basically the same as Type A. What, then, is the mapping table on the left? The mapping table always points at a non-empty list: if ListA is not empty, entry A points to ListA; if ListA is empty, entry A points to ListB, and so on down to the last list, ListN. If you want to allocate 9K of memory, you fetch the list head directly from entry C, which may in fact point all the way to ListN; so when n = 1000+, the time saved over Type A becomes considerable.
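One way to maintain such a mapping table is a single backward pass whenever a list switches between empty and non-empty. A sketch, under the same size classes as above (names are mine):

```c
#include <stddef.h>

#define NLISTS 4

/* lists[i] is the head of size-class list i (NULL when empty);
 * map[i] caches the index of the first non-empty list at or above i,
 * or NLISTS when everything from i upward is empty. */
void rebuild_map(void *lists[NLISTS], int map[NLISTS])
{
    int next = NLISTS;              /* nothing non-empty seen yet */
    for (int i = NLISTS - 1; i >= 0; i--) {
        if (lists[i])
            next = i;               /* list i itself is non-empty */
        map[i] = next;
    }
}
```

A request graded at index i then goes straight to lists[map[i]], skipping every empty list in between, which is where the Type-B saving over Type A comes from.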

Let's look at the release efficiency:

1. Releasing under allocation scheme 1 (single linked list)

Release steps:

A. Search for an idle fragment adjacent to, and at a lower address than, the block being released; if one is found, merge with it and re-sort;

B. Whether or not A succeeded, search for an idle fragment adjacent to, and at a higher address than, the block being released; if one is found, merge with it and re-sort;

C. If neither A nor B succeeded, insert the block into the free-fragment list according to the ordering rule.

From these steps it is clear that release becomes very inefficient once idle fragments are massive in number.

2. Releasing under allocation scheme 2 (multiple linked lists)

A. Search ListA through ListN for an idle fragment adjacent to, and at a lower address than, the block being released; if one is found, merge with it and re-sort;

B. Whether or not A succeeded, search ListA through ListN for an idle fragment adjacent to, and at a higher address than, the block being released; if one is found, merge with it and re-sort;

C. If neither A nor B succeeded, find the size-grade list matching the released block's size and insert it according to the ordering rule.

Here the workload is much larger than under scheme 1.

3. The memory-block linked list method

This list is not the free-fragment list discussed above, but a doubly linked list of all memory blocks, in use and idle alike, sorted by address from low to high. Of course, we cannot re-sort all the blocks at release time; that would be hopelessly inefficient.

Instead, the ordering can be maintained during allocation:

pblock points to an idle fragment.

......

if (pblock->dwsize > dwsize)
{
    // the idle block is larger than the requested size:
    // carve the requested part off and shrink the idle block
    pblock->dwsize -= dwsize;

    // the address handed back to the caller
    result = pblock->paddr;

    // the idle block now starts just past the allocated part
    pblock->paddr = (char *)result + dwsize;

    // take a new node from the node pool
    pgmem_block ptmp;
    ptmp = pmbgmemnodepool;
    pmbgmemnodepool = pmbgmemnodepool->pmbnext;

    // insert the allocated block in front of the idle block
    ptmp->paddr  = result;              // address of the allocated block
    ptmp->dwsize = dwsize;              // size of the allocated block
    ptmp->pmbnext = pblock;             // next pointer: the split idle block
    ptmp->pmbprior = pblock->pmbprior;
    if (ptmp->pmbprior)
        ptmp->pmbprior->pmbnext = ptmp;
    pblock->pmbprior = ptmp;

    ......
}

As the code above shows, a few extra lines at allocation time are enough to keep the list in order, and this small expense buys very high efficiency at release time:

 

Suppose the memory-block list, in address order, is A B C D E F G, and the free-block list holds E, B, G. To release a block, compute (freeaddr - addrhead) / pagesize to locate the page, and hence the block, being released: say it turns out to be F. Through F's prev and next pointers we immediately know its two address-adjacent blocks are E and G, and the flag on each tells us whether it is idle; the three blocks are then merged by ordinary doubly-linked-list operations. Release and merge are thus completed without any traversal at all. Of course, if neither neighbour is idle, the block is simply inserted into the free list according to the ordering rule. Arguably there is no better algorithm for releasing and merging memory than this.
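The scheme just described can be sketched in C as follows, with a page-to-block table standing in for the (freeaddr - addrhead) / pagesize lookup; the node layout, names and sizes are mine, not the article's:

```c
#include <stddef.h>

#define PAGESIZE 4096u
#define NPAGES   64

/* One node per block, chained in address order.
 * `used` is the flag checked to see whether a neighbour is idle. */
typedef struct Block {
    size_t        first_page;   /* index of the block's first page     */
    size_t        npages;       /* block length in pages               */
    int           used;         /* 1 = allocated, 0 = idle             */
    struct Block *prev, *next;  /* address-ordered doubly linked list  */
} Block;

/* page_owner[p] maps page p straight to its block node, so locating the
 * released block costs one division and no traversal. */
Block *find_block(Block *page_owner[NPAGES], char *addr_head, char *free_addr)
{
    return page_owner[(size_t)(free_addr - addr_head) / PAGESIZE];
}

/* Release `b` and merge it with idle neighbours in O(1). */
void release_block(Block *b, Block *page_owner[NPAGES])
{
    b->used = 0;
    if (b->next && !b->next->used) {        /* absorb the next block */
        Block *n = b->next;
        b->npages += n->npages;
        b->next = n->next;
        if (b->next) b->next->prev = b;
    }
    if (b->prev && !b->prev->used) {        /* fold into the previous block */
        Block *p = b->prev;
        p->npages += b->npages;
        p->next = b->next;
        if (p->next) p->next->prev = p;
        b = p;
    }
    for (size_t i = 0; i < b->npages; i++)  /* refresh the page->block map */
        page_owner[b->first_page + i] = b;
}
```

In the article's E-F-G example, releasing F absorbs G and then folds into E, leaving one idle block spanning all three.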

With the methods above introduced, it should be clear what to do: combine allocation scheme 2 (multiple linked lists) with release scheme 3 (the memory-block linked list), and the memory manager ought to be first-rate. So we start coding... and run into a headache: even if releases are handled by a dedicated thread, keeping the idle memory blocks sorted still consumes a great deal of time. At this point I could not help asking: what are we actually building? Do we need a memory manager as powerful as an operating system's? No, we are not writing an operating system. What else can we do to make memory management simpler?

Now consider the dynamic memory demand profile of an application server, classified by type (A through G), with a size and a request frequency for each type:

 

Looking at that profile, I finally realized that types F and G account for the most memory, yet their demand is easy to satisfy. With memory to spare, I set the memory manager's page size to the type-B size; then, even without keeping the idle memory blocks sorted, allocation efficiency is extremely high. But how do we handle type D, the most frequently requested size? Any idea that a single memory manager can make the whole server fly is wrong: memory demands must be categorized. For the 4K type-D demand, a fixed-size memory pool solves the problem, provided every request of that type is served from that pool; the buffers for socket I/O operations are one example (they will be introduced later).
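For the fixed 4K type-D demand, the pool can be as simple as a free list threaded through the idle buffers themselves, giving O(1) allocation and release with no merging or sorting at all. A minimal sketch (sizes and names are illustrative):

```c
#include <stddef.h>

#define BUF_SIZE  4096u   /* one socket I/O buffer, per the type-D demand */
#define POOL_BUFS 8       /* pool capacity; a real server sizes this to load */

typedef struct Pool {
    char  store[POOL_BUFS][BUF_SIZE];
    void *free_head;      /* singly linked free list threaded through buffers */
} Pool;

/* Thread the free list through the buffers: the first bytes of each idle
 * buffer hold the pointer to the next idle buffer. */
void pool_init(Pool *p)
{
    p->free_head = NULL;
    for (int i = POOL_BUFS - 1; i >= 0; i--) {
        *(void **)p->store[i] = p->free_head;
        p->free_head = p->store[i];
    }
}

/* O(1) allocate: pop the free-list head. NULL when the pool is exhausted. */
void *pool_alloc(Pool *p)
{
    void *buf = p->free_head;
    if (buf)
        p->free_head = *(void **)buf;
    return buf;
}

/* O(1) release: push the buffer back; no merging or sorting is ever needed. */
void pool_free(Pool *p, void *buf)
{
    *(void **)buf = p->free_head;
    p->free_head = buf;
}
```

Because every buffer has the same size, release never needs the neighbour-merging machinery above; pushing the buffer back onto the list is the whole job.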

Based on the discussion above, allocation scheme 1 with ordering C (unordered, first in first out), combined with release scheme 3 (the memory-block linked list), can meet the dynamic allocation needs of variable-sized memory and, to a reasonable extent, the performance requirements. Ordering B of scheme 2 is not really necessary, and it consumes more CPU.

There is no absolute method for memory management; the point of the discussion above is simply to design the memory manager around the actual memory demands.

Example program:

 

 

 

Source code description:

GMem.cpp and GMem.h are the source files of the memory management unit. For testing purposes the procedure source code has not been optimized; the test data provided is for reference only and carries no substantive significance.

 

Download the example source code:

http://download.csdn.net/source/1607811

 

This article is from the CSDN blog. When reproducing it, please indicate the source: http://blog.csdn.net/GuestCode/archive/2009/08/27/4488402.aspx
