During this period the project ran into a memory fragmentation problem. On a dual-core machine with 4 GB of RAM, the process's virtual memory usage was clearly still modest, measured only in MB, yet new was throwing bad_alloc. Looking at the process in Process Explorer showed that, although the virtual memory in use was small, the allocated address space was nearly 2 GB. I then used VMMap to examine the process's address space layout and found that heap memory accounted for a very large share of it, with committed blocks sandwiched between reserved regions throughout the address space. This is the classic signature of severe fragmentation of the address space.

The usual remedy for memory fragmentation is a memory pool. The simplest memory pool pre-allocates one large contiguous region and splits it into small fixed-size blocks of the sizes allocated most frequently. However, because the project uses several large and complex open-source libraries, tracking down all the relevant allocation code would have cost far too much time and effort, so I decided to start from the runtime library's standard memory allocation routines instead.

My original plan was to implement a memory pool and then replace the CRT's standard allocation routines with it. Reading the CRT memory code, however, I found that malloc is actually implemented on top of the Windows heap API. The Windows heap architecture is divided into a front end and a back end, and the front-end heap comes in two flavors: the high-performance LAL (look-aside list) heap and the LFH heap. As the name suggests, the LFH is the Low Fragmentation Heap, and it can effectively reduce fragmentation when small memory blocks are allocated frequently. The heap used by the CRT defaults to the LAL front end, so I gave it a try: I switched the CRT heap to the LFH and ran the program again. To my surprise, utilization of the virtual address space rose to nearly 100%. Where the process could previously hold only 500 sessions, it can now hold 5000, which meets the project's requirements. Below is the simple piece of code I wrote to enable the LFH heap type:
#include <windows.h>
#include <malloc.h>   // _get_heap_handle

// Switch every heap in the process (including the CRT heap) to the
// Low Fragmentation Heap (LFH) front end.
bool SetLowFragmentationHeaps()
{
    enum { LFHHeap = 2 };   // heap compatibility value that selects the LFH

    DWORD dwHeapNum = GetProcessHeaps(0, NULL);
    HANDLE* pHeapHandles = new HANDLE[dwHeapNum];
    DWORD dwNum = GetProcessHeaps(dwHeapNum, pHeapHandles);

    ULONG lHeapType = LFHHeap;
    bool bSuccess = true;
    HANDLE hCrtHeap = (HANDLE)_get_heap_handle();

    while (dwNum--)
    {
        BOOL bRet = HeapSetInformation(pHeapHandles[dwNum],
                                       HeapCompatibilityInformation,
                                       &lHeapType, sizeof(lHeapType));
        // The CRT heap is the one that matters: it must be set to the LFH heap!
        if (!bRet && hCrtHeap == pHeapHandles[dwNum])
        {
            bSuccess = false;
            break;
        }
    }

    delete[] pHeapHandles;
    return bSuccess;
}
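The function is meant to be called once, as early as possible at startup, before the heaps see heavy use. The snippet below is a minimal sketch of such a call site; the main() shown here is hypothetical rather than the project's actual entry point, and the read-back with HeapQueryInformation is just an optional sanity check (a compatibility value of 2 means the LFH front end is active):

#include <windows.h>
#include <malloc.h>   // _get_heap_handle
#include <stdio.h>

bool SetLowFragmentationHeaps();   // the function shown above

int main()
{
    // Switch the heaps over before the program starts allocating heavily.
    if (!SetLowFragmentationHeaps())
        printf("warning: could not switch the CRT heap to the LFH\n");

    // Optional sanity check: read back the CRT heap's compatibility value.
    ULONG heapType = 0;
    if (HeapQueryInformation((HANDLE)_get_heap_handle(),
                             HeapCompatibilityInformation,
                             &heapType, sizeof(heapType), NULL))
        printf("CRT heap front-end type: %lu\n", heapType);

    // ... the rest of the application runs with the LFH enabled ...
    return 0;
}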
Finally, a few words on custom memory pools. Many people (myself included) have long assumed that efficient memory allocation requires implementing a memory pool ourselves, subconsciously believing that the facilities the system provides cannot meet our performance requirements. Yet in every situation I have encountered, the system's facilities (whether a lock, a memory pool, or something else) have performed as well as or better than our supposedly efficient hand-rolled implementations. In most cases, system facilities that have been through hundreds of rounds of optimization are more than enough for our needs. So when the system already gives us an efficient and robust implementation, why reinvent the wheel?
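For comparison, the "wheel" in question, the simplest kind of fixed-size-block pool described at the beginning, looks roughly like the sketch below. The class name and all details are hypothetical; it is only meant to illustrate the idea of carving one large pre-allocated region into equal blocks threaded onto a free list, and it ignores alignment and thread-safety concerns that a real pool would have to handle:

#include <cassert>
#include <cstddef>
#include <vector>

// A minimal fixed-size-block pool: one large contiguous allocation is cut
// into equal blocks, and free blocks are kept on a singly linked free list.
class FixedBlockPool
{
public:
    FixedBlockPool(std::size_t blockSize, std::size_t blockCount)
        : storage_(blockSize * blockCount), freeList_(NULL)
    {
        assert(blockSize >= sizeof(Node));   // each block must hold a list node
        // Thread every block onto the free list.
        for (std::size_t i = 0; i < blockCount; ++i)
        {
            Node* node = reinterpret_cast<Node*>(&storage_[i * blockSize]);
            node->next = freeList_;
            freeList_ = node;
        }
    }

    void* Allocate()
    {
        if (!freeList_)
            return NULL;                     // pool exhausted
        Node* node = freeList_;
        freeList_ = node->next;
        return node;
    }

    void Free(void* p)
    {
        Node* node = static_cast<Node*>(p);
        node->next = freeList_;
        freeList_ = node;
    }

private:
    struct Node { Node* next; };

    std::vector<char> storage_;   // the single large pre-allocated region
    Node*             freeList_;  // head of the free-block list
};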