Understanding the memory storage of memcached
This is the second part of the series "A Comprehensive Analysis of memcached".
- Release Date: 2008/7/9
- Author: Toru Maesaka
- Original link: http://gihyo.jp/dev/feature/01/memcached/0002
I am Maesaka, from the Research and Development Group at mixi, Inc. The previous article introduced memcached as a distributed cache server. This article describes the internal structure of memcached and how it manages memory. It also explains the weaknesses that result from that internal structure.
Slab allocation mechanism: organizing memory for reuse
By default, recent versions of memcached use a mechanism called the slab allocator to allocate and manage memory. Before this mechanism existed, memory was allocated simply by calling malloc and free for every record. That approach leads to memory fragmentation and puts a heavier burden on the operating system's memory manager; in the worst case, the operating system itself becomes a heavier load than the memcached process. The slab allocator was created to solve this problem.
Let's look at how the slab allocator works. The memcached documentation states its goal as follows:
The primary goal of the slabs subsystem in memcached is to eliminate memory fragmentation issues totally by using fixed-size memory chunks coming from a few predetermined size classes.
In other words, the slab allocator divides the allocated memory into blocks of predetermined, fixed lengths in order to eliminate memory fragmentation completely.
The principle of slab allocation is quite simple. It divides the allocated memory into blocks (chunks) of various sizes,
and groups chunks of the same size into classes (Figure 1).
Figure 1 Structure of slab allocation
The slab allocator also aims to reuse allocated memory. In other words, memory that has been allocated is never freed; it is reused.
Main terms of Slab allocation
Page
The memory space allocated to a slab, 1 MB by default. Once assigned to a slab, the page is divided into chunks according to that slab's chunk size.
Chunk
The memory space used to cache records.
Slab Class
A group of chunks of a specific size.
How records are cached in a slab
The following explains how memcached selects a slab for the data sent by a client and caches that data in a chunk.
Based on the size of the data it receives, memcached selects the slab that best fits that size (Figure 2).
memcached keeps a list of the idle chunks in each slab; it picks a chunk from that list
and caches the data in it.
Figure 2 How to select a group to store records
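The selection step can be sketched in Python as follows. This is a simplified model under stated assumptions, not memcached's actual C implementation: the names `SlabClass`, `select_class`, and `store`, as well as the four chunk sizes, are hypothetical and exist only to illustrate picking the smallest chunk that fits and taking it from that class's list of idle chunks.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 1024 * 1024  # 1 MB page, the default mentioned above


@dataclass
class SlabClass:
    """Hypothetical model of one slab class: chunks of a single size."""
    chunk_size: int
    free_chunks: list = field(default_factory=list)  # indexes of idle chunks
    pages: int = 0

    def add_page(self):
        # Carve a new page into equally sized chunks and list them as idle.
        per_page = PAGE_SIZE // self.chunk_size
        start = self.pages * per_page
        self.free_chunks.extend(range(start, start + per_page))
        self.pages += 1


def select_class(classes, data_size):
    """Return the class with the smallest chunk that still fits the data."""
    for slab_class in sorted(classes, key=lambda c: c.chunk_size):
        if data_size <= slab_class.chunk_size:
            return slab_class
    raise ValueError("record too large for any slab class")


def store(classes, data_size):
    """Pick a class, then take an idle chunk from its free list."""
    slab_class = select_class(classes, data_size)
    if not slab_class.free_chunks:
        slab_class.add_page()
    return slab_class.chunk_size, slab_class.free_chunks.pop()


# Assumed chunk sizes, for illustration only.
classes = [SlabClass(size) for size in (128, 256, 512, 1024)]
chunk_size, chunk_index = store(classes, 100)
print(chunk_size)  # 128
```

A real server would also return chunks to the free list when records are deleted or evicted; that part is left out to keep the sketch short.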
The slab allocator has disadvantages as well as advantages. Let's look at its drawbacks.
Disadvantages of the slab allocator
The slab allocator solved the original memory fragmentation problem, but the new mechanism brought a new problem to memcached.
Because memory is allocated in chunks of fixed lengths, the allocated memory cannot be used to its full extent. For example, caching 100 bytes of data in a 128-byte chunk wastes the remaining 28 bytes (Figure 3).
Figure 3 Use of chunk space
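The arithmetic behind this example can be written down as a short Python sketch. The function name `wasted_bytes` and the list of chunk sizes are assumptions made for illustration; the sketch also ignores the per-item header that memcached stores alongside the data.

```python
import bisect


def wasted_bytes(data_size, chunk_sizes):
    """Bytes left unused when data goes into the smallest chunk that fits."""
    sizes = sorted(chunk_sizes)
    # Index of the first chunk size that is >= data_size.
    index = bisect.bisect_left(sizes, data_size)
    if index == len(sizes):
        raise ValueError("data does not fit in any chunk")
    return sizes[index] - data_size


# Assumed chunk sizes, for illustration only.
print(wasted_bytes(100, [128, 256, 512]))  # 28
```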
There is no perfect solution to this problem, but the documentation describes a fairly effective one:
The most efficient way to reduce the waste is to use a list of size classes that closely matches (if that's at all possible) common sizes of objects that the clients of this particular installation of memcached are likely to store.
That is, if you know in advance the common sizes of the data your clients send, or if you only cache data of one size, you can reduce waste by using a list of classes that fits those sizes.
Unfortunately, this kind of tuning is not possible yet; we can only hope for it in a future version. What we can adjust is the size difference between slab classes. The growth factor option, described next, does this.
Tuning with the growth factor
By specifying a growth factor when memcached starts (via the -f option),
you can control the size difference between slab classes to some extent. The default value is 1.25. Before this option existed, the factor was fixed at 2, a policy known as "powers of 2".
Let's start memcached in verbose mode with that earlier setting:
$ memcached -f 2 -vv
The following is the verbose output after startup:
slab class   1: chunk size    128 perslab  8192
slab class   2: chunk size    256 perslab  4096
slab class   3: chunk size    512 perslab  2048
slab class   4: chunk size   1024 perslab  1024
slab class   5: chunk size   2048 perslab   512
slab class   6: chunk size   4096 perslab   256
slab class   7: chunk size   8192 perslab   128
slab class   8: chunk size  16384 perslab    64
slab class   9: chunk size  32768 perslab    32
slab class  10: chunk size  65536 perslab    16
slab class  11: chunk size 131072 perslab     8
slab class  12: chunk size 262144 perslab     4
slab class  13: chunk size 524288 perslab     2
As you can see, starting from the 128-byte class, each class is twice the size of the previous one. The problem with this setting is that the gap between slab classes is large, which in some cases wastes a considerable amount of memory. To reduce this waste, the growth factor option was added two years ago.
Now look at the output with the current default setting (f=1.25). To save space, only the first 10 classes are shown:
slab class   1: chunk size     88 perslab 11915
slab class   2: chunk size    112 perslab  9362
slab class   3: chunk size    144 perslab  7281
slab class   4: chunk size    184 perslab  5698
slab class   5: chunk size    232 perslab  4519
slab class   6: chunk size    296 perslab  3542
slab class   7: chunk size    376 perslab  2788
slab class   8: chunk size    472 perslab  2221
slab class   9: chunk size    592 perslab  1771
slab class  10: chunk size    744 perslab  1409
As you can see, the gap between classes is much smaller than with a factor of 2, which makes this setting better suited to caching records of a few hundred bytes. You may notice that the sizes in this output are not exact multiples of 1.25; the small deviations are deliberate and keep the chunk sizes byte-aligned.
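Both listings can be reproduced with a short Python sketch. It rests on assumptions read off the output rather than taken from the memcached source: the first chunk size (88 or 128 bytes) is passed in as a parameter, every size is rounded up to a multiple of 8 bytes, and the next size is the previous one multiplied by the growth factor. The name `slab_classes` is hypothetical.

```python
PAGE_SIZE = 1024 * 1024  # 1 MB page
ALIGN = 8                # assumed alignment, inferred from the output


def slab_classes(first_size, factor, count):
    """Return (chunk_size, chunks_per_page) pairs for `count` classes."""
    classes = []
    size = first_size
    for _ in range(count):
        # Round the size up to the next multiple of ALIGN.
        if size % ALIGN:
            size += ALIGN - size % ALIGN
        classes.append((size, PAGE_SIZE // size))
        # Grow the aligned size by the factor for the next class.
        size = int(size * factor)
    return classes


for number, (chunk, perslab) in enumerate(slab_classes(88, 1.25, 10), start=1):
    print(f"slab class {number:3d}: chunk size {chunk:6d} perslab {perslab:5d}")
```

The rounding step is what produces sizes such as 112 instead of 110 (88 × 1.25) in the listing.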
When you introduce memcached into a product, or deploy it with the default values, it is best to work out the expected average size of your data and adjust the growth factor to find the most suitable setting. Memory is a precious resource, and it would be a pity to waste it.
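One rough way to compare settings before deploying is to estimate the waste for a sample of record sizes. The Python sketch below does this under the same assumptions as the listings above (first chunk size taken from the verbose output, 8-byte alignment); the function names and the three sample record sizes are hypothetical, and real measurements from your own workload should replace them.

```python
def chunk_sizes(first_size, factor, limit=1024 * 1024, align=8):
    """Chunk sizes grown by `factor`, rounded up to `align`, up to `limit`."""
    sizes = []
    size = first_size
    while True:
        aligned = -(-size // align) * align  # round up to a multiple of align
        if aligned > limit:
            break
        sizes.append(aligned)
        size = int(aligned * factor)
    return sizes


def total_waste(record_sizes, sizes):
    """Sum of unused bytes when each record uses the smallest fitting chunk."""
    return sum(min(s for s in sizes if s >= r) - r for r in record_sizes)


records = [100, 300, 600]  # hypothetical sample of record sizes in bytes
waste_factor_2 = total_waste(records, chunk_sizes(128, 2))
waste_factor_125 = total_waste(records, chunk_sizes(88, 1.25))
print(waste_factor_2, waste_factor_125)  # 664 232
```

For this sample, the factor of 1.25 wastes roughly a third of what the factor of 2 does; a different size distribution can give a different picture, which is why measuring first is worthwhile.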
Next, let's see how to use memcached's stats command to check various information, such as slab utilization.
Viewing the internal state of memcached
memcached has a command called stats that returns a wide variety of information.
There are many ways to run it; telnet is the simplest:
$ telnet hostname port
After connecting to memcached, type stats and press Enter to get various information, including resource utilization. You can also type "stats slabs" or "stats items" to get information about the cached records. To end the session, type quit.
For details of these commands, see the protocol.txt document included in the memcached package.
$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
stats
STAT pid 481
STAT uptime 16574
STAT time 1213687612
STAT version 1.2.5
STAT pointer_size 32
STAT rusage_user 0.102297
STAT rusage_system 0.214317
STAT curr_items 0
STAT total_items 0
STAT bytes 0
STAT curr_connections 6
STAT total_connections 8
STAT connection_structures 7
STAT cmd_get 0
STAT cmd_set 0
STAT get_hits 0
STAT get_misses 0
STAT evictions 0
STAT bytes_read 20
STAT bytes_written 465
STAT limit_maxbytes 67108864
STAT threads 4
END
quit
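The same exchange can be scripted. The following Python sketch sends the stats command over a plain TCP socket and turns the "STAT name value" lines into a dictionary. The function names `parse_stats` and `fetch_stats` are hypothetical, and the host and port defaults simply mirror the telnet session above; it assumes a memcached server speaking the text protocol is reachable there.

```python
import socket


def parse_stats(text):
    """Turn 'STAT name value' lines into a {name: value} dictionary."""
    stats = {}
    for line in text.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host="localhost", port=11211, timeout=3.0):
    """Send 'stats' to a memcached server and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"stats\r\n")
        reply = b""
        # The reply is terminated by a line containing only END.
        while not reply.endswith(b"END\r\n"):
            data = conn.recv(4096)
            if not data:
                break
            reply += data
    return parse_stats(reply.decode("ascii", "replace"))


# Parsing a fragment of the session shown above (no server needed).
sample = "STAT pid 481\r\nSTAT version 1.2.5\r\nSTAT threads 4\r\nEND\r\n"
print(parse_stats(sample)["version"])  # 1.2.5
```

With a server running, `fetch_stats()["limit_maxbytes"]` would return the configured memory limit as a string.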
In addition, if you install libmemcached, a client library for C/C++, the memstat command is installed along with it. It is simple to use: you get the same information as with telnet in fewer steps, and you can query multiple servers at once.
$ memstat --servers=server1,server2,server3,...
libmemcached can be obtained from the following address:
- http://tangent.org/552/libmemcached.html
Viewing the usage status of slabs
memcached-tool, a Perl script written by Brad, the author of memcached, makes it easy to check slab usage (it arranges the values returned by memcached into an easy-to-read format). You can get the script from the following address:
- http://code.sixapart.com/svn/memcached/trunk/server/scripts/memcached-tool
It is also very simple to use:
$ memcached-tool hostname:port options
You do not need to specify any options to view slab usage, so the following command is enough:
$ memcached-tool hostname:port
The output looks like this:
 #  Item_size   Max_age  1MB_pages     Count  Full?
 1     104 B  1394292 s       1215  12249628   yes
 2     136 B  1456795 s         52    400919   yes
 3     176 B  1339587 s         33    196567   yes
 4     224 B  1360926 s        109    510221   yes
 5     280 B  1570071 s         49    183452   yes
 6     352 B  1592051 s         77    229197   yes
 7         B  1517732 s         66    157183   yes
 8     552 B  1460821 s         62    117697   yes
 9     696 B  1521917 s        143    215308   yes
10     872 B  1695035 s        205    246162   yes
11    1.1 kB  1681650 s        233    221968   yes
12    1.3 kB  1603363 s        241    183621   yes
13    1.7 kB  1634218 s         94     57197   yes
14    2.1 kB  1695038 s                36488   yes
15    2.6 kB  1747075 s                25203   yes
16    3.3 kB  1760661 s                24167   yes
The meanings of each column are:
Column | Meaning
# | Slab class number
Item_size | Chunk size
Max_age | Age of the oldest record in the LRU
1MB_pages | Number of pages assigned to the slab
Count | Number of records in the slab
Full? | Whether the slab is full, that is, whether it has no idle chunks left
The information obtained from this script is very convenient for tuning and is highly recommended.
Summary of memory storage
This article gave a brief explanation of memcached's caching mechanism and how to tune it. I hope it has helped readers understand how memcached manages memory, along with the advantages and disadvantages of that approach.
Next time, we will explain how LRU and expiration work, and introduce the latest direction in memcached development: extensibility through a pluggable architecture.