/**
* Author: Zhang Ronghua
* Date: 2008-08-08
**/
Memcached is well known as a remote distributed cache (if you have not heard of it, search javaeye, Google, or Baidu — though given how strongly Baidu's rankings smell of advertising money, as recent events have shown, I would recommend javaeye). It is very simple to use and is deployed on a great many sites; very few large sites do without memcached.
I had read plenty of articles analyzing memcached's internals, picked up a little each time, and promptly forgot it all without forming any deep understanding. Recently, though, I ran into a problem that forced me to really get to know memcached. Let me describe it.
The problem: I have tens of millions of records that are accessed frequently, so they must sit in memcached to guarantee access speed — yet my memcached data kept disappearing, and the business requires that it not be lost. When my data vanished, the memcached server's memory was only 60% used; in other words, 40% of the memory was being badly wasted. Not every application behaved this way — others wasted far less. Why was LRU eviction kicking in when only 60% of memory was in use? (I was sure it was LRU, because my data always went missing from the front, and during that time those records had not been accessed at all. For example, on the very first read, having loaded 10 million records, the first 3 million or so were already gone — and judging from the logs, those 3 million had definitely been put in.)
With these questions I began to re-examine memcached, starting with its memory model. As we know from C/C++, memory can be allocated up front or dynamically. Pre-allocation makes a program faster but uses memory less efficiently; dynamic allocation uses memory effectively but drags down runtime performance. Memcached's memory allocation follows exactly this trade-off: to get more speed, it sometimes has to trade space for time.
That is to say, memcached allocates memory in advance. Its allocator is built around three concepts:
1 slab
2 page
3 chunk
To explain: generally speaking, a memcached process is divided into a number of slabs, each slab contains a number of pages, and each page contains a number of chunks. If you think of these three as objects, that is two one-to-many relationships. The number of slabs is limited — a handful, a dozen, or a few dozen, depending on how much memory the process is configured with. Each page in a slab defaults to 1 MB, so if a slab occupies 100 MB of memory, by default it holds 100 pages. The chunk is where our data is ultimately stored.
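The two one-to-many relationships above can be sketched as a toy model (the class and attribute names are mine for illustration; the real allocator is written in C and considerably more involved):

```python
# Toy model of memcached's slab -> page -> chunk hierarchy (illustrative only).

PAGE_SIZE = 1024 * 1024  # each page defaults to 1 MB


class Slab:
    def __init__(self, chunk_size, num_pages):
        self.chunk_size = chunk_size  # size of each storage block in bytes
        self.num_pages = num_pages    # how many 1 MB pages this slab holds

    @property
    def chunks_per_page(self):
        # how many chunks fit in one page (leftover bytes are unusable)
        return PAGE_SIZE // self.chunk_size

    @property
    def total_chunks(self):
        return self.num_pages * self.chunks_per_page


# A slab with 80-byte chunks and a single 1 MB page:
slab1 = Slab(chunk_size=80, num_pages=1)
print(slab1.chunks_per_page)  # 13107, matching slab 1 in the stats output below
```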
An example: I start a memcached process with 100 MB of memory, then open telnet (telnet localhost 11211). After connecting to memcached I type "stats slabs" and press enter, and the following data appears:
STAT 1:chunk_size 80
STAT 1:chunks_per_page 13107
STAT 1:total_pages 1
STAT 1:total_chunks 13107
STAT 1:used_chunks 13107
STAT 1:free_chunks 0
STAT 1:free_chunks_end 13107
STAT 2:chunk_size 100
STAT 2:chunks_per_page 10485
STAT 2:total_pages 1
STAT 2:total_chunks 10485
STAT 2:used_chunks 10485
STAT 2:free_chunks 0
STAT 2:free_chunks_end 10485
STAT 3:chunk_size 128
STAT 3:chunks_per_page 8192
STAT 3:total_pages 1
STAT 3:total_chunks 8192
STAT 3:used_chunks 8192
STAT 3:free_chunks 0
STAT 3:free_chunks_end 8192
Those are the details of the first three slabs.
chunk_size is the size of the data storage block; chunks_per_page is the number of chunks in one memory page; total_pages is the number of pages under the slab; total_chunks is the total number of chunks under the slab (= total_pages * chunks_per_page); used_chunks is the number of chunks already in use; free_chunks is the number of chunks still available.
From the sample above, slab 1 has 1 MB of memory in total and it is all used up; slab 2 also has 1 MB and is likewise exhausted; slab 3 is in the same state. Looking at the chunk sizes of these three slabs — 80 B for the first, 100 B for the second, 128 B for the third — each is roughly 1.25 times the previous one. This growth rate is under our control: we can change it with the -f startup parameter. For example, -f 1.1 means a growth factor of 1.1, so if the chunks in the first slab are 80 B, the chunks in the second slab will be about 80 * 1.1 = 88 B.
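The chunk-size progression can be sketched like this (a simplification of my own: real memcached also rounds each size up for alignment, which is presumably why slab 3 above shows 128 rather than exactly 80 * 1.25 * 1.25 = 125):

```python
def chunk_sizes(base=80, factor=1.25, n=10):
    """Sketch of slab-class chunk sizes: each class's chunk is roughly
    `factor` times the previous one (real memcached additionally rounds
    sizes up, so the actual numbers can be slightly larger)."""
    sizes = [base]
    for _ in range(n - 1):
        sizes.append(int(sizes[-1] * factor))
    return sizes

print(chunk_sizes(80, 1.25, 3))  # [80, 100, 125]
print(chunk_sizes(80, 1.1, 2))   # [80, 88] -- the -f 1.1 example above
```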
Having explained this much, the cause of my problem should be apparent. If it is not, here is the key: when memcached stores a new value, the destination is determined by the value's size — a value is always stored in the slab whose chunk size is closest to it. In the example above, if my values are 80 B, all of them will always be stored in slab 1, and slab 1's free_chunks is 0. If memcached was not started with -M (without -M, LRU eviction runs whenever memory is insufficient), memcached clears the least recently used data in that slab's chunks and puts the newest data there. This explains why 40% of my memory was still free when LRU ran: the chunk_size of my other slabs was much larger than my values, so my values would never be placed in those slabs — only in the slab whose chunks were closest to my value size (and those slabs were already full; how depressing). The result was that my data kept being overwritten, the later records clobbering the earlier ones.
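The placement rule that bit me can be sketched as follows (a simplified picture of my own — in real memcached the key and item header also count toward the stored size):

```python
def pick_slab(value_size, chunk_sizes):
    """Return the 1-based slab class a value of this size lands in:
    the first (smallest) chunk that can hold it, or None if none fits."""
    for i, size in enumerate(chunk_sizes):
        if value_size <= size:
            return i + 1
    return None

classes = [80, 100, 128]       # the three slab classes from the stats above
print(pick_slab(80, classes))  # 1 -- an 80-byte value always lands in slab 1
print(pick_slab(81, classes))  # 2 -- one byte more and it moves to slab 2
```

So when slab 1 is full, LRU runs inside slab 1 alone — the free chunks in slabs 2 and 3 are of no help to an 80-byte value.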
The problem was found, but a real solution was not. Since my data demands a 100% hit ratio, all I can do is adjust the slab growth factor and the page size to push the hit ratio toward 100%, without being able to guarantee it reaches 100%. (Is that sentence awkward to read? I should review my own language skills.) If you object that this is no solution because your memcached server cannot be stopped, there is another way: memcached-tool. Use its move command, for example "move 3 1", which moves one memory page from slab 3 to slab 1. What is that good for? Suppose slab 20 has very low utilization but a lot of pages — say 200, which is 200 MB — while slab 2 keeps triggering LRU, clearly short of pages. I can run "move 20 2" to hand one of slab 20's memory pages to slab 2, and so use memory more effectively. (Someone will say: only one page per command, what a hassle. Ahuaxuan says: write a script and loop it.)
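Since each move command transfers only a single page, the loop I suggest could be sketched like this (a hypothetical helper that just generates the commands, following the "move <src> <dst>" syntax above — feed its output to memcached-tool however you prefer):

```python
def move_commands(src_slab, dst_slab, pages):
    """Build the list of `move` commands needed to shift `pages`
    memory pages from slab `src_slab` to slab `dst_slab`
    (one command per page, since each `move` transfers a single page)."""
    return ["move %d %d" % (src_slab, dst_slab) for _ in range(pages)]

# e.g. hand 10 pages (10 MB at the default page size) from slab 20 to slab 2:
for cmd in move_commands(20, 2, 10):
    print(cmd)
```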
Someone else says: but my memcache data must not be lost at all! Fine — try Sina's memcachedb. I have not used it myself, but I suggest giving it a try: it combines the memcache protocol with BerkeleyDB. (Writing this, I have to admire Danga — I think its biggest contribution is not the memcache server itself but the memcache protocol. Memcachedb is said to be used in many of Sina's applications, including Sina's blog.)
One more note: the "stats slabs" command lets you inspect the slabs in memcached, while the "stats" command shows the general health of your memcached instance, such as the hit ratio. Here is an example:
STAT pid 2232
STAT uptime 1348
STAT time 1218120955
STAT version 1.2.1
STAT pointer_size 32
STAT curr_items 0
STAT total_items 0
STAT bytes 0
STAT curr_connections 1
STAT total_connections 3
STAT connection_structures 2
STAT cmd_get 0
STAT cmd_set 0
STAT get_hits 0
STAT get_misses 0
STAT bytes_read 26
STAT bytes_written 16655
STAT limit_maxbytes 104857600
From the data above you can see this memcached process has an excellent hit ratio — get_misses is as low as 0. How so? Because I had only just started the process and only connected with telnet: curr_connections is 1, total_items is 0 because I had put no data in, and get_hits is 0 because I had never called get. So of course the misses are 0 — in other words, a "100%" hit ratio. Pure wishful thinking.
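For a process that has actually served traffic, the hit ratio falls out of the get_hits and get_misses counters (a helper of my own, not part of memcached; I define the fresh-process case as 1.0 to match the joke above):

```python
def hit_rate(get_hits, get_misses):
    """Cache hit ratio from the two stats counters; defined as 1.0
    for a fresh process that has not served a single get yet."""
    total = get_hits + get_misses
    return 1.0 if total == 0 else get_hits / float(total)

print(hit_rate(0, 0))    # 1.0 -- the tongue-in-cheek "100%" above
print(hit_rate(90, 10))  # 0.9 -- a more realistic 90% hit ratio
```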
Time to conclude. From this article we can draw the following conclusions:
Conclusion one: memcached's LRU is not global but per-slab — you could call it regional.
Conclusion two: to raise memcached's hit ratio, estimate the size of your values and tune the memory page size and growth factor accordingly.
Conclusion three: seeking answers with a concrete question in mind teaches you far more than casual reading.