Memcached comprehensive analysis, part 3: memcached's deletion mechanism and development direction
Posting Date: 2008/7/16
Author: Toru Maesaka
Link: http://gihyo.jp/dev/feature/01/memcached/0003
Links to the articles in this series:
- Part 1: http://www.phpchina.com/html/29/n-35329.html
- Part 2: http://www.phpchina.com/html/30/n-35330.html
- Part 3: http://www.phpchina.com/html/31/n-35331.html
- Part 4: http://www.phpchina.com/html/32/n-35332.html
- Part 5: http://www.phpchina.com/html/32/n-35333.html
- Memcached effectively uses resources for data deletion
- Data will not actually disappear from memcached
- Lazy Expiration
- LRU: how data is effectively deleted from the cache
- The latest development direction of memcached
- About the binary protocol
- Binary protocol format
- What is striking in the HEADER
- External engine support
- Necessity of external engine support
- Key to successful API Design
- Review the current system
- Summary
Memcached is a cache, so data is not stored permanently on the server. Accepting this is a prerequisite for introducing memcached into a system. This article describes memcached's data deletion mechanism and its latest development directions: the binary protocol and external engine support.
Memcached's effective use of resources in data deletion
Data will not actually disappear from memcached
As mentioned last time, memcached does not release memory it has allocated. Once a record times out, clients can no longer see it (it becomes invisible), and its storage space can be reused.
Lazy Expiration
Memcached does not actively monitor records for expiry. Instead, it inspects a record's timestamp at get time to determine whether the record has expired. This technique is called lazy expiration. As a result, memcached spends no CPU time on expiry monitoring.
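The idea can be sketched in a few lines of Python. This is only an illustration of the technique, not memcached's actual C implementation: the class name LazyCache, its methods, and the injectable clock are all hypothetical, and the sketch deletes an expired entry outright where memcached would instead keep the memory and reuse it.

```python
import time


class LazyCache:
    """Toy cache that checks expiry only when a key is read."""

    def __init__(self, clock=time.time):
        self._items = {}     # key -> (value, expires_at or None)
        self._clock = clock  # injectable clock (hypothetical, eases testing)

    def set(self, key, value, ttl=None):
        # Record the absolute expiry time; no timer or background
        # thread is started for the record.
        expires_at = None if ttl is None else self._clock() + ttl
        self._items[key] = (value, expires_at)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            # Expired: report a miss and make the slot reusable.
            del self._items[key]
            return None
        return value
```

Nothing happens at the moment a record expires. The cost of the check is paid only by the get that next touches the record.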
LRU: how data is effectively deleted from the cache
Memcached reuses the space of timed-out records first, but even so, there may not be enough space to add a new record. In that case, space is allocated using the Least Recently Used (LRU) mechanism. As the name suggests, this mechanism deletes the records that were "least recently used". When memcached runs short of memory (that is, when no new space can be obtained from the slab class), it searches among the records that have not been used recently and gives their space to the new record. From a caching perspective, this model is ideal.
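A minimal sketch of LRU eviction, assuming a fixed capacity counted in records rather than bytes. The class LRUCache is hypothetical, and the sketch ignores slab classes and expiry, so it shows only the eviction order and not memcached's memory management.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache: the least recently used record is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest use first, newest use last

    def get(self, key):
        if key not in self._items:
            return None
        # A read counts as a use, so move the record to the "recent" end.
        self._items.move_to_end(key)
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        elif len(self._items) >= self.capacity:
            # No room left: drop the least recently used record.
            self._items.popitem(last=False)
        self._items[key] = value
```

With a capacity of 2, storing a and b, reading a, and then storing c evicts b, because b is the record that has gone unused the longest.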
However, in some cases, the LRU mechanism may cause problems. When memcached is started, LRU can be disabled through the "-M" parameter, as shown below:
$ memcached -M -m 1024
Note that the lowercase "-m" option specifies the maximum memory size in megabytes. If it is not given, the default is 64 MB.
With the "-M" parameter specified, memcached returns an error once its memory is exhausted. That said, memcached is a cache, not a storage system, so using LRU is recommended.
The latest development direction of memcached
Memcached's roadmap has two major goals. One is the design and implementation of the binary protocol; the other is the ability to load external engines.
About the binary protocol
The binary protocol is being introduced because it removes the need to parse the text protocol. This takes the performance of the already fast memcached to the next level and also reduces the vulnerabilities of the text protocol. Most of it has already been implemented, and the development code repository includes the feature. The memcached download page contains a link to the repository.
- http://danga.com/memcached/download.bml
Binary protocol format
A protocol packet is a 24-byte header followed by the key and unstructured data. The actual format is as follows (quoted from the protocol document):
 Byte/     0       |       1       |       2       |       3       |
    /              |               |               |               |
   |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
   +---------------+---------------+---------------+---------------+
  0/ HEADER                                                        /
   /                                                               /
   /                                                               /
   /                                                               /
   +---------------+---------------+---------------+---------------+
 24/ COMMAND-SPECIFIC EXTRAS (as needed)                           /
  +/  (note length in the extras length header field)              /
   +---------------+---------------+---------------+---------------+
  m/ Key (as needed)                                               /
  +/  (note length in key length header field)                     /
   +---------------+---------------+---------------+---------------+
  n/ Value (as needed)                                             /
  +/  (note length is total body length header field, minus        /
  +/   sum of the extras and key length body fields)               /
   +---------------+---------------+---------------+---------------+
  Total 24 bytes
As shown above, the packet format is very simple. Note that the HEADER, which occupies 24 bytes, comes in two forms: the request header and the response header. The header contains the magic byte that indicates the packet's validity, the command type, the key length, the value length, and other information. The formats are as follows:
Request Header

 Byte/     0       |       1       |       2       |       3       |
    /              |               |               |               |
   |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
   +---------------+---------------+---------------+---------------+
  0| Magic         | Opcode        | Key length                    |
   +---------------+---------------+---------------+---------------+
  4| Extras length | Data type     | Reserved                      |
   +---------------+---------------+---------------+---------------+
  8| Total body length                                             |
   +---------------+---------------+---------------+---------------+
 12| Opaque                                                        |
   +---------------+---------------+---------------+---------------+
 16| CAS                                                           |
   |                                                               |
   +---------------+---------------+---------------+---------------+

Response Header

 Byte/     0       |       1       |       2       |       3       |
    /              |               |               |               |
   |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
   +---------------+---------------+---------------+---------------+
  0| Magic         | Opcode        | Key Length                    |
   +---------------+---------------+---------------+---------------+
  4| Extras length | Data type     | Status                        |
   +---------------+---------------+---------------+---------------+
  8| Total body length                                             |
   +---------------+---------------+---------------+---------------+
 12| Opaque                                                        |
   +---------------+---------------+---------------+---------------+
 16| CAS                                                           |
   |                                                               |
   +---------------+---------------+---------------+---------------+
To learn more about each field, check out the code tree for the memcached binary protocol and refer to protocol_binary.txt in its docs folder.
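As an illustration of the request header layout, the sketch below packs a header with Python's struct module. The field order and widths follow the request header diagram. The magic value 0x80 and the GET opcode 0x00 are not given in this article; they are assumptions taken from the protocol document. The function name build_get_request is hypothetical.

```python
import struct

# Big-endian (network order) layout of the 24-byte header:
# magic(1) opcode(1) key length(2) extras length(1) data type(1)
# reserved(2) total body length(4) opaque(4) CAS(8)
HEADER_FORMAT = ">BBHBBHIIQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

REQUEST_MAGIC = 0x80  # assumption: value from the protocol document
OPCODE_GET = 0x00     # assumption: value from the protocol document


def build_get_request(key: bytes, opaque: int = 0) -> bytes:
    """Build a GET request: header followed by the key, no extras, no value."""
    header = struct.pack(
        HEADER_FORMAT,
        REQUEST_MAGIC,
        OPCODE_GET,
        len(key),   # key length
        0,          # extras length
        0,          # data type
        0,          # reserved
        len(key),   # total body length = extras + key + value
        opaque,     # echoed back by the server in the response
        0,          # CAS
    )
    return header + key
```

For a 5-byte key such as b"hello", the packet is 29 bytes long: the 24-byte header followed by the key itself.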
What is striking in the HEADER
My first thought on seeing the HEADER format was that the upper limit on the key is huge! In the current memcached specification the maximum key length is 250 bytes, but the key length field in the binary protocol is 2 bytes wide. In theory, therefore, keys of up to 65,535 bytes (2^16 - 1) can be used. Keys longer than 250 bytes are rarely needed, but once the binary protocol is released, huge keys will be possible.
The binary protocol will be supported starting with the upcoming 1.3 series.
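The arithmetic can be checked directly. A 2-byte unsigned field holds values from 0 to 2^16 - 1 = 65,535, so that is the longest key length the header can express. The helper name encode_key_length below is hypothetical.

```python
import struct

# Largest value a 2-byte unsigned field can carry.
MAX_KEY_LENGTH = 2**16 - 1


def encode_key_length(length: int) -> bytes:
    """Encode a key length as the 2-byte big-endian header field."""
    # struct raises struct.error if the value does not fit in 16 bits.
    return struct.pack(">H", length)
```

The current 250-byte limit encodes as 0x00FA, which leaves most of the field's range unused.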
External engine support
Last year I experimented with making memcached's storage layer pluggable.
- http://alpha.mixi.co.jp/blog?p=129
After seeing this modification, Brian Aker of MySQL sent the code to the memcached mailing list. The memcached developers were also very interested, so it was placed on the roadmap. I am now developing it jointly with Trond Norbye, one of the memcached developers (specification design, implementation, and testing). The time difference in collaborating with overseas developers is a big problem, but because we share the same vision, we were finally able to publish a prototype of the pluggable architecture. The code repository can be accessed from the memcached download page.
Necessity of external engine support
Many memcached derivatives exist, because people want to store data persistently or add data redundancy even at some cost in performance. Before I became involved in memcached development, I too had considered inventing a new memcached in mixi's R&D department.
The external engine loading mechanism wraps memcached's networking, event handling, and other complex processing. This removes the difficulty of getting memcached and a storage engine to work together through brute-force hacks or a redesign from scratch, and makes it easy to try out various engines.
Key to successful API Design
In this project we paid the most attention to API design. Too many functions would burden engine developers, and an overly complicated API would raise the barrier to implementing an engine. The first version of the interface therefore has only 13 functions. The details are omitted here for lack of space; the operations an engine must provide are:
- Engine information (version, etc.)
- Engine initialization
- Engine shutdown
- Engine statistics
- Check whether a given record can be stored, in terms of capacity
- Allocate memory for the item (record) structure
- Release item (record) memory
- Delete record
- Save record
- Retrieve record
- Update a record's timestamp
- Arithmetic operations
- Data flush
Readers interested in the detailed specification can check out the code of the engine project and read engine.h.
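To give a feel for the size of such an interface, here is a toy engine with one method per operation in the list above. It is a sketch in Python rather than the C API defined in engine.h; every name in it (DictEngine, item_allocate, and so on) is hypothetical, and the capacity check counts records instead of bytes.

```python
import time


class DictEngine:
    """Hypothetical in-memory engine covering the 13 listed operations."""

    def __init__(self, max_items=1024):
        self.max_items = max_items
        self.records = None

    def get_info(self):                      # 1. engine information
        return {"name": "dict-engine", "version": "0.1"}

    def initialize(self):                    # 2. engine initialization
        self.records = {}

    def shutdown(self):                      # 3. engine shutdown
        self.records = None

    def get_stats(self):                     # 4. engine statistics
        return {"curr_items": len(self.records)}

    def can_store(self, key):                # 5. capacity check
        return key in self.records or len(self.records) < self.max_items

    def item_allocate(self, key, value):     # 6. allocate an item
        return {"key": key, "value": value, "time": time.time()}

    def item_release(self, item):            # 7. release an item
        item.clear()

    def remove(self, key):                   # 8. delete a record
        return self.records.pop(key, None) is not None

    def store(self, item):                   # 9. save a record
        if not self.can_store(item["key"]):
            return False
        self.records[item["key"]] = item
        return True

    def get(self, key):                      # 10. fetch a stored record
        return self.records.get(key)

    def touch(self, key):                    # 11. update the timestamp
        item = self.records.get(key)
        if item is None:
            return False
        item["time"] = time.time()
        return True

    def arithmetic(self, key, delta):        # 12. arithmetic on a value
        item = self.records.get(key)
        if item is None:
            return None
        item["value"] = int(item["value"]) + delta
        return item["value"]

    def flush(self):                         # 13. data flush
        self.records.clear()
```

A server core that talks to storage only through methods like these never needs to know whether the records live in a dictionary, on disk, or on another machine.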
Review the current system
It is difficult for memcached to support external storage because the code for networking and event handling (the core server) is closely tied to the in-memory storage code. This is known as tight coupling. To support external engines flexibly, the in-memory storage code must be made independent of the core server. We therefore restructured memcached around the API we designed, separating the core server from the storage code.
After the restructuring, we compared performance against 1.2.5 and against the version with binary protocol support, and confirmed that there was no performance impact.
When we considered how to support loading external engines, the easiest approach would have been for memcached itself to handle concurrency control. For an engine, however, concurrency control is the very essence of performance, so we adopted a design that leaves multithreading support entirely to the engine.
Future improvements should broaden the range of uses for memcached.
Summary
This article described how timeouts work in memcached and how data is deleted internally, and then introduced memcached's latest development directions: the binary protocol and external engine support. These features will not be available until version 1.3, so please wait a little longer!
This is my last article in this series. Thank you for reading!
Next time, Nagano will cover operational know-how for memcached and its compatibility with applications.