Contents
How memcached deletes data while using resources efficiently
· Data does not actually disappear from memcached
· Lazy Expiration
· LRU: how data is effectively deleted from the cache
The latest development direction of memcached
· About the binary protocol
· Format of the binary protocol
· Something notable in the header
External engine support
· The need for external engine support
· Simple API design is the key to success
· Re-examining the current architecture
· Summary
memcached is a cache, so data is not stored permanently on the server; accepting this is a precondition for introducing memcached into a system. This article describes memcached's data deletion mechanism and its latest development directions: the binary protocol and external engine support.
How memcached deletes data while using resources efficiently
Data does not actually disappear from memcached
As described in the previous article, memcached does not release memory once it has been allocated. After a record times out, clients can no longer see it (it becomes invisible), and its storage space becomes available for reuse.
Lazy Expiration
memcached does not internally monitor whether records have expired. Instead, it looks at a record's timestamp when the record is requested with get and checks at that point whether it has expired. This technique is called lazy expiration. As a result, memcached spends no CPU time on expiration monitoring.
LRU: how data is effectively deleted from the cache
memcached preferentially reuses the space of records that have timed out, but even so, there may not be enough space when a new record is added. In that case, space is allocated using the Least Recently Used (LRU) mechanism. As the name implies, this mechanism deletes the least recently used records. Thus, when memcached runs short of memory (when it cannot obtain new space from the slab class), it searches for records that have not been used recently and gives their space to the new record. From a practical caching point of view, this model is ideal.
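The idea can be sketched in a few lines of Python. This is only an illustration of expiry checked at read time, not memcached's actual code; the class name, the injectable clock, and the stored_count helper are assumptions made for the example.

```python
import time


class LazyExpiringCache:
    """Toy cache that checks expiry only when a key is read."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable clock (assumption, eases testing)
        self._items = {}      # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._items[key] = (value, expires_at)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        # The expiry check happens here, at read time, not in a background task.
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]  # the slot is now free for reuse
            return None
        return value

    def stored_count(self):
        """Entries physically held, including expired ones not yet read."""
        return len(self._items)
```

An expired record stays in the table until something asks for it, which is why no CPU time goes into scanning for stale entries.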
In some cases, however, the LRU mechanism can cause trouble. LRU can be disabled with the "-M" option when starting memcached, as shown below:
$ memcached -M -m 1024
Note that the lowercase "-m" option specifies the maximum memory size in megabytes. If no value is given, the default of 64 MB is used.
When started with "-M", memcached returns an error once memory is exhausted. That said, memcached is a cache, not a storage device, so using LRU is recommended.
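The two behaviors described in this section, LRU eviction and an error when eviction is disabled, can be sketched as follows. This is a simplified model under stated assumptions: it counts items rather than accounting for memory per slab class as memcached does, and the LRUCache class, its evict flag, and the OutOfMemory exception are names invented for this example.

```python
from collections import OrderedDict


class OutOfMemory(Exception):
    """Raised when the cache is full and eviction is disabled."""


class LRUCache:
    def __init__(self, capacity, evict=True):
        self.capacity = capacity
        self.evict = evict            # False mimics the spirit of "-M"
        self._items = OrderedDict()   # least recently used first

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        elif len(self._items) >= self.capacity:
            if not self.evict:
                raise OutOfMemory(key)
            self._items.popitem(last=False)  # drop the least recently used
        self._items[key] = value
```

With eviction enabled, a full cache silently gives up its coldest record; with it disabled, the caller has to handle the error, which is rarely what a cache user wants.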
The latest development direction of memcached
There are two major goals on memcached's roadmap. One is the design and implementation of the binary protocol; the other is the ability to load external engines.
About the binary protocol
The binary protocol was adopted because it removes the need to parse the text protocol, which makes the already fast memcached even faster and reduces the vulnerabilities of the text protocol. Most of the implementation is already complete, and the feature is already included in the development code repository. A link to the repository is available on the memcached download page.
http://danga.com/memcached/download.bml
Format of the binary protocol
A protocol packet consists of a 24-byte frame followed by a key and unstructured data. The actual format is as follows (quoted from the protocol document):
   Byte/     0       |       1       |       2       |       3       |
      /              |               |               |               |
     |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
     +---------------+---------------+---------------+---------------+
    0/ HEADER                                                        /
     /                                                               /
     /                                                               /
     /                                                               /
     +---------------+---------------+---------------+---------------+
   24/ COMMAND-SPECIFIC EXTRAS (as needed)                           /
    +/  (note length in the extras length header field)              /
     +---------------+---------------+---------------+---------------+
    m/ Key (as needed)                                               /
    +/  (note length in key length header field)                     /
     +---------------+---------------+---------------+---------------+
    n/ Value (as needed)                                             /
    +/  (note length is total body length header field, minus        /
    +/   sum of the extras and key length body fields)               /
     +---------------+---------------+---------------+---------------+
     Total 24 bytes
As shown above, the packet format is simple. Note the header, which occupies 24 bytes and comes in two kinds: the request header and the response header. The header contains the magic byte that indicates the packet's validity, the command type, the key length, the value length, and so on, in the following format:
Request header

   Byte/     0       |       1       |       2       |       3       |
      /              |               |               |               |
     |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
     +---------------+---------------+---------------+---------------+
    0| Magic         | Opcode        | Key Length                    |
     +---------------+---------------+---------------+---------------+
    4| Extras Length | Data Type     | Reserved                      |
     +---------------+---------------+---------------+---------------+
    8| Total Body Length                                             |
     +---------------+---------------+---------------+---------------+
   12| Opaque                                                        |
     +---------------+---------------+---------------+---------------+
   16| CAS                                                           |
     |                                                               |
     +---------------+---------------+---------------+---------------+

Response header

   Byte/     0       |       1       |       2       |       3       |
      /              |               |               |               |
     |0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|
     +---------------+---------------+---------------+---------------+
    0| Magic         | Opcode        | Key Length                    |
     +---------------+---------------+---------------+---------------+
    4| Extras Length | Data Type     | Status                        |
     +---------------+---------------+---------------+---------------+
    8| Total Body Length                                             |
     +---------------+---------------+---------------+---------------+
   12| Opaque                                                        |
     +---------------+---------------+---------------+---------------+
   16| CAS                                                           |
     |                                                               |
     +---------------+---------------+---------------+---------------+
If you want to know the details of each field, check out the code tree of the memcached binary protocol and refer to the protocol_binary.txt document in the docs folder.
Something notable in the header
After seeing the header format, my impression was that the upper limit on the key is very large! In the current memcached specification, a key can be at most 250 bytes long, but the binary protocol expresses the key length in 2 bytes. In theory, therefore, keys of up to 65,535 bytes (2^16 - 1) can be used. Keys longer than 250 bytes are not very common, but large keys will become usable once the binary protocol is released.
The binary protocol will be supported starting with the next release series, 1.3.
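To make the header layout concrete, here is a minimal sketch that packs and unpacks the fields in the order shown in the diagrams, using Python's struct module. The magic value 0x80 for requests and opcode 0x00 for get are taken from memory of the protocol document and should be checked against it; the function names are invented for this example.

```python
import struct

# Field order follows the header diagrams: magic, opcode, key length,
# extras length, data type, reserved/status, total body length, opaque, CAS.
HEADER = struct.Struct("!BBHBBHIIQ")  # network byte order, no padding

REQUEST_MAGIC = 0x80  # assumption: verify against the protocol document
OPCODE_GET = 0x00     # assumption: verify against the protocol document


def build_request(opcode, key, extras=b"", value=b"", opaque=0, cas=0):
    """Return header + extras + key + value as one packet."""
    body_len = len(extras) + len(key) + len(value)
    header = HEADER.pack(REQUEST_MAGIC, opcode, len(key), len(extras),
                         0, 0, body_len, opaque, cas)
    return header + extras + key + value


def parse_header(packet):
    """Decode the fixed-size header at the start of a packet."""
    names = ("magic", "opcode", "key_length", "extras_length", "data_type",
             "reserved_or_status", "total_body_length", "opaque", "cas")
    return dict(zip(names, HEADER.unpack_from(packet)))
```

The value length is not stored directly; a reader recovers it as total body length minus the extras and key lengths, as the packet diagram notes.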
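The limit implied by a 2-byte key length field can be checked with a short calculation. This sketch assumes the field is an unsigned 16-bit integer in network byte order; the helper name is invented for the example.

```python
import struct

KEY_LENGTH_FIELD = struct.Struct("!H")  # 2-byte unsigned integer

# Largest value a 2-byte unsigned field can hold.
max_key_length = 2 ** (8 * KEY_LENGTH_FIELD.size) - 1


def fits_in_key_length_field(length):
    """Return True if the length can be encoded in the 2-byte field."""
    try:
        KEY_LENGTH_FIELD.pack(length)
        return True
    except struct.error:
        return False
```

A 2-byte field has 65,536 distinct values, so the largest length it can express is 65,535, far above the 250-byte limit of the text protocol.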
External engine support
Last year I experimentally reworked memcached's storage layer to make it pluggable.
http://alpha.mixi.co.jp/blog/?p=129
When Brian Aker of MySQL saw this modification, he sent the code to the memcached mailing list. The memcached developers were also very interested and put it on the roadmap. It is now being developed jointly by me and memcached developer Trond Norbye (specification design, implementation, and testing). The time difference is a big problem when collaborating with developers overseas, but thanks to a shared vision, we were finally able to publish a prototype of the pluggable architecture. The code repository can be accessed from the memcached download page.
The need for external engine support
There are many memcached-derived programs in the world. The reason is that people want to store data persistently and achieve data redundancy, even at the cost of some performance. Before I took part in developing memcached, reinventing memcached had also been considered in mixi's research and development department.
An external engine loading mechanism can encapsulate memcached's complex processing, such as networking and event handling. The current difficulty of combining memcached with a storage engine by brute force or reinvention therefore disappears, and trying out various engines becomes easy.
Simple API design is the key to success
The most important thing in this project is API design. With too many functions, engine developers would find it troublesome; with too much complexity, the barrier to implementing an engine would be too high. The initial version of the interface therefore has only 13 functions. The details are omitted here for reasons of space; the following just lists what an engine must do:
Engine information (version, etc.)
Engine initialization
Engine shutdown
Engine statistics
Test whether a given record can be stored, in terms of capacity
Allocate memory for the item (record) structure
Release the memory of an item (record)
Delete a record
Store a record
Retrieve a record
Update the timestamp of a record
Arithmetic operations
Flush the data
Readers interested in the detailed specification can check out the engine project's code and read engine.h.
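To give a feel for how small such an interface can be, here is a toy in-memory engine with 13 methods that loosely mirror the responsibilities listed above. Every name and signature here is hypothetical and chosen for illustration; none of them is taken from engine.h, and a lookup method is included so the example can be exercised.

```python
import time


class Item:
    """Hypothetical record structure passed between core and engine."""

    def __init__(self, key, nbytes):
        self.key = key
        self.value = bytearray(nbytes)
        self.last_used = time.monotonic()


class DictEngine:
    """Toy in-memory engine; all method names are illustrative only."""

    def get_info(self):
        return {"name": "dict-engine", "version": "0.1"}

    def initialize(self, max_items):
        self.max_items = max_items
        self.items = {}

    def shutdown(self):
        self.items.clear()

    def get_stats(self):
        return {"curr_items": len(self.items)}

    def can_store(self, key):
        # Capacity test: overwriting is always fine, new keys need room.
        return key in self.items or len(self.items) < self.max_items

    def allocate(self, key, nbytes):
        return Item(key, nbytes)

    def release(self, item):
        pass  # nothing to do: Python's garbage collector reclaims the item

    def store(self, item):
        if not self.can_store(item.key):
            return False
        self.items[item.key] = item
        return True

    def get(self, key):
        return self.items.get(key)

    def remove(self, key):
        return self.items.pop(key, None) is not None

    def touch(self, key):
        item = self.items.get(key)
        if item is not None:
            item.last_used = time.monotonic()

    def arithmetic(self, key, delta):
        # Treat the stored bytes as a decimal number and add delta to it.
        item = self.items[key]
        number = int(item.value.decode()) + delta
        item.value = bytearray(str(number).encode())
        return number

    def flush(self):
        self.items.clear()
```

A core server written against an interface like this never needs to know whether records live in a dictionary, on disk, or on another machine.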
Re-examining the current architecture
The difficulty in making memcached support external storage is that the code for networking and event handling (the core server) is closely tied to the code for in-memory storage. This is known as tight coupling. The in-memory storage code must be separated from the core server before external engines can be supported flexibly. Therefore, based on the API we designed, memcached was restructured as follows:
After the refactoring, we compared performance against version 1.2.5, the version with binary protocol support, and others, and confirmed that the change has no impact on performance.
When considering how to support loading external engines, the easiest approach would have been for memcached itself to handle concurrency control. For an engine, however, concurrency control is the very essence of performance, so we adopted a design that leaves multithreading support entirely to the engine.
Future improvements should broaden the range of uses for memcached.
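The following sketch illustrates what leaving concurrency control to the engine can look like: the calling threads, standing in for the core server, take no locks, and the engine protects its own state. The class and function names are hypothetical, and a single coarse lock is only one of many strategies an engine could choose.

```python
import threading


class LockingCounterEngine:
    """Hypothetical engine that does its own concurrency control."""

    def __init__(self):
        self._lock = threading.Lock()  # owned by the engine, not the core
        self._counters = {}

    def incr(self, key, delta=1):
        with self._lock:               # read-modify-write must be atomic
            value = self._counters.get(key, 0) + delta
            self._counters[key] = value
            return value

    def get(self, key):
        with self._lock:
            return self._counters.get(key)


def core_worker(engine, key, count):
    """Stand-in for a core server thread: calls the engine, takes no locks."""
    for _ in range(count):
        engine.incr(key)


def run(threads=4, per_thread=1000):
    engine = LockingCounterEngine()
    workers = [
        threading.Thread(target=core_worker, args=(engine, "hits", per_thread))
        for _ in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return engine.get("hits")
```

Because the locking lives inside the engine, a different engine could swap the single lock for finer-grained or lock-free schemes without any change to the callers.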
Summary
This article described how memcached handles timeouts and how it deletes data internally, and then introduced memcached's latest development directions, the binary protocol and external engine support. These features will not be available until version 1.3, so please look forward to them!
The next installment will cover practical know-how for using memcached and application compatibility.
Repost: memcached Complete Anatomy series, part 3: memcached's deletion mechanism and development direction