Memcache frequently asked questions and answers


What is the cache mechanism of memcached?

Memcached's main cache mechanism is LRU (least recently used) eviction plus timeout expiration. When you store data in memcached, you can specify how long it may stay in the cache: forever, or until some point in the future. If memcached runs out of memory, expired items are replaced first, then the least recently used items.
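The interplay of timeout expiration and LRU eviction can be sketched in a few lines of Python. This is only an illustration of the idea, not memcached's actual implementation (which works on slab-allocated memory, not a Python dict):

```python
import time
from collections import OrderedDict

class TinyLRUCache:
    """Toy cache: expired entries are evicted first, then the least recently used."""

    def __init__(self, max_items):
        self.max_items = max_items
        self.store = OrderedDict()  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = time.time() + ttl if ttl else None  # None means "forever"
        if key in self.store:
            del self.store[key]
        elif len(self.store) >= self.max_items:
            self._evict_one()
        self.store[key] = (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.time() >= expires_at:
            del self.store[key]        # lazily drop expired entries
            return None
        self.store.move_to_end(key)    # mark as recently used
        return value

    def _evict_one(self):
        # Prefer evicting an expired entry; otherwise drop the least recently used.
        now = time.time()
        for key, (_, expires_at) in self.store.items():
            if expires_at is not None and now >= expires_at:
                del self.store[key]
                return
        self.store.popitem(last=False)

cache = TinyLRUCache(max_items=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # touch "a", so "b" becomes the least recently used
cache.set("c", 3)  # cache is full: "b" is evicted
```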

How does memcached implement redundancy?
Not implemented! We are surprised by this question. Memcached is meant to be the application's cache layer; its design deliberately includes no redundancy mechanism. If a memcached node loses all of its data, you should be able to fetch it again from the data source (such as the database). You should be especially aware that your application must tolerate node failures. Don't write bad query code and hope memcached will guarantee everything! If you are worried that a node failure will greatly increase the load on the database, you can take some measures: for example, add more nodes (to reduce the impact of losing any one node), use hot-standby nodes (to take over the IP when another node goes down), and so on.

How does memcached handle fault tolerance?
It doesn't! :) When a memcached node fails, the cluster does not need to do any fault-tolerant processing. What to do about a failed node is entirely up to the user. When a node fails, here are a few options:

* Ignore it! There are plenty of other nodes to absorb the impact of the failed node until it is restored or replaced.

* Remove the failed node from the node list. Be careful with this operation! Under the default remainder (modulo) hashing algorithm, adding or removing a node on the client makes almost all cached data unavailable: because the node list the hash refers to has changed, most keys map to different nodes than they did before.

* Start a hot-standby node that takes over the IP of the failed node. This prevents the hash from being disrupted.

* If you want to add and remove nodes without disturbing the original hash results, use the consistent hashing algorithm. You can look up consistent hashing; clients that support it are mature and widely used. Give it a try!

* Rehash. When the client accesses data and finds a node down, it hashes again (with a different hash algorithm than before) and selects another node (note that the client does not remove the down node from the node list, so the next access may hash to it again). If a node flaps between up and down, rehashing is risky: stale (dirty) data may end up on both the healthy and the flaky node.
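The difference between the default remainder hashing and consistent hashing is easy to demonstrate. Here is a self-contained sketch with a hypothetical key set (illustrative only; real clients such as those with ketama support do this for you):

```python
import bisect
import hashlib

def h(s):
    """Hash a string to a large integer position."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

# --- Remainder (modulo) hashing: node index = hash(key) % node count ---
def modulo_node(key, nodes):
    return nodes[h(key) % len(nodes)]

# --- Consistent hashing: nodes and keys share one hash ring ---
class ConsistentHashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self.ring = []   # sorted virtual-node positions
        self.owner = {}  # position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.replicas):
            pos = h(f"{node}#{i}")
            bisect.insort(self.ring, pos)
            self.owner[pos] = node

    def remove_node(self, node):
        for i in range(self.replicas):
            pos = h(f"{node}#{i}")
            self.ring.remove(pos)
            del self.owner[pos]

    def get_node(self, key):
        # First virtual node clockwise from the key's position (with wraparound).
        idx = bisect.bisect(self.ring, h(key)) % len(self.ring)
        return self.owner[self.ring[idx]]

nodes = ["node0", "node1", "node2", "node3"]
keys = [f"user:{i}" for i in range(10000)]

# Remove one node under each scheme and count how many keys now map elsewhere.
mod_before = {k: modulo_node(k, nodes) for k in keys}
mod_moved = sum(1 for k in keys if modulo_node(k, nodes[:-1]) != mod_before[k])

ring = ConsistentHashRing(nodes)
ring_before = {k: ring.get_node(k) for k in keys}
ring.remove_node("node3")
ring_moved = sum(1 for k in keys if ring.get_node(k) != ring_before[k])

print(f"modulo:     {mod_moved / len(keys):.0%} of keys moved")   # roughly 75%
print(f"consistent: {ring_moved / len(keys):.0%} of keys moved")  # roughly 25%
```

With modulo hashing, removing one of four nodes remaps about three quarters of the keys; with consistent hashing, only the keys that lived on the removed node move.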

How do I export memcached items in bulk?

You shouldn't! Memcached is a non-blocking server. Any operation that could cause memcached to pause or momentarily refuse service deserves careful thought. Bulk-importing data into memcached is often not what you really want! Imagine: if the cached data changes between the export and the import, you have to deal with dirty data; and if the cached data expires between export and import, what do you do with it?

Therefore, bulk export and import is less useful than you might think. But it is very useful in one scenario: if you have a large amount of data that never changes and you want the cache to warm up quickly, bulk-importing cached data helps. Although this scenario is not typical, it happens often enough that we will consider implementing bulk export/import in the future.

Steven Grimm, as always, gave another good example on the mailing list: http://lists.danga.com/pipermail/memcached/2007-July/004802.html.

But I really do need to export and import memcached items in bulk. What should I do?

All right, all right. If you need bulk export/import, the most likely reason is that regenerating the cached data takes a long time, or that a bad database is making you suffer.

If losing a memcached node makes you miserable, you are in for plenty of other problems: your system is too fragile and needs optimization. For example, handle the "thundering herd" problem (a memcached node fails, and repeated queries overwhelm your database; this is covered elsewhere in this FAQ), or optimize bad queries. Remember, memcached is not an excuse to avoid optimizing your queries.

If your problem is simply that regenerating the cached data takes a long time (15 seconds to more than 5 minutes), you might consider relying on the database again. Here are a few tips:

* Use MogileFS (or similar software such as CouchDB) to store the items. Compute the items and dump them to disk. MogileFS can easily overwrite items and provide fast access to them. You can even cache the items from MogileFS in memcached to speed up reads. The MogileFS+memcached combination speeds up response times on cache misses and improves the site's availability.
* Reuse MySQL. InnoDB primary-key queries are fast. If most of the cached data fits in a VARCHAR field, primary-key query performance is even better. Querying memcached by key is almost equivalent to a MySQL primary-key query: hash the key to a 64-bit integer and store the data in MySQL. You can store the original (unhashed) key in a normal field and build a secondary index on it to speed up queries ... passively expire keys, bulk-delete invalid keys, and so on.
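One way to derive such a 64-bit integer key is to take the first 8 bytes of a digest. This is an illustrative sketch, not a prescribed scheme: any stable 64-bit hash works, and the table layout in the comment is hypothetical.

```python
import hashlib

def key_to_int64(key):
    """Hash an arbitrary cache key to an unsigned 64-bit integer,
    suitable for a BIGINT UNSIGNED primary key in MySQL."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

# A hypothetical table for this scheme might look like:
#   CREATE TABLE cache (
#     id BIGINT UNSIGNED PRIMARY KEY,  -- key_to_int64(original key)
#     k  VARCHAR(250),                 -- original key, with a secondary index
#     v  VARCHAR(8000)                 -- cached payload
#   );

print(key_to_int64("user:1234:profile"))
```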

Either of the above methods can be combined with memcached, and still provides good performance when memcached is restarted. Because you no longer need to worry about "hot" items being suddenly evicted by memcached's LRU algorithm, and users no longer have to wait several minutes for cached data to regenerate (when cached data suddenly disappears from memory), these methods can improve overall performance.

For details on these methods, see the blog: http://dormando.livejournal.com/495593.html.

How does memcached do authentication?
There is no authentication mechanism! Memcached is software that runs on the lower layers of an application (authentication should be the responsibility of the application's upper layers). The memcached client and server are lightweight, partly because no authentication mechanism is implemented at all. This way, memcached can create new connections quickly, and the server side needs no configuration.

If you want to restrict access, you can use a firewall, or have memcached listen on a UNIX domain socket.

What about memcached's multithreading? How do I use it?
Threads rule! Thanks to the efforts of Steven Grimm and Facebook, memcached 1.2 and later have a multithreaded model. Multithreaded mode lets memcached make full use of multiple CPUs and share all cached data between them. Memcached uses a simple locking mechanism to guarantee that data update operations are mutually exclusive. This handles multi-gets more efficiently than running multiple memcached instances on the same physical machine.

If your system's load is light, you may not need to enable multithreaded mode. If you run a large website on big hardware, you will see the benefit of multithreading.

For more information, see: http://code.sixapart.com/svn/memcached/trunk/server/doc/threads.txt.

Briefly: command parsing (where memcached spends most of its time) can run multithreaded. Memcached's internal operations on data are protected by a number of global locks (so that part of the work is not multithreaded). Future improvements to multithreaded mode will remove many of the global locks and improve memcached's performance under extremely heavy load.

What is the maximum key length memcached accepts?
The maximum key length is 250 characters. Note that 250 is an internal limit of the memcached server; if you use a client that supports "key prefixes" or similar features, the effective maximum key length (prefix + original key) can exceed 250 characters. We recommend shorter keys, since they save memory and bandwidth.

Does memcached limit the expiration time of an item?
A relative expiration time can be at most 30 days. Memcached interprets the passed-in expiration time (a duration) as a point in time: once that point is reached, memcached puts the item into an expired state. This is a simple but sometimes confusing mechanism.
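The 30-day limit comes from how the server interprets the expiration value: anything up to 30 days (2,592,000 seconds) is treated as a duration relative to now, and anything larger is treated as an absolute Unix timestamp. A sketch of that interpretation:

```python
import time

THIRTY_DAYS = 60 * 60 * 24 * 30  # 2,592,000 seconds

def expiry_timestamp(exptime, now=None):
    """Interpret an expiration value the way the memcached server does:
    0 = never expires, <= 30 days = relative seconds from now,
    > 30 days = absolute Unix timestamp."""
    now = time.time() if now is None else now
    if exptime == 0:
        return None               # never expires
    if exptime <= THIRTY_DAYS:
        return now + exptime      # relative duration
    return exptime                # absolute point in time

now = 1_700_000_000
print(expiry_timestamp(0, now))              # never expires
print(expiry_timestamp(3600, now))           # one hour from now
print(expiry_timestamp(1_800_000_000, now))  # taken as a literal timestamp
```

So passing, say, 45 days in seconds does not mean "45 days from now": it is read as a Unix timestamp far in the past, and the item expires immediately.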

How large can a single item stored in memcached be?
1 MB. If your data is larger than 1 MB, consider compressing it on the client or splitting it across multiple keys.
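Splitting across multiple keys can be done entirely on the client side. A minimal sketch with hypothetical helper functions (a real client would store each entry with a separate set call and fetch the parts back with a multi-get):

```python
def split_value(key, value, chunk_size=1000 * 1000):
    """Split a large bytes value into chunks that each fit comfortably
    under memcached's 1 MB item limit; returns key -> chunk mappings
    plus a manifest entry recording the part count."""
    chunks = [value[i:i + chunk_size] for i in range(0, len(value), chunk_size)]
    items = {f"{key}:part:{i}": chunk for i, chunk in enumerate(chunks)}
    items[key] = str(len(chunks))  # manifest: how many parts to fetch back
    return items

def join_value(key, items):
    """Reassemble a value previously stored with split_value()."""
    n = int(items[key])
    return b"".join(items[f"{key}:part:{i}"] for i in range(n))

big = b"x" * 2_500_000  # a 2.5 MB payload, over the 1 MB limit
stored = split_value("report:2024", big)
```

The chunk size is kept a little under 1 MB to leave headroom for key and item overhead. Note that with this scheme a single LRU eviction of any one part invalidates the whole value, so the manifest read must tolerate missing parts in real use.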

Why is the size of a single item limited to 1 MB?
Ah ... this question gets asked a lot!

Short answer: because that is how the memory allocator's algorithm works.

Detailed answer: memcached's memory storage engine (the engine will become pluggable ...) uses slabs to manage memory. Memory is divided into slabs and chunks: first into slabs of equal size, then each slab is divided into chunks of equal size, with different slab classes having different chunk sizes. Chunk size starts at a minimum value and grows by a factor until it reaches the maximum possible value.

If the minimum is 400 B, the maximum is 1 MB, and the growth factor is 1.20, the chunk sizes per slab class are: slab1 400 B, slab2 480 B, slab3 576 B ...

The larger the chunks in a slab class, the larger the gap between it and the previous class. So the larger the maximum value, the lower the memory utilization. Memcached must pre-allocate memory for each slab, so setting a smaller factor and a larger maximum value requires more memory.
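The chunk-size progression is easy to compute. A simplified sketch (real memcached also aligns chunk sizes to byte boundaries, which is ignored here):

```python
def chunk_sizes(minimum=400, maximum=1024 * 1024, factor=1.20):
    """Chunk size per slab class: start at the minimum and grow by the
    factor until the maximum (the largest storable item) is reached."""
    sizes = []
    size = float(minimum)
    while size < maximum:
        sizes.append(round(size))
        size *= factor
    sizes.append(maximum)
    return sizes

sizes = chunk_sizes()
print(sizes[:3])   # [400, 480, 576], matching the slab1/slab2/slab3 example
print(len(sizes))  # number of slab classes
```

An item is stored in the smallest chunk it fits into, so a 590 B item lands in a 691 B chunk and wastes the difference; the coarser the progression near the maximum, the more such waste.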

There are other reasons not to push large data into memcached this way ... Don't try to stuff huge pages into memcached. Loading and unpacking such a large data structure into memory takes a long time, and your site's performance will suffer.

If you really do need to store data larger than 1 MB, you can modify the value of POWER_BLOCK in slabs.c and recompile memcached, or use the inefficient malloc/free backend. Other suggestions include using a database, MogileFS, and so on.

Can I use cache space of varying sizes on different memcached nodes? Will memcached use memory more effectively if I do?
The memcached client decides which node a key is stored on purely by the hashing algorithm, regardless of the node's memory size. So yes, you can use caches of varying sizes on different nodes. But this is usually done as follows: run multiple memcached instances on the nodes with more memory, with each instance using the same amount of memory as the instances on the other nodes.

What is the binary protocol, and should I pay attention to it?

The best information about the binary protocol is of course the binary protocol specification: http://code.google.com/p/memcached/wiki/MemcacheBinaryProtocol.

The binary protocol tries to provide a more efficient and reliable protocol for both ends, reducing the CPU time the client and server spend on protocol handling.
According to Facebook's tests, parsing the ASCII protocol is the most CPU-intensive part of memcached. So why don't we improve the ASCII protocol instead?

Some old information can be found in the thread of this mailing list: http://lists.danga.com/pipermail/memcached/2007-July/004636.html.

How does the memcached memory allocator work? Why not just use malloc/free!? Why use slabs?
Actually, this is a compile-time option. The internal slab allocator is used by default, and you really should use it. Early on, memcached used only malloc/free to manage memory, but this approach did not cooperate well with the OS's memory management: repeated malloc/free caused memory fragmentation, and the OS ended up spending a lot of time searching for contiguous memory blocks to satisfy malloc requests instead of running the memcached process. If you disagree, you are of course free to use malloc! Just don't complain on the mailing list :)

The slab allocator was born to solve this problem. Memory is allocated once and divided into chunks, which are reused from then on. Because memory is divided into slab classes of different sizes, some memory is wasted if an item's size is not a good fit for the slab chosen to store it. Steven Grimm has already made effective improvements in this area.

There are some mailing-list threads about improvements to slabs (powers of N versus powers of 2) and the trade-offs involved: http://lists.danga.com/pipermail/memcached/2006-May/002163.html and http://lists.danga.com/pipermail/memcached/2007-March/003753.html.

If you want to use malloc/free and see how they behave, you can define USE_SYSTEM_MALLOC during the build. This feature is not well tested, so it is unlikely to get developer support.

More information: http://code.sixapart.com/svn/memcached/trunk/server/doc/memory_management.txt.

Is memcached atomic?
Of course! Well, let's be precise:
All individual commands sent to memcached are completely atomic. If you issue a set and a get against the same data at the same time, they do not affect each other; they are serialized and executed one after the other. Even in multithreaded mode, every command is atomic, unless the program has a bug :)
Command sequences are not atomic. If you fetch an item with get, modify it, and then want to set it back into memcached, there is no guarantee that the item has not been manipulated by another process in the meantime (process in the general sense, not necessarily an operating-system process). Under concurrency you may overwrite an item that another process has just set.

Memcached 1.2.5 and later provide the gets and cas commands, which solve the problem above. If you query an item with gets, memcached returns a unique identifier for the item's current value. If you modify the item and want to write it back, you send that unique identifier to memcached along with the cas command. If the item's identifier in memcached matches the one you supplied, your write succeeds. If the item was modified by another process in the meantime, its identifier in memcached will have changed, and your write will fail.
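The gets/cas handshake can be illustrated with a tiny in-memory store that mimics the server's bookkeeping. This is a simulation of the protocol's semantics, not a real client:

```python
import itertools

class CasStore:
    """Toy store mimicking memcached's gets/cas semantics."""

    def __init__(self):
        self._data = {}                      # key -> (value, cas_id)
        self._next_cas = itertools.count(1)  # server-side unique id counter

    def set(self, key, value):
        self._data[key] = (value, next(self._next_cas))

    def gets(self, key):
        """Return (value, cas_id) so the caller can attempt a cas later."""
        return self._data[key]

    def cas(self, key, value, cas_id):
        """Store only if the item is unchanged since the matching gets."""
        if self._data[key][1] != cas_id:
            return False                     # someone else modified it
        self._data[key] = (value, next(self._next_cas))
        return True

store = CasStore()
store.set("counter", 10)

value, token = store.gets("counter")  # client A reads value and identifier
store.set("counter", 99)              # client B sneaks in a write
ok = store.cas("counter", value + 1, token)
print(ok)  # False: client A's token is stale, nothing is overwritten
```

Client A then re-reads with gets and retries the cas; real clients typically wrap this read-modify-cas cycle in a retry loop.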

Modifying an item based on its current value in memcached is often tricky. Don't do it unless you know exactly what you are doing.
