How memcached works and common questions

Source: Internet
Author: User

How does memcached work?

The magic of memcached comes from a two-stage hash. Memcached is like a huge hash table that stores many <key, value> pairs. Given a key, you can store or retrieve arbitrary data.

A client can spread its data across more than one memcached node. To look up an item, the client first computes a hash of the key (phase one) to select a node, then sends the request to that node. The memcached node then uses its own internal hash table (phase two) to find the actual data (the item).

As an example, suppose there are three clients (1, 2, 3) and three memcached nodes (A, B, C). Client 1 wants to store the data "barbaz" under the key "foo". Client 1 first consults the node list (A, B, C) and computes the hash of the key "foo"; assume memcached B is selected. Client 1 then connects directly to memcached B and stores "barbaz" under the key "foo". Client 2 uses the same client library (so its phase-one hashing algorithm is the same) and has the same memcached list (A, B, C). After the same hash calculation (phase one), Client 2 therefore also maps the key "foo" to memcached B, and it asks memcached B directly for the data "barbaz". Different clients store data in memcached in different formats (Perl Storable, PHP serialize, Java hibernate, JSON, etc.), and some client implementations do not use the same hashing algorithm. The memcached server-side behavior, however, is always consistent.
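The two phases can be sketched in a few lines of Python. This is only an illustration: `FakeNode` and `Client` are hypothetical names, the nodes are in-process dictionaries rather than real memcached servers, and CRC32 with a remainder is an assumed phase-one hash, not the algorithm of any particular client library.

```python
import zlib


class FakeNode:
    """Stand-in for one memcached server (hypothetical, in-process only)."""

    def __init__(self, name):
        self.name = name
        self.items = {}  # phase two: the node's own internal hash table

    def set(self, key, value):
        self.items[key] = value

    def get(self, key):
        return self.items.get(key)


class Client:
    def __init__(self, nodes):
        # Every client must hold the same node list in the same order.
        self.nodes = list(nodes)

    def pick_node(self, key):
        # Phase one: hash the key on the client side and map it to a node.
        # CRC32 + remainder is an assumption made for this sketch.
        h = zlib.crc32(key.encode("utf-8"))
        return self.nodes[h % len(self.nodes)]

    def set(self, key, value):
        self.pick_node(key).set(key, value)

    def get(self, key):
        return self.pick_node(key).get(key)


nodes = [FakeNode("A"), FakeNode("B"), FakeNode("C")]
client1 = Client(nodes)
client2 = Client(nodes)

client1.set("foo", "barbaz")
print(client2.get("foo"))  # barbaz
```

Because both clients share the hash function and the node list, Client 2 finds the item without any coordination between the nodes themselves.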

Finally, from an implementation point of view, memcached is a non-blocking, event-driven server. This architecture handles the C10K problem well and scales very well.

How does memcached implement redundancy?

It doesn't! We are surprised by this question. Memcached is meant to be the cache layer of your application, and its design includes no redundancy at all. If a memcached node loses all of its data, you should be able to fetch the data again from the data source (such as the database). Above all, your application should tolerate node failures. Do not write bad queries and expect memcached to save you! If you are worried that a node failure will greatly increase the load on the database, there are measures you can take. For example, you can add more nodes (to reduce the impact of losing any one of them) or keep hot spare nodes (which take over the IP address when another node goes down), and so on.
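A minimal sketch of an application that tolerates losing cached data might look like the following. The `cache` and `database` dictionaries are stand-ins (assumptions for illustration) for a memcached client and the real data source; the point is only that every read can fall back to the source.

```python
cache = {}                      # stand-in for a memcached client
database = {"user:1": "alice"}  # stand-in for the authoritative data source
db_reads = 0


def load_from_database(key):
    global db_reads
    db_reads += 1
    return database.get(key)


def get(key):
    value = cache.get(key)
    if value is None:           # a miss, or the node lost its data
        value = load_from_database(key)
        if value is not None:
            cache[key] = value  # repopulate the cache for later readers
    return value


get("user:1")   # miss: read from the database
get("user:1")   # hit: served from the cache
cache.clear()   # simulate a node losing all of its data
print(get("user:1"), db_reads)  # alice 2
```

The cost of the failure is one extra database read per key, which is why the number of nodes (and how much data each one holds) matters.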

How does memcached handle fault tolerance?

It doesn't! :) When a memcached node fails, the cluster does no fault-tolerance processing of any kind. What to do about a failed node is entirely up to the user. When a node fails, here are a few options to choose from:

* Ignore it! Until the failed node is restored or replaced, there are plenty of other nodes to absorb the effect of the failure.

* Remove the failed node from the node list. Be careful with this! By default (remainder hashing), adding or removing a node makes almost all cached data unavailable: because the node list used by the hash has changed, most keys map to a different node than before.

* Start a hot standby node and have it take over the IP address of the failed node. This prevents hashing chaos.

* If you want to add and remove nodes without disturbing the existing key-to-node mapping, use consistent hashing. You can search online for descriptions of the algorithm. Clients that support consistent hashing are mature and widely used. Go and try it!

* Rehashing. When the client finds that a node is down while accessing data, it hashes the key again (with a different hash function from the first one) and picks another node. Note that the client does not remove the down node from its node list, so a later request may hash to it again. If a node flaps between up and down, rehashing is risky: stale data may end up on both the flapping node and the healthy ones.
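To see the difference between remainder hashing and consistent hashing, here is a small experiment. It is a sketch under stated assumptions: the hash is derived from MD5, the ring uses 100 points per node, and the `Ring` class is a hypothetical minimal implementation, not the one used by any real client library.

```python
import bisect
import hashlib


def h(s):
    # 64-bit hash taken from MD5; chosen here only for a stable, even spread.
    return int.from_bytes(hashlib.md5(s.encode("utf-8")).digest()[:8], "big")


def modulo_node(key, nodes):
    # Remainder hashing: the result depends on the length of the node list.
    return nodes[h(key) % len(nodes)]


class Ring:
    def __init__(self, nodes, replicas=100):
        # Each node gets `replicas` points on the ring to even out the load.
        self.points = sorted(
            (h(f"{n}#{i}"), n) for n in nodes for i in range(replicas)
        )
        self.hashes = [p[0] for p in self.points]

    def node(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        i = bisect.bisect(self.hashes, h(key)) % len(self.points)
        return self.points[i][1]


keys = [f"key{i}" for i in range(10000)]
before, after = ["A", "B", "C", "D"], ["A", "B", "C"]  # node D fails

moved_mod = sum(modulo_node(k, before) != modulo_node(k, after) for k in keys)
r1, r2 = Ring(before), Ring(after)
moved_ring = sum(r1.node(k) != r2.node(k) for k in keys)

print(f"remainder hashing moved {moved_mod / len(keys):.0%} of keys")
print(f"consistent hashing moved {moved_ring / len(keys):.0%} of keys")
```

With the ring, only the keys that lived on the removed node move; keys on the surviving nodes keep their mapping. With remainder hashing, most keys move even though three of the four nodes are still healthy.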

How do I export memcached items in bulk?

You should not do this! Memcached is a non-blocking server. Any operation that could make memcached pause or briefly deny service deserves careful thought. Bulk-importing data into memcached is often not what you really want. Imagine: if the cached data changes between the export and the import, you have to deal with stale data; and if the cached data expires between the export and the import, what do you do with it?

So bulk export and import are less useful than you might think. There is one scenario where they are very useful, though: if you have a large amount of data that never changes and you want the cache to warm up quickly, bulk-importing the cached data helps. Although this scenario is not typical, it comes up often, so we will consider implementing bulk export and import in the future.

Steven Grimm, as always, gives another good example on the mailing list.

But I really do need to export and import memcached items in bulk. What should I do?

All right, all right. If you need bulk export and import, the most likely reason is that regenerating the cached data takes a long time, or that your database is giving you a hard time.

If losing one memcached node makes you miserable, you are going to run into plenty of other problems as well. Your system is too fragile, and you need to do some optimization work, such as handling the "thundering herd" problem (for example, a memcached node fails and the resulting flood of repeated queries overwhelms your database; this is covered in another FAQ entry) or fixing bad queries. Remember, memcached is not an excuse to avoid optimizing your queries.
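One common way to soften the thundering herd is to let only one caller regenerate a missing item while the others wait for it. The sketch below does this inside a single process with per-key locks; `expensive_query` is a hypothetical stand-in for a slow database query, and a deployment with several application servers would need a shared lock instead (this sketch does not cover that).

```python
import threading
import time

cache = {}
cache_lock = threading.Lock()
key_locks = {}
regenerations = 0


def expensive_query(key):
    global regenerations
    regenerations += 1
    time.sleep(0.05)  # pretend this is a slow database query
    return f"value-for-{key}"


def get(key):
    value = cache.get(key)
    if value is not None:
        return value
    with cache_lock:  # find or create the lock for this key
        lock = key_locks.setdefault(key, threading.Lock())
    with lock:  # only one thread regenerates a given key at a time
        value = cache.get(key)  # re-check: another thread may have filled it
        if value is None:
            value = expensive_query(key)
            cache[key] = value
    return value


threads = [threading.Thread(target=get, args=("report",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(regenerations)  # 1
```

Twenty concurrent readers miss the cache, but the database sees a single query instead of twenty.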

If your only problem is that regenerating the cached data takes a long time (15 seconds to more than 5 minutes), consider going back to a database. Here are a few tips:

* Use MogileFS (or similar software such as CouchDB) to store the items. Compute the item and dump it to disk. MogileFS makes it easy to overwrite items and provides fast access. You can even cache the items stored in MogileFS in memcached to speed up reads. The combination of MogileFS and memcached speeds up responses on cache misses and improves the availability of the site.

* Go back to MySQL. Primary key lookups in MySQL's InnoDB are fast. If most of the cached data fits in a VARCHAR column, primary key lookups perform even better. A lookup by key in memcached is roughly equivalent to a primary key lookup in MySQL: hash the key to a 64-bit integer and store the data in MySQL. You can keep the original (unhashed) key in an ordinary column and build secondary indexes to speed up queries, passive key expiration, bulk deletion of invalid keys, and so on.

All of the above can be placed behind memcached and still give good performance when memcached is restarted. Because you no longer have to worry about "hot" items being suddenly evicted by memcached's LRU algorithm, users no longer wait several minutes for cached data to be regenerated when it suddenly disappears from memory, so these methods improve overall performance.
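The MySQL suggestion above can be sketched as follows. To keep the example runnable without a server, it uses Python's built-in `sqlite3` as a stand-in for MySQL; the table name, column names, the choice of SHA-1 for the 64-bit hash, and the index on the expiry column are all assumptions made for illustration.

```python
import hashlib
import sqlite3
import time


def key_hash(key):
    # First 8 bytes of SHA-1 as a signed 64-bit integer (fits a BIGINT column).
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE cache_items (
        key_hash   INTEGER PRIMARY KEY,  -- hashed key, looked up by primary key
        orig_key   TEXT NOT NULL,        -- original (unhashed) key
        value      TEXT NOT NULL,
        expires_at REAL NOT NULL
    )""")
# Secondary index so that bulk deletion of expired rows is cheap.
db.execute("CREATE INDEX idx_expires ON cache_items (expires_at)")


def put(key, value, ttl):
    db.execute(
        "INSERT OR REPLACE INTO cache_items VALUES (?, ?, ?, ?)",
        (key_hash(key), key, value, time.time() + ttl),
    )


def get(key):
    row = db.execute(
        "SELECT orig_key, value, expires_at FROM cache_items WHERE key_hash = ?",
        (key_hash(key),),
    ).fetchone()
    # Passive expiry: stale rows count as misses. The orig_key comparison
    # guards against two different keys hashing to the same integer.
    if row is None or row[0] != key or row[2] < time.time():
        return None
    return row[1]


def purge_expired():
    db.execute("DELETE FROM cache_items WHERE expires_at < ?", (time.time(),))


put("report:2024", "rendered-html", ttl=300)
put("old", "stale", ttl=-1)  # already expired
print(get("report:2024"), get("old"))  # rendered-html None
purge_expired()
```

Items stored this way survive a memcached restart, so a cache miss costs a primary key lookup rather than minutes of regeneration.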

How does memcached authenticate?

There is no authentication mechanism! Memcached is software that runs at a lower layer of the application; authentication is the responsibility of the layers above it. The memcached client and server are lightweight partly because no authentication is implemented at all. As a result, memcached can create new connections very quickly and needs no configuration on the server side.

If you want to restrict access, you can use a firewall, or have memcached listen on a UNIX domain socket.

What is memcached's multithreading? How do I use it?

Threads rule! Thanks to the work of Steven Grimm and Facebook, memcached 1.2 and later have a multithreaded mode. Multithreaded mode lets memcached take full advantage of multiple CPUs and share all cached data between them. Memcached uses a simple locking mechanism to guarantee mutual exclusion when data is updated. Compared with running multiple memcached instances on the same physical machine, this handles multi-gets more efficiently.

If your system load is light, you may not need multithreaded mode. If you run a large website on large hardware, you will see the benefits of multithreading.
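The idea of several threads sharing one table under a simple lock can be illustrated like this. It is a toy model, not memcached's actual implementation: `SharedCache` is a hypothetical class, and a single coarse lock is assumed for brevity.

```python
import threading


class SharedCache:
    """One table shared by all worker threads (toy model)."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()  # one coarse lock, assumed for brevity

    def incr(self, key, delta=1):
        with self._lock:  # the read-modify-write must not interleave
            self._items[key] = self._items.get(key, 0) + delta
            return self._items[key]

    def get(self, key):
        with self._lock:
            return self._items.get(key)


cache = SharedCache()


def worker():
    for _ in range(1000):
        cache.incr("hits")


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(cache.get("hits"))  # 4000
```

All four workers see the same data, which is what separate memcached instances on one machine cannot offer: each instance would hold only its own share of the keys.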

Excerpted from: http://www.educity.cn/net/1620395.html

For operations and maintenance, see: Http://www.rootop.org/pages/category/memcached

Software compatible with memcached

(1) repcached

- A patch that adds replication to memcached

- Single master, single slave, with each acting as master and backup for the other

(2) flared

- Stores data in QDBM. Also implements asynchronous replication and failover.

(3) memagent

- Connects to multiple memcached servers and provides consistent hashing and request forwarding

(4) memcachedb

- Stores data in Berkeley DB
