Into the world of caches (iii)-Memcache


Series Articles
    • Into the world of caches (i)-opening
    • Into the world of caches (ii)-Cache design
    • Into the world of caches (iii)-Memcache

Brief introduction

Memcache is a high-performance distributed memory object caching system for dynamic Web applications to mitigate database load. It is based on a hashmap that stores key/value pairs, increasing the speed of dynamic, database-driven Web sites by caching data and objects in memory to reduce the number of times the database is read.

It works by reserving a region of memory, building a hashtable inside it and managing that memory itself, and serving requests over non-blocking network IO.
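
The typical usage described above is the cache-aside pattern: try the cache, fall back to the database on a miss, then populate the cache. A minimal sketch (a plain dict stands in for a memcache client, and `load_from_db` is a hypothetical stand-in for a database query):

```python
# Minimal cache-aside sketch: a dict stands in for a memcache client.
cache = {}

def load_from_db(key):
    # Hypothetical stand-in for a slow database query.
    return f"row-for-{key}"

def get_user(key):
    value = cache.get(key)      # 1. try the cache first
    if value is None:           # 2. cache miss: fall back to the database
        value = load_from_db(key)
        cache[key] = value      # 3. populate the cache for next time
    return value

print(get_user("42"))  # first call misses and fills the cache
print(get_user("42"))  # second call is served from memory
```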

For more detailed information see Memcache official website: http://www.danga.com/memcached

Memcache Implementation principle

Memcache stores all of its data in memory, which has the following consequences:

    • Access is faster than a traditional relational database, which must persist data through slow file IO
    • Data is volatile: everything is lost on a power failure or restart
    • Capacity is bounded by the machine: a 32-bit process can use at most 2GB of memory, while on a 64-bit machine the limit is effectively the physical memory available

For an in-memory cache, the most important design decision is memory allocation; memcache allocates memory in fixed-size blocks (slab allocation).

Slab allocation involves four concepts — slab_class, slab, page and chunk — related as follows:

    1. Memcache divides its memory space into a set of slabs
    2. Each slab holds a number of pages; each page is 1MB by default, so a slab occupying 100MB of memory holds 100 pages
    3. Each page contains a set of chunks; a chunk is where data is actually stored, and all chunks within the same slab are the same fixed size
    4. Slabs whose chunks are the same size are grouped together into a slab_class

The number of slab classes in memcache's allocator is limited — a few, a dozen, or a few dozen, depending on the startup parameters.

Which slab a value is stored in is determined by the value's size: a value always goes into the slab whose chunk size is the closest fit. For example, if slab[1] has 80-byte chunks, slab[2] has 100-byte chunks, and slab[3] has 128-byte chunks (chunk sizes in adjacent slabs grow by a factor of 1.25 by default, which can be changed with the -f startup option), then an 88-byte value is placed in slab[2].
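
The chunk-size ladder and the slab-selection rule can be sketched as follows. This is a simplified model, assuming an 80-byte base chunk and the default 1.25 growth factor; real memcached also rounds sizes up for alignment, which is why the text shows 128 rather than the raw 125:

```python
PAGE_SIZE = 1024 * 1024  # 1 MB

def chunk_sizes(base=80, factor=1.25, page_size=PAGE_SIZE):
    """Generate the chunk size of each slab class, growing by `factor`."""
    sizes = []
    size = base
    while size <= page_size:
        sizes.append(int(size))
        size *= factor
    return sizes

def pick_slab(value_len, sizes):
    """A value goes into the first slab class whose chunk is large enough."""
    for i, size in enumerate(sizes):
        if value_len <= size:
            return i
    raise ValueError("value larger than a page cannot be stored")

sizes = chunk_sizes()
print(sizes[:3])             # [80, 100, 125]
print(pick_slab(88, sizes))  # 1 -> the 100-byte class (slab[2] in 1-based numbering)
```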

When a value is stored into a slab, the slab first requests memory, and memory is requested one page at a time: when the first piece of data arrives, a 1MB page is assigned to the slab regardless of the data's size. After receiving the page, the slab slices it by chunk size into an array of chunks, and one chunk from that array is selected to store the data.
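
Since a page is always 1MB, the number of chunks a slab gets per page follows directly from its chunk size. A back-of-the-envelope calculation (real memcached also reserves a small per-item header inside each chunk, so usable payload is slightly smaller):

```python
PAGE_SIZE = 1024 * 1024  # every page is 1 MB

def chunks_per_page(chunk_size):
    # A page is sliced into as many fixed-size chunks as fit; any
    # remainder at the end of the page is simply unused.
    return PAGE_SIZE // chunk_size

print(chunks_per_page(128))  # 8192 chunks of 128 bytes per page
print(chunks_per_page(100))  # 10485 chunks, with 76 bytes left over
```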

What if no free chunk is left in the slab? Unless memcache was started with -M (which disables LRU eviction; in that case an out-of-memory error is reported when memory runs out), memcache evicts the least recently used chunk in that slab and stores the new data in its place. Three points summarize memcache's memory allocation and reclamation:

    1. Chunk allocation wastes memory: an 88-byte value placed in a 100-byte chunk wastes 12 bytes — the price paid for avoiding memory fragmentation
    2. Memcache's LRU is not global; it operates per slab, so eviction candidates are evaluated within a single slab
    3. The size of a stored value is limited: when new data arrives, the slab requests memory one page at a time, and a page is only 1MB, so a value cannot exceed 1MB

Features and limitations of memcache

Some of the above have been mentioned, here again to summarize:

    1. There is no limit to the number of items memcache can hold, as long as memory is sufficient
    2. On a 32-bit machine a single memcache process can use at most 2GB of memory; on a 64-bit machine there is no such limit
    3. A key can be at most 250 bytes; longer keys cannot be stored
    4. A single item can be at most 1MB; larger data cannot be stored
    5. Data on a memcache server is not secure: anyone who knows the server's address can telnet to it and run flush_all, immediately invalidating every existing key/value pair
    6. It is not possible to traverse all items in memcache, because such an operation is comparatively slow and blocks other operations
    7. Memcache's high performance stems from a two-stage hash: in the first stage, the client hashes the key to choose a server node; in the second stage, the chosen server uses its internal hashtable to find the actual item and return it to the client
    8. In implementation terms, memcache is a non-blocking, event-based server program

How memcached is distributed: memcached servers do not communicate with each other; data is spread across the servers by the client's distribution algorithm.

Memcache is called a "distributed cache", but memcache itself has no distribution capability: the nodes of a memcache cluster do not communicate with one another (unlike, say, JBoss Cache, where a server that updates its cached data notifies the other machines in the cluster to update or clear theirs). The so-called "distribution" is implemented entirely by the client program, as in the write process described below.

A single cache write in memcache proceeds as follows:

    1. The application submits the data to be cached
    2. The API passes the key to the routing algorithm module, which computes a server number from the key and the memcache cluster's server list
    3. The server number is mapped to that server's IP address and port
    4. The API's communication module talks to the selected server and writes the data to it, completing one distributed cache write
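
The four steps can be sketched with a simple modulo routing algorithm. This is an illustration only — real clients typically use consistent hashing, and the `servers` list and function names here are made up:

```python
import hashlib

# Hypothetical cluster list: (host, port) of each memcached node.
servers = [("10.0.0.1", 11211), ("10.0.0.2", 11211), ("10.0.0.3", 11211)]

def route(key, servers):
    """Step 2: hash the key to a server number."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % len(servers)

def write_cache(key, value, servers, connections):
    """Steps 3-4: look up the node by number and write to it."""
    index = route(key, servers)
    host, port = servers[index]              # step 3: number -> address
    connections[(host, port)][key] = value   # step 4: send the set command
    return index

# dicts stand in for the network connections to each node
connections = {addr: {} for addr in servers}
index = write_cache("user:42", "alice", servers, connections)
print(index, connections[servers[index]])
```

A read runs the same `route()` over the same server list, so it always lands on the node that holds the key — which is exactly why reads and writes must share the routing algorithm.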

Reading works the same way as writing: as long as the same routing algorithm and server list are used, a query for the same key always reaches the same server, and if that server still holds the data, a cache hit is guaranteed.

This clustering style also provides a degree of partition tolerance: if Node2 goes down, the data stored on Node2 becomes unavailable, but Node0 and Node1 keep serving. The next request for a key that lived on Node2 necessarily misses; the data is then fetched from the database, the routing algorithm picks a node among Node0 and Node1, and the data is cached there, so subsequent requests hit again. This approach works well, but the cost of the misses after a node failure is relatively high.

It is also important to note how the expiry parameter of a set is interpreted: an expiry of 0 means the key/value never expires by TTL (though it can still be evicted by LRU), and any expiry larger than 30 days is treated as an absolute unix timestamp rather than a relative offset. See the realtime() function in memcached.c:

#define REALTIME_MAXDELTA 60*60*24*30

static rel_time_t realtime(const time_t exptime) {
    if (exptime == 0)
        return 0;                      /* 0 means never expire */
    if (exptime > REALTIME_MAXDELTA) { /* treated as an absolute timestamp */
        if (exptime <= process_started)
            return (rel_time_t)1;      /* already in the past: expire at once */
        return (rel_time_t)(exptime - process_started);
    } else {
        return (rel_time_t)(exptime + current_time);
    }
}

This 30-day cutoff is hard-coded in the memcached source; developers cannot change it without recompiling.

TTL Expiration policy

Data expiration is handled by lazy expiration plus LRU. This is similar to Redis's approach, and is presumably chosen to avoid the cost of actively tracking expiry times in a sorted structure such as a red-black tree.

1. Lazy Expiration

Memcached does not actively monitor whether records have expired; instead, it checks a record's timestamp at get time to decide whether it is stale. This technique is called lazy expiration. As a result, memcached spends no CPU time on expiry monitoring.
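
Lazy expiration can be sketched as: store an absolute expiry time with each value and check it only on get. This is a toy model, not memcached's implementation; `now` is passed explicitly to keep the example deterministic:

```python
import time

store = {}

def set_item(key, value, ttl, now=None):
    now = time.time() if now is None else now
    # ttl of 0 means "no TTL"; otherwise remember the absolute expiry time.
    expires_at = None if ttl == 0 else now + ttl
    store[key] = (value, expires_at)

def get_item(key, now=None):
    now = time.time() if now is None else now
    entry = store.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if expires_at is not None and now >= expires_at:
        del store[key]   # expired: reclaimed lazily, only when touched
        return None
    return value

set_item("k", "v", ttl=10, now=1000)
print(get_item("k", now=1005))  # "v" -- still fresh
print(get_item("k", now=1011))  # None -- checked and dropped on access
```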

2. LRU

Memcached preferentially reuses the space of records that have already expired, but even so there may not be enough room when a new record is appended; in that case space is reclaimed with a Least Recently Used (LRU) mechanism. When memcached cannot obtain new space from the slab class, it finds the record that has gone unused the longest and hands its space to the new record.
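
As noted earlier, this LRU operates per slab class, not globally. A toy model of one slab class's LRU using an OrderedDict (all names here are illustrative, not memcached internals):

```python
from collections import OrderedDict

class Slab:
    """Toy slab class: a fixed number of chunks, LRU eviction within it."""
    def __init__(self, max_chunks):
        self.items = OrderedDict()  # key -> value, oldest first
        self.max_chunks = max_chunks

    def get(self, key):
        value = self.items.get(key)
        if value is not None:
            self.items.move_to_end(key)  # mark as recently used
        return value

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.max_chunks:
            self.items.popitem(last=False)  # no free chunk: evict LRU item
        self.items[key] = value

slab = Slab(max_chunks=2)
slab.set("a", 1)
slab.set("b", 2)
slab.get("a")         # touch "a", so "b" becomes least recently used
slab.set("c", 3)      # slab full: "b" is evicted, not "a"
print(list(slab.items))  # ['a', 'c']
```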

Memcache's Safety

As shown above, a memcache server is operated directly over client connections with no authentication, so exposing one directly to the Internet is dangerous: at best, data is leaked to unrelated parties; at worst, the server is compromised, especially if memcached runs with root privileges, and there may also be unknown bugs or buffer-overflow vulnerabilities. For safety, here are two suggestions that go some way toward preventing intrusion or data leakage.

Intranet access

It is best to keep the traffic between the Web server and the memcache server on the intranet. Such servers commonly have two network cards, one facing the Internet and one facing the intranet; let the Web server reach the memcache server through the intranet card, and when starting memcached, bind it to the intranet IP address and port. Restricting access to the intranet effectively blocks illegal access from outside.

# memcached -d -m 1024 -u root -l 192.168.0.200 -p 11211 -c 1024 -P /tmp/memcached.pid

This starts memcached as a daemon listening on port 11211 of the intranet IP 192.168.0.200, capped at 1024MB of memory and at most 1024 concurrent connections.

Set up a firewall

A firewall is a simple and effective approach. If both servers sit on the public network and the memcache server must be reached through its extranet IP, consider using a firewall or a proxy to filter illegal access.
On Linux we can generally use iptables (or ipfw under FreeBSD) to specify rules, for example allowing only our Web server to access our memcache server while blocking all other access.

# iptables -A INPUT -p tcp -s 192.168.0.2 --dport 11211 -j ACCEPT
# iptables -A INPUT -p udp -s 192.168.0.2 --dport 11211 -j ACCEPT

The iptables rules above allow only the Web server at 192.168.0.2 to access the memcache server, which effectively blocks illegal access; further rules can be added as needed to strengthen security.

