Memcached caching principles and basic operation, distribution (consistent hashing)


Original address: http://lixiangfeng.com/blog/article/content/7869717 Please credit this source when reprinting, thank you!

What is a cache? Why use caching?

Caching speeds up dynamic, database-driven sites by keeping data and objects in memory, reducing the number of times the database has to be read.

What caching tools are there, and how do they differ?

Caching tools: Memcached, Redis, MongoDB

Differences:

    1. Performance: overall, Redis and Memcached have similar TPS (transactions per second), and both are higher than MongoDB;
    2. Ease of operation:

a) Memcached has only a simple key-value data structure;

b) Redis offers rich data structures and server-side data operations, so it handles data manipulation better and needs fewer network I/O round trips;

c) MongoDB supports rich data expression and indexes; it is the most similar to a relational database and has a very rich query language;

    3. Memory space and data volume:

a) Redis added its own VM feature after the 2.0 release, breaking the physical memory limit, and supports setting an expiration time per key (similar to Memcached);

b) Memcached's maximum available memory can be adjusted, and it evicts data with an LRU algorithm;

c) MongoDB is suited to storing large data volumes; it relies on the operating system's virtual memory for memory management and is memory-hungry, so it should not be deployed together with other services;

    4. Reliability (persistence):

a) Redis supports persistence (snapshots, AOF): it relies on snapshots for persistence, while AOF improves reliability at some cost in performance;

b) Memcached does not support persistence; it is usually used purely as a cache to improve performance;

c) MongoDB has supported persistence through its binlog since version 1.8;

    5. Data consistency (transaction support):

a) Memcached uses CAS to ensure consistency in concurrent scenarios;

b) Redis's transaction support is weak; it can only guarantee that the operations in a transaction are executed consecutively;

c) MongoDB does not support transactions;

    6. Data analysis: MongoDB has built-in data analysis capabilities (MapReduce); the others do not.

Memcached in detail

The difference between memcached, Memcached, and Memcache:

    1. memcached with a lowercase first letter refers to the memcached server: the independently running memcached background server process used to store the data (the "database");
    2. Memcache and Memcached refer to PHP clients for memcached. Memcache is the older extension, implemented independently (without libmemcached), while Memcached is built on the native C libmemcached library and is more complete and higher performing.

Key memcached startup parameters

    1. -m: the amount of memory to allocate, default 64 MB. On a 32-bit operating system each process can use at most 2 GB, so if you need to allocate more memory you should use a 64-bit operating system. The memory is not all used up front; it is gradually assigned to slabs as needed;
    2. -I: the size of each allocated page, default 1 MB, minimum 1 KB, maximum 128 MB;
    3. -f: the growth factor; by default the maximum item size of each slab class is 1.25 times that of the previous one, and this parameter changes that factor;
    4. -p: the TCP port, default 11211;
    5. -l (lowercase L): the IP address to listen on; if not set, memcached listens on all local addresses;
    6. -d: run in daemon mode;
    7. -u: the user to run as;
    8. -M: disable the LRU policy; return an error when memory is exhausted instead of evicting items;
    9. -c: the maximum number of simultaneous connections, default 1024;
    10. -t: the number of threads, default 4;

e.g. /usr/bin/memcached -m 64 -p 11212 -u nobody -c 2048 -f 1.1 -I 1024 -d -l 10.211.55.9
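
For reference, a minimal PHP sketch for talking to an instance started with the example command above. It assumes the pecl Memcached extension is installed; the host and port come from the example command line, and the key and value are just placeholders:

<?php
// Minimal connection sketch: talk to the memcached instance started above.
// Host/port match the example command line; adjust for your environment.
$mc = new Memcached();
$mc->addServer('10.211.55.9', 11212);

// Store a value for 60 seconds, then read it back.
$mc->set('demo_key', 'demo_value', 60);
var_dump($mc->get('demo_key'));   // string(10) "demo_value"
var_dump($mc->getResultCode());   // Memcached::RES_SUCCESS (0) on success
?>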

Memcached memory allocation policy

The first time data is stored in memcached, memcached requests 1 MB of memory (this 1 MB block is called a page) and assigns it to a slab class. If the best chunk size for storing this piece of data is 128 B, memcached splits the newly requested page into 8192 chunks of 128 B each. When all the chunks of this slab class are used up and more data needs a 128 B chunk, memcached requests another 1 MB page and splits it into 128 B chunks again, as long as the total memory requested so far is still below the maximum available memory (the example assumes a 10 MB limit). If no more memory can be requested, memcached uses the LRU algorithm to evict the chunk that has gone unused the longest in that class's queue and reuses that chunk for the new data.

Page: the smallest unit of memory allocation

Memcached allocates memory in units of pages; by default a page is 1 MB, which can be changed at startup with the -I parameter. When more memory is needed, memcached carves out a new page and assigns it to the slab class that needs it. Once a page has been assigned, it is not recycled or reassigned until memcached is restarted.

Slab: dividing the data space

Memcached does not store data directly in pages; instead it stores data in a series of slab classes. Each slab class is responsible for data larger than the previous class's maximum and no larger than its own maximum; for example, slab 2 only stores data of 105~136 bytes. The slab classes have different sizes, and by default the maximum size of each class is 1.25 times that of the previous one (the -f factor).

Chunk is the unit that stores the cached data

Memcached divides each slab into a series of equally sized storage spaces called chunks. The chunk is memcached's smallest storage unit. If the stored data is smaller than the chunk, the leftover space simply sits idle; this design is meant to prevent memory fragmentation. For example, if the chunk size is 224 bytes and the stored data is only 200 bytes, the remaining 24 bytes are wasted.

Putting the above together, memcached's memory allocation strategy is to allocate pages to slab classes on demand, and each slab class stores items in equally sized chunks.

A few points worth noting:

a) a page assigned by memcached is never reclaimed or reassigned;

b) memory requested by memcached is never released back to the operating system;

c) idle chunks in one slab class are never lent to another slab class.
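
As a rough illustration of the allocation model described above, here is a small PHP sketch. It is not memcached's actual source, just an approximation of the arithmetic: the starting chunk size of 128 B is an assumption for this sketch, chunk sizes grow by the -f factor, and leftover space inside a chunk is wasted.

<?php
// Approximate illustration of memcached's page/slab/chunk arithmetic.
$pageSize  = 1024 * 1024;  // 1 MB page (-I default)
$factor    = 1.25;         // growth factor (-f default)
$chunkSize = 128;          // assumed starting chunk size for this sketch

for ($class = 1; $class <= 6; $class++) {
    $chunksPerPage = (int) floor($pageSize / $chunkSize);
    printf("slab class %d: chunk size %d bytes, %d chunks per page\n",
           $class, $chunkSize, $chunksPerPage);
    $chunkSize = (int) ceil($chunkSize * $factor);
}

// A 200-byte item stored in a 224-byte chunk wastes 24 bytes:
$itemSize = 200;
$chunk    = 224;
printf("storing %d bytes in a %d-byte chunk wastes %d bytes\n",
       $itemSize, $chunk, $chunk - $itemSize);
?>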

Memcached's stats commands

After connecting to memcached with a telnet client, you can use the stats commands below.

stats slabs: displays information about each slab class, including chunk size, count, and usage.

stats items: displays the number of items in each slab class and the age of the oldest item (the number of seconds since it was last accessed).

stats detail [on|off|dump]: sets or shows the detailed operation record; on enables detailed recording, off disables it, and dump prints the recorded details (per-key counts of get, set, hit, and del).

stats malloc: prints memory allocation information.

stats sizes: prints cache usage information.

stats reset: resets the statistics.
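
These statistics can also be read programmatically. Below is a minimal sketch using the PHP Memcached client (host and port are placeholders; the stat names shown are standard memcached counters):

<?php
// Read server statistics through the PHP Memcached client instead of telnet.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

// getStats() returns an array keyed by "host:port", or false on failure.
$stats = $mc->getStats();
if ($stats === false) {
    exit("could not fetch stats\n");
}
foreach ($stats as $server => $s) {
    printf("%s: items=%d, bytes used=%d/%d, get hits=%d, get misses=%d\n",
           $server,
           $s['curr_items'],
           $s['bytes'],
           $s['limit_maxbytes'],
           $s['get_hits'],
           $s['get_misses']);
}
?>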

Memcached's distributed algorithms

When a single memcached instance can no longer meet our needs, we have to configure multiple memcached servers; this is what distributed memcached means. The accompanying problem: suppose we have three servers A, B, and C, and we store a user name on server A. When we later want to read it, how do we know which server it is on? Solving this routing problem is the core of memcached distribution.

In general, there are two ways to do it:

    1. Hashing with a remainder (modulo):

a) Idea: run the key to be stored through a hash function to get an integer, then take that integer modulo 3 (the number of servers). Whatever the integer is, the result is always 0, 1, or 2, which tells us which server to use.

b) Drawback: when the number of servers changes, the result of the calculation above changes for most keys, so previously stored data can no longer be found. (The sketch after the code example below quantifies this.)

c) Code example:

<?php
/**
 * Normal hash distribution
 */
// hash function
function mhash($key) {
    $md5  = substr(md5($key), 0, 8);
    $seed = 31;
    $hash = 0;
    for ($i = 0; $i < 8; $i++) {
        $hash = $hash * $seed + ord($md5[$i]);
    }
    return $hash & 0x7fffffff;
}

// Suppose there are 2 memcached servers
$servers = array(
    array('host' => '192.168.1.1', 'port' => 11211),
    array('host' => '192.168.1.2', 'port' => 11211)
);

$key   = 'MyBlog';
$value = 'http://blog.phpha.com';

// Pick a server by taking the hash modulo the number of servers
$sc = $servers[mhash($key) % 2];

$memcached = new Memcached();
$memcached->addServer($sc['host'], $sc['port']);
$memcached->set($key, $value);
?>
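
To make the drawback in (b) concrete, here is a small self-contained sketch with hypothetical keys. It uses crc32() purely as a convenient hash for the demonstration (any hash shows the same effect) and counts how many keys map to a different server when the pool grows from 3 to 4 servers under the remainder scheme:

<?php
// Count how many keys move to a different server when the server count
// changes from 3 to 4 under simple modulo distribution.
$moved = 0;
$total = 10000;
for ($n = 0; $n < $total; $n++) {
    $hash = crc32('user:' . $n) & 0x7fffffff;   // hypothetical keys
    if ($hash % 3 !== $hash % 4) {
        $moved++;
    }
}
// Roughly three quarters of the keys end up on a different server.
printf("%d of %d keys (%.1f%%) are remapped\n",
       $moved, $total, 100 * $moved / $total);
?>
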
    2. Consistent hash distribution. The consistent hashing algorithm has four steps:

a) Imagine the 32-bit integer range [0 ~ (2^32 - 1)] as a ring, with 0 as the start and (2^32 - 1) as the end; the ring is only conceptual, of course.

b) Process each key into an integer with the hash function; this gives the key a position on the ring.

c) Map the memcached server farm onto the same ring by hashing each server's corresponding IP address.

d) Map data to memcached servers: starting from the key's position, walk clockwise along the ring and find the nearest memcached server; the key's data is stored on that server.

e) Code example:

<?php
/**
 * Consistent hash distribution
 */
class FlexiHash
{
    // Server list, keyed by position on the ring
    private $serverList = array();
    // Whether the list has already been sorted
    private $isSorted = false;

    // Add a server
    public function addServer($server)
    {
        $hash = $this->mhash($server);
        if (!isset($this->serverList[$hash])) {
            $this->serverList[$hash] = $server;
        }
        // Needs re-sorting
        $this->isSorted = false;
        return true;
    }

    // Remove a server
    public function removeServer($server)
    {
        $hash = $this->mhash($server);
        if (isset($this->serverList[$hash])) {
            unset($this->serverList[$hash]);
        }
        // Needs re-sorting
        $this->isSorted = false;
        return true;
    }

    // Find the right server in the current server list for a key
    public function lookup($key)
    {
        $hash = $this->mhash($key);
        // Sort the ring positions (descending) first
        if (!$this->isSorted) {
            krsort($this->serverList, SORT_NUMERIC);
            $this->isSorted = true;
        }
        // Walk along the ring to find the nearest server for this key
        foreach ($this->serverList as $pos => $server) {
            if ($hash >= $pos) {
                return $server;
            }
        }
        // Nothing matched: wrap around the ring and use the last server
        reset($this->serverList);
        return current($this->serverList);
    }

    // hash function
    private function mhash($key)
    {
        $md5  = substr(md5($key), 0, 8);
        $seed = 31;
        $hash = 0;
        for ($i = 0; $i < 8; $i++) {
            $hash = $hash * $seed + ord($md5[$i]);
        }
        return $hash & 0x7fffffff;
    }
}
?>

Note: with this approach, when a server is added or removed, only the keys in a small range of the ring are affected.
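
A minimal usage sketch for the FlexiHash class above (the server names are placeholders; in practice the lookup result would be used to decide which memcached instance to connect to):

<?php
// Usage sketch for the FlexiHash class defined above.
$fh = new FlexiHash();
$fh->addServer('192.168.1.1:11211');
$fh->addServer('192.168.1.2:11211');
$fh->addServer('192.168.1.3:11211');

// The same key always maps to the same server...
echo $fh->lookup('MyBlog'), "\n";

// ...and removing one server only remaps the keys that lived on it.
$fh->removeServer('192.168.1.2:11211');
echo $fh->lookup('MyBlog'), "\n";
?>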

Using memcached

    1. The difference between add, set, and replace (a combined usage sketch follows this list):

a) add adds a new item to the cache;

b) set stores the content for a given key and behaves like a combination of add and replace;

c) replace replaces the content of an existing key and returns false if the key does not exist;

Method | When the key exists | When the key does not exist
add | False | True
replace | Replaces it (true) | False
set | Replaces it (true) | True

    2. flush clears all cached data;
    3. cas (check and set) only writes a value if this client holds the latest version and the value stored under the key has not been modified by another client in the meantime;
    4. increment and decrement:

a) increment adds 1 to the element's value; if the value is not a number, it is treated as 0;

b) decrement subtracts 1 from the element's value; if the value is not a number, it is treated as 0;

c) a memcached queue can be implemented with increment and decrement;

    5. append and prepend:

a) append appends content to the end of an element;

b) prepend prepends content to the front of an element;
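
The sketch below pulls these operations together using the PHP Memcached client. Host, port, and key names are placeholders; note that in this client append and prepend require compression to be disabled.

<?php
// Sketch of add/set/replace, increment/decrement and append/prepend with the
// PHP Memcached client. Host, port and keys are placeholders.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);
$mc->setOption(Memcached::OPT_COMPRESSION, false); // required for append/prepend

$mc->add('k', 'v1');          // true  (key did not exist)
$mc->add('k', 'v2');          // false (key already exists)
$mc->set('k', 'v2');          // true  (set works either way)
$mc->replace('missing', 'x'); // false (replace needs an existing key)

$mc->set('counter', 10);
$mc->increment('counter');    // 11
$mc->decrement('counter', 2); // 9

$mc->append('k', '-suffix');  // value becomes "v2-suffix"
$mc->prepend('k', 'prefix-'); // value becomes "prefix-v2-suffix"
var_dump($mc->get('k'));
?>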
