Memcached distributed cache, memcached

Source: Internet
Author: User
1. Introduction to memcached distributed architecture

Although memcached is called a "distributed" cache server, the server itself has no distributed functionality. The hosts in a memcached cluster do not communicate with each other or exchange data. The "distributed" behavior is implemented entirely by the routing logic in the client program.

See the following figure:

Below is a brief analysis of the set and get processes in a distributed memcached setup.

Set process:

1. The application calls set('key', 'value').

2. Inside the client, a routing algorithm uses the key to compute the node where the key should be stored.

3. The client connects to the memcached server for that node and sends the set command.

Get process:

1. The application calls get('key').

2. The client uses the key and the same routing algorithm to compute the node that stores the key.

3. The client connects to the memcached server for that node and sends the get command.
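The set and get steps above can be sketched as follows. This is only a minimal illustration, not the article's implementation: plain PHP arrays stand in for the memcached servers so that it runs without the Memcached extension, and the names locateNode, cacheSet, and cacheGet are hypothetical. The routing rule assumed here is a simple crc32 remainder; any deterministic function of the key would serve the same purpose.

```php
<?php
// Hypothetical stand-ins for three memcached servers: one array per node.
$nodes = array(array(), array(), array());

// Step 2: derive the node index from the key alone (assumed rule: crc32 remainder).
function locateNode($key, $nodeCount)
{
    return (int) sprintf('%u', crc32($key)) % $nodeCount;
}

// Step 3 of set: "connect" to the chosen node and store the value there.
function cacheSet(array &$nodes, $key, $value)
{
    $index = locateNode($key, count($nodes));
    $nodes[$index][$key] = $value;
    return $index;
}

// Step 3 of get: the same key yields the same node, so the value is found.
function cacheGet(array $nodes, $key)
{
    $index = locateNode($key, count($nodes));
    return isset($nodes[$index][$key]) ? $nodes[$index][$key] : false;
}

$storedOn = cacheSet($nodes, 'key', 'value');
echo "stored on node $storedOn\n";
echo cacheGet($nodes, 'key'), "\n";
```

Because both operations compute the node from the key alone, the servers never need to know about each other.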

There are many ways to implement distribution for memcached. The most common is based on consistent hashing (referred to here as consistent hash distribution). To make its advantages clear by contrast, we also cover modulo distribution and then analyze the strengths and weaknesses of each.

All examples here are implemented in PHP. What matters most is the idea and the method, which carry over to any language.

 

2. Modulo Algorithm

What is modulo distribution? Convert the key to a 32-bit number and divide it by the total number of memcached servers. The remainder is the index of the memcached server node. With this index we can pick the memcached server and send it the command to execute.
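The calculation described above can be sketched in a few lines. This is a minimal sketch, assuming three servers and a hypothetical helper named modPosition; the node each key lands on depends on its crc32 value.

```php
<?php
// Map a key to a server index: hash to an unsigned 32-bit number, take the remainder.
function modPosition($key, $serverCount)
{
    $hash = (int) sprintf('%u', crc32($key)); // unsigned 32-bit value of the key
    return $hash % $serverCount;              // remainder = node index
}

$serverCount = 3; // assumed number of memcached servers
foreach (array('key1', 'key2', 'key3') as $key) {
    printf("%s -> node %d\n", $key, modPosition($key, $serverCount));
}
```

The index is always between 0 and the server count minus one, so it can be used directly as an offset into the server list.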

Graphical analysis:

The figure shows the entire process.

1) PHP code implementation

GetModMemcache.class.php

<?php
# Distributed memcache (modulo distribution)
class GetModMemcache
{
    private $total = 0;          # total number of memcache servers
    private $servers = array();  # connection details of the memcache servers

    /**
     * @desc constructor
     *
     * @param $serversArr array | connection details of the memcache servers
     */
    public function __construct($serversArr)
    {
        $this->total = count($serversArr);
        $this->servers = $serversArr;
    }

    /**
     * @desc calculate the storage location of $key (that is, the server index)
     *
     * @param $key string | key string
     *
     * @return int the server index
     */
    protected function position($key)
    {
        # crc32() converts the string to a 32-bit number; '%u' keeps it unsigned
        return (int) sprintf('%u', crc32($key)) % $this->total;  # take the remainder
    }

    /**
     * @desc get the memcached object
     *
     * @param $position int | location of the key
     *
     * @return object an instantiated Memcached object
     */
    protected function getMemcached($position)
    {
        $host = $this->servers[$position]['host'];  # host
        $port = $this->servers[$position]['port'];  # port
        $m = new Memcached();
        $m->addServer($host, $port);
        return $m;
    }

    /**
     * @desc set the key-value pair
     *
     * @param $key string | key string
     * @param $value mixed | any valid PHP type except resources
     *
     * @return bool result of the set
     */
    public function setKey($key, $value)
    {
        $num = $this->position($key);
        echo $num;  # for debugging
        $m = $this->getMemcached($num);  # get the memcached object
        return $m->set($key, $value);
    }

    public function getKey($key)
    {
        $num = $this->position($key);
        $m = $this->getMemcached($num);
        return $m->get($key);
    }
}

$arr = array(
    array('host' => '192.168.95.11', 'port' => 11210),
    array('host' => '192.168.95.11', 'port' => 11211),
    array('host' => '192.168.95.11', 'port' => 11212),
);
$mod = new GetModMemcache($arr);

/*
# store data
$a = $mod->setKey('key3', 'key33333');
echo "<pre>";
print_r($a);
echo "</pre>"; die;
*/
/*
# obtain data
$b = $mod->getKey('key1');
echo "<pre>";
print_r($b);
echo "</pre>"; die;
*/
?>

 

2) Testing

1. Insert three records in a row

set('key1', 'value1111111');  # node = 1

set('key2', 'value22222');  # node = 1

set('key3', 'value33333');  # node = 0

2. Connect via telnet to 192.168.95.11 on ports 11210, 11211, and 11212

11210 contains the key3 data

11211 contains the key1 and key2 data

11212 contains no data

3. Use the program to get the data

All results can be retrieved.

3) Advantages and disadvantages

Advantages:

1. Simple, practical, and easy to understand

2. Data is distributed evenly

Disadvantages:

1. When a memcached server goes down, the cluster cannot adjust automatically. Keys that map to that server cannot be cached, so their data is fetched from the database on every request.

2. When capacity has to be expanded by adding memcached servers, most previously cached data can no longer be hit, so that cached data becomes useless.
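The second disadvantage can be checked with a short experiment. The sketch below is only an illustration under stated assumptions: the keys are synthetic (key0, key1, ...), the helper names modPosition and movedFraction are hypothetical, and the cluster grows from 3 to 4 servers. For a uniformly distributed hash, a key keeps its node only when both remainders agree, which happens for about one quarter of the keys, so roughly three quarters are expected to move.

```php
<?php
function modPosition($key, $serverCount)
{
    return (int) sprintf('%u', crc32($key)) % $serverCount;
}

// Fraction of keys whose node changes when the server count changes.
function movedFraction($keyCount, $before, $after)
{
    $moved = 0;
    for ($i = 0; $i < $keyCount; $i++) {
        $key = 'key' . $i; // synthetic keys, for illustration only
        if (modPosition($key, $before) !== modPosition($key, $after)) {
            $moved++;
        }
    }
    return $moved / $keyCount;
}

$fraction = movedFraction(10000, 3, 4);
printf("%.1f%% of keys map to a different node after growing from 3 to 4 servers\n", $fraction * 100);
```

Every moved key is a cache miss until it is stored again on its new node.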

 

3. Consistent Hash Algorithm

What is consistent hash distribution?

First, imagine all 32-bit numbers (0 to 2^32 - 1) arranged clockwise on a ring, in ascending order.

Second, give each storage node a name and convert the name to a 32-bit number with the crc32 function. This number is the position of that memcached server's storage node on the ring.

Then convert the key to a 32-bit number with the crc32 function to get its position on the ring, and move clockwise from there. The memcached server of the first storage node encountered is the server that stores the key.
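The three steps above can be sketched as follows. This is a minimal sketch without virtual nodes, not the class shown later in the article; the node names and the helper names ringHash, buildRing, and findNode are illustrative assumptions, and it assumes a 64-bit PHP build so that the unsigned values fit in an integer.

```php
<?php
// Position on the ring: unsigned 32-bit crc32 of a name or key.
function ringHash($str)
{
    return (int) sprintf('%u', crc32($str));
}

// Place each node name on the ring, sorted by position (ascending = clockwise).
function buildRing(array $nodeNames)
{
    $ring = array();
    foreach ($nodeNames as $name) {
        $ring[ringHash($name)] = $name;
    }
    ksort($ring);
    return $ring;
}

// Walk clockwise from the key's position; the first node met owns the key.
function findNode(array $ring, $key)
{
    $keyPos = ringHash($key);
    foreach ($ring as $nodePos => $name) {
        if ($nodePos >= $keyPos) {
            return $name;
        }
    }
    return reset($ring); // past the last node: wrap around to the first one
}

$ring = buildRing(array('node1', 'node2', 'node3')); // node names are illustrative
foreach (array('aaa', 'bbb', 'ccc') as $key) {
    echo $key, ' -> ', findNode($ring, $key), "\n";
}
```

The linear scan keeps the sketch short; with many points on the ring a binary search over the sorted positions would be the usual choice.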

1) Graphical analysis

Assume that the node1 server goes down. By the clockwise-nearest rule, the data that was stored on node1 is then stored on node3.

Suppose capacity needs to be expanded. What happens when two memcached servers are added? See the analysis.

The results show that only a small amount of data is affected, which is acceptable relative to the data as a whole.

From the figure above it is easy to see that the crc32 function gives us no control over where the memcached storage nodes land on the ring, and that the number of nodes is tiny compared with the 2^32 positions on the ring. If the storage nodes happen to land very close together, one memcached server will inevitably carry the vast majority of the cached data.
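To see how uneven the split can be, the share of the ring owned by each node can be computed directly. This is a minimal sketch under stated assumptions: the node names are illustrative, there are no virtual nodes, and a 64-bit PHP build is assumed. The printed shares depend only on where crc32 happens to place the names; nothing forces them to be close to one third each.

```php
<?php
$ringSize = 4294967296; // 2^32 positions on the ring
$positions = array();
foreach (array('node1', 'node2', 'node3') as $name) { // illustrative node names
    $positions[$name] = (int) sprintf('%u', crc32($name));
}
asort($positions); // clockwise order

$names = array_keys($positions);
$values = array_values($positions);
$shares = array();
foreach ($values as $i => $pos) {
    // Arc from the previous node (exclusive) to this node (inclusive);
    // the first node also owns the wrap-around arc after the last node.
    $prev = $i === 0 ? $values[count($values) - 1] - $ringSize : $values[$i - 1];
    $shares[$names[$i]] = ($pos - $prev) / $ringSize;
}
foreach ($shares as $name => $share) {
    printf("%s owns %.1f%% of the ring\n", $name, $share * 100);
}
```

A node's share of the ring is the fraction of uniformly hashed keys it would receive.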

See the analysis:

Solution:

Map each real storage node to multiple virtual storage nodes: append a suffix to the real node name and hash the result with crc32 (for example, node1_1, node1_2, node1_3, ..., node1_n).

Node distribution:

The three real nodes become thirty storage nodes on the ring, which avoids the uneven cache distribution caused by storage nodes that sit too close together. The storage mechanism itself is unchanged.
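The mapping from real nodes to virtual nodes can be sketched as follows. This is a minimal sketch, assuming ten virtual nodes per real node and the name_1 ... name_n suffix scheme from the example above; the helper names ringHash and buildVirtualRing are hypothetical.

```php
<?php
function ringHash($str)
{
    return (int) sprintf('%u', crc32($str));
}

// Put $virtualCount points on the ring for every real node: name_1 ... name_n.
function buildVirtualRing(array $realNodes, $virtualCount)
{
    $ring = array();
    foreach ($realNodes as $name) {
        for ($i = 1; $i <= $virtualCount; $i++) {
            $ring[ringHash($name . '_' . $i)] = $name; // virtual point -> real node
        }
    }
    ksort($ring); // ascending position = clockwise order
    return $ring;
}

$ring = buildVirtualRing(array('node1', 'node2', 'node3'), 10);
echo count($ring), " points on the ring\n";
print_r(array_count_values($ring)); // how many points each real node contributes
```

Each ring entry still resolves to a real node, so the lookup rule (first point clockwise from the key) stays exactly the same.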

2) PHP code implementation

ConsistentHashMemcache.class.php

<?php
# Distributed memcache with a consistent hash algorithm (using a ring data structure)
class ConsistentHashMemcache
{
    private $virtualNode = 0;        # number of virtual nodes
    private $realNode = array();     # real nodes
    private $servers = array();      # memcache server information
    # private $totalNode = array();  # all nodes

    /**
     * @desc constructor
     *
     * @param $servers array | memcache server information
     * @param $virtualNode int | number of virtual nodes, default 64
     */
    public function __construct($servers, $virtualNode = 64)
    {
        $this->servers = $servers;
        $this->realNode = array_keys($servers);
        $this->virtualNode = $virtualNode;
    }

    /**
     * @return int returns a 32-bit number
     */
    private function hash($str)
    {
        return (int) sprintf('%u', crc32($str));  # convert a string to an unsigned 32-bit number
    }

    /**
     * @desc build the nodes on the ring
     *
     * @param $realNode array | real nodes
     * @param $virtualNode int | number of virtual nodes
     *
     * @return array all nodes (hash position => real node)
     */
    private function dealNode($realNode, $virtualNode)
    {
        $totalNode = array();
        foreach ($realNode as $v)
        {
            for ($i = 0; $i < $virtualNode; $i++)
            {
                $hashNode = $this->hash($v . '-' . $i);
                $totalNode[$hashNode] = $v;
            }
        }
        ksort($totalNode);  # sort by key in ascending order
        return $totalNode;
    }

    /**
     * @desc obtain the real storage node of the key
     *
     * @param $key string | key string
     *
     * @return string the real node
     */
    private function getNode($key)
    {
        $totalNode = $this->dealNode($this->realNode, $this->virtualNode);  # get all virtual nodes
        /* # view all virtual nodes
        echo "<pre>";
        print_r($totalNode);
        echo "</pre>"; die;
        */
        $hashNode = $this->hash($key);  # hash position of the key
        foreach ($totalNode as $k => $v)  # walk the ring in ascending order
        {
            if ($k >= $hashNode)  # first node position greater than or equal to the key's hash
            {
                return $v;  # return the real node
            }
        }
        return reset($totalNode);  # the key's hash is past every node on the ring: wrap around to the first node
    }

    /**
     * @desc return the memcached object
     *
     * @param $key string | key
     *
     * @return object
     */
    private function getMemcached($key)
    {
        $node = $this->getNode($key);  # obtain the real node
        echo $key . ' real node: ' . $node . '<br/>';  # for testing: show the real node of the key
        $host = $this->servers[$node]['host'];  # host in the server pool
        $port = $this->servers[$node]['port'];  # port of the service in the server pool
        $m = new Memcached();  # instantiate
        $m->addServer($host, $port);  # add a memcache server
        return $m;  # return the memcached object
    }

    /**
     * @desc set the key-value pair
     */
    public function setKey($key, $value)
    {
        $m = $this->getMemcached($key);
        return $m->set($key, $value);
    }

    /**
     * @desc get the value of the key
     */
    public function getKey($key)
    {
        $m = $this->getMemcached($key);
        return $m->get($key);
    }
}

?>

3) Testing

1. View all virtual nodes

There are 64 * 3 = 192 virtual nodes in total (this virtual node setting is still on the low side; it is usually in the range of 100 to 200).

2. Set test

include './ConsistentHashMemcache.class.php';
header("content-type: text/html; charset=utf-8");
$arr = array(
    'node1' => array('host' => '192.168.95.11', 'port' => 11210),
    'node2' => array('host' => '192.168.95.11', 'port' => 11211),
    'node3' => array('host' => '192.168.95.11', 'port' => 11212),
);

$c = new ConsistentHashMemcache($arr);

# test set
$c->setKey('aaa', '000000');
$c->setKey('bbb', '123');
$c->setKey('ccc', '123');

 

Connect via telnet to 192.168.95.11 on ports 11210, 11211, and 11212 in turn:

get('aaa') and get('bbb') return their values on node1.

get('ccc') returns its value on node3.

3. Get test

include './ConsistentHashMemcache.class.php';
header("content-type: text/html; charset=utf-8");
$arr = array(
    'node1' => array('host' => '192.168.95.11', 'port' => 11210),
    'node2' => array('host' => '192.168.95.11', 'port' => 11211),
    'node3' => array('host' => '192.168.95.11', 'port' => 11212),
);

$c = new ConsistentHashMemcache($arr);

# test get
echo $c->getKey('aaa') . '<br/>';
echo $c->getKey('bbb') . '<br/>';
echo $c->getKey('ccc') . '<br/>';

4) Advantages and disadvantages

Compared with modulo distribution, the code for consistent hash distribution is noticeably more complex, but the extra complexity is acceptable and is not an obstacle. In return it has clear advantages. Virtual nodes spread the otherwise uncontrollable storage node positions as evenly as possible around the ring, so data is cached evenly across the hosts. Adding or removing a node (together with its virtual nodes) has little impact on the data that is already cached.
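The limited impact of adding a node can be checked with a short experiment. The sketch below is an illustration under stated assumptions, not a benchmark: the keys are synthetic, node4 is a hypothetical new server, the helper names are made up for this example, and the ring uses 64 virtual nodes per real node with the same name-index suffix as the class above. The exact percentage depends on where crc32 places the points, but every key that moves should move to the new node only.

```php
<?php
function ringHash($str)
{
    return (int) sprintf('%u', crc32($str));
}

function buildRing(array $realNodes, $virtualCount = 64)
{
    $ring = array();
    foreach ($realNodes as $name) {
        for ($i = 0; $i < $virtualCount; $i++) {
            $ring[ringHash($name . '-' . $i)] = $name;
        }
    }
    ksort($ring);
    return $ring;
}

function findNode(array $ring, $key)
{
    $keyPos = ringHash($key);
    foreach ($ring as $pos => $name) {
        if ($pos >= $keyPos) {
            return $name;
        }
    }
    return reset($ring); // wrap around to the first point on the ring
}

$before = buildRing(array('node1', 'node2', 'node3'));
$after = buildRing(array('node1', 'node2', 'node3', 'node4')); // node4 is hypothetical

$total = 5000;
$moved = 0;
$movedElsewhere = 0; // keys that moved to a node other than node4
for ($i = 0; $i < $total; $i++) {
    $key = 'key' . $i; // synthetic keys, for illustration only
    $old = findNode($before, $key);
    $new = findNode($after, $key);
    if ($old !== $new) {
        $moved++;
        if ($new !== 'node4') {
            $movedElsewhere++;
        }
    }
}
printf("%.1f%% of keys moved, %d of them to a node other than node4\n", 100 * $moved / $total, $movedElsewhere);
```

Keys that stay on node1, node2, or node3 remain cache hits, which is the difference from modulo distribution, where a change in server count reshuffles most keys.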

 

(The above are some of my own opinions and conclusions. If you find any shortcomings or errors, please point them out.)

Author: The leaf goes with the wind

Statement: The above only represents views and conclusions I reached at a particular point in my work and study. When reprinting, please clearly provide a link to the original article on the article page.

 
