Memcached for Perl

Last Update:2018-12-03 Source: Internet

Author: User


Notes on using memcached

Data saved to memcached is kept in memcached's built-in in-memory storage.
Because the data exists only in memory, restarting memcached or the operating system causes all data to be lost.
In addition, when memory usage reaches the specified limit, unused cache entries are deleted automatically based on the LRU (least recently used) algorithm.
Memcached is designed as a cache server, so it makes no attempt to keep data permanently.

Memcached is called a "distributed" cache server, but the server itself has no distribution features.
Memcached servers do not communicate with each other to share information. Distribution depends entirely on the client implementation.

Memcached requires the libevent library:

# yum install libevent-devel

$ wget http://www.danga.com/memcached/dist/memcached-1.×.×.tar.gz
$ tar zxf memcached-1.×.×.tar.gz
$ cd memcached-1.×.×
$ ./configure
$ make
$ sudo make install

$ /usr/local/bin/memcached -p 11211 -m 64 -d

Option	Description
-p	The TCP port to listen on. The default value is 11211.
-m	Maximum memory size in megabytes. The default value is 64 MB.
-vv	Start in very verbose mode, printing debugging information and errors to the console.
-d	Run as a daemon in the background.

Memcached client APIs: http://www.danga.com/memcached/apis.bml

Perl memcached clients include

Cache::Memcached
Cache::Memcached::Fast
Cache::Memcached::libmemcached

and other CPAN modules. The examples here use Cache::Memcached:

http://search.cpan.org/dist/Cache-Memcached/

#!/usr/bin/perl

use strict;
use warnings;
use Cache::Memcached;

my $key = "foo";
my $value = "bar";
my $expires = 3600; # 1 hour
my $memcached = Cache::Memcached->new({
    servers => ["127.0.0.1:11211"],
    compress_threshold => 10_000
});

$memcached->add($key, $value, $expires);
my $ret = $memcached->get($key);
print "$ret\n";

Here, a Cache::Memcached instance is created by specifying the memcached server's IP address and one option.
Commonly used Cache::Memcached options are as follows.

Option	Description
servers	The memcached servers and ports, given as an array
compress_threshold	Size (in bytes) above which values are compressed
namespace	Prefix to add to every key

In addition, Cache::Memcached can serialize complex Perl data with the Storable module before saving it,
so hashes, arrays, and objects can be stored in memcached directly.

Save data

The methods for saving data to memcached are as follows:

add
replace
set

They are all used in the same way:

my $add = $memcached->add('key', 'value', 'expiration time');
my $replace = $memcached->replace('key', 'value', 'expiration time');
my $set = $memcached->set('key', 'value', 'expiration time');

You can specify an expiration time (in seconds) when saving data to memcached. If no expiration time is specified, the data is kept until the LRU algorithm evicts it.
The differences between the three methods are as follows:

Method	Description
add	Saves only when no data with the same key already exists.
replace	Saves only when data with the same key already exists.
set	Unlike add and replace, always saves.
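
The difference between the three methods can be seen without a running server by modelling the store as a plain Perl hash. This is only a toy model of the semantics described in the table: the toy_add, toy_replace and toy_set functions are hypothetical names for this sketch, not part of Cache::Memcached, and expiration is ignored.

```perl
use strict;
use warnings;

# Toy in-memory stand-in for a memcached server (illustration only).
my %store;

# add: succeeds only if the key is not stored yet.
sub toy_add {
    my ($key, $value) = @_;
    return 0 if exists $store{$key};
    $store{$key} = $value;
    return 1;
}

# replace: succeeds only if the key is already stored.
sub toy_replace {
    my ($key, $value) = @_;
    return 0 unless exists $store{$key};
    $store{$key} = $value;
    return 1;
}

# set: always stores the value.
sub toy_set {
    my ($key, $value) = @_;
    $store{$key} = $value;
    return 1;
}

printf "replace on missing key: %d\n", toy_replace('foo', 'a');   # fails
printf "add on missing key:     %d\n", toy_add('foo', 'b');       # succeeds
printf "add on existing key:    %d\n", toy_add('foo', 'c');       # fails
printf "replace on existing:    %d\n", toy_replace('foo', 'd');   # succeeds
printf "set on existing:        %d\n", toy_set('foo', 'e');       # succeeds
printf "final value:            %s\n", $store{foo};
```

With a real server, the return values of add, replace and set on the client object can be checked in the same way.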

Get Data

You can use the get and get_multi methods to retrieve data.

my $val = $memcached->get('key');
my $vals = $memcached->get_multi('key1', 'key2', 'key3', 'key4', 'key5');

Use get_multi to retrieve several items at once. get_multi fetches the values for multiple keys in a single call and returns them as a hash reference,
which can be dozens of times faster than calling get in a loop.

Delete data

Data is deleted with the delete method, which has one distinctive feature.

$memcached->delete('key', 'blocking time (seconds)');

This deletes the data for the key given as the first parameter. The second parameter specifies a time value: for that many seconds, new data cannot be saved under the same key.
This feature can be used to prevent incomplete cached data. Note, however, that set ignores this blocking and saves data as usual.

Increment and decrement operations

A specific key on memcached can be used as a counter.

my $ret = $memcached->incr('key');
$memcached->add('key', 0) unless defined $ret;

Increment and decrement are atomic operations, but if no initial value has been set, the key is not automatically initialized to 0. Therefore,
you should check for errors and initialize the value when necessary.

Cache::Memcached's distribution method

Cache::Memcached, Perl's memcached client library, was written by
Brad Fitzpatrick, the creator of memcached, and can be regarded as the original client library.

Cache::Memcached - search.cpan.org

This library implements distribution, and its approach is memcached's standard distribution method.

Distribution based on remainder calculation

Put simply, Cache::Memcached distributes keys "based on the remainder of division by the number of servers".
It computes an integer hash value for the key, divides it by the number of servers, and selects the server from the remainder.

Cache::Memcached's approach can be simplified to the following Perl script.

use strict;
use warnings;
use String::CRC32;

my @nodes = ('node1', 'node2', 'node3');
my @keys = ('tokyo', 'kanagawa', 'chiba', 'saitama', 'gunma');

foreach my $key (@keys) {
    my $crc = crc32($key);             # CRC value of the key
    my $mod = $crc % ( $#nodes + 1 );  # remainder by the number of servers
    my $server = $nodes[ $mod ];       # select a server based on the remainder
    printf "%s => %s\n", $key, $server;
}

Cache::Memcached uses CRC to compute the hash value.

String::CRC32 - search.cpan.org

First the CRC value of the string is computed; the server is then chosen by the remainder of that value divided by the number of server nodes.
Running the code above produces the following output:

tokyo       => node2
kanagawa => node3
chiba       => node2
saitama   => node1
gunma     => node1

According to this result, "tokyo" is assigned to node2 and "kanagawa" to node3.
When the selected server cannot be reached, Cache::Memcached adds the number of connection attempts
to the key, computes the hash value again, and tries to connect. This behavior is called rehash.
If you do not want rehashing, specify the "no_rehash => 1" option when creating the Cache::Memcached object.
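
The retry behavior can be sketched in plain Perl as follows. This illustrates the idea only and is not the module's actual code: MD5 from the core Digest::MD5 module stands in for the CRC-based hash, and the pick_server function, the %is_up table, and the limit of 20 attempts are assumptions made for this sketch.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# 32-bit integer hash; MD5 is an assumed stand-in for the client's hash.
sub hash32 {
    my ($str) = @_;
    return unpack 'N', md5($str);
}

# Pick a server by remainder. If it is down, mix the attempt number into
# the key, add the new hash to the old one, and try again.
sub pick_server {
    my ($key, $nodes, $is_up, $max_tries) = @_;
    $max_tries ||= 20;    # assumed retry limit for this sketch
    my $hv = hash32($key);
    for my $try (1 .. $max_tries) {
        my $node = $nodes->[ $hv % @$nodes ];
        return $node if $is_up->{$node};
        $hv += hash32($try . $key);
    }
    return;    # no reachable server found
}

my @nodes = qw(node1 node2 node3);
my %is_up = (node1 => 1, node2 => 0, node3 => 1);    # node2 is unreachable

for my $key (qw(tokyo kanagawa chiba saitama gunma)) {
    my $server = pick_server($key, \@nodes, \%is_up);
    printf "%-8s => %s\n", $key, defined $server ? $server : '(none)';
}
```

Keys whose first choice is reachable are unaffected; only keys that would have landed on the unreachable server are sent elsewhere.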

Disadvantages of remainder-based distribution

The remainder method is simple and spreads data very evenly, but it also has a drawback:
when a server is added or removed, the cost of reorganizing the cache is huge.
After a server is added, the remainders change drastically, so a key is no longer looked up on the server where it was saved,
which lowers the cache hit rate. The following Perl snippet measures that cost.

use strict;
use warnings;
use String::CRC32;

my @nodes = @ARGV;
my @keys = ('a'..'z');
my %nodes;

foreach my $key ( @keys ) {
    my $hash = crc32($key);
    my $mod = $hash % ( $#nodes + 1 );
    my $server = $nodes[ $mod ];
    push @{ $nodes{ $server } }, $key;
}

foreach my $node ( sort keys %nodes ) {
    printf "%s: %s\n", $node,  join ",", @{ $nodes{$node} };
}

This Perl script works out which server each of the keys "a" through "z" would be saved to and read from.
Save it as mod.pl and run it.

First, when there are only three servers:

$ mod.pl node1 node2 node3
node1: a,c,d,e,h,j,n,u,w,x
node2: g,i,k,l,p,r,s,y
node3: b,f,m,o,q,t,v,z

As the result shows, node1 stores a, c, d, e, ..., node2 stores g, i, k, ...,
and each server stores 8 to 10 items.

Next we will add a memcached server.

$ mod.pl node1 node2 node3 node4
node1: d,f,m,o,t,v
node2: b,i,k,p,r,y
node3: e,g,l,n,u,w
node4: a,c,h,j,q,s,x,z

node4 has been added. Only d, i, k, p, r, and y are still hits. After a node is added,
the servers that keys are assigned to change drastically. Only 6 of the 26 keys are still looked up on their original server;
all the others have moved to a different server. The hit rate drops to 23%. When memcached is used in a web application,
cache efficiency falls sharply the moment a memcached server is added, the load concentrates on the database server,
and the application may be unable to provide normal service.
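
The same measurement can be repeated over a larger key set. The sketch below is self-contained and uses only core modules, so MD5 from Digest::MD5 stands in for CRC32; that substitution, the generated key names, and the server_for and hit_rate helper names are assumptions made for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# 32-bit integer hash of a key (MD5 stands in for CRC32 here).
sub hash32 {
    my ($key) = @_;
    return unpack 'N', md5($key);
}

# Pick a server by the remainder of the hash divided by the server count.
sub server_for {
    my ($key, @nodes) = @_;
    return $nodes[ hash32($key) % @nodes ];
}

# Fraction of keys that still map to the same server after the pool changes.
sub hit_rate {
    my ($keys, $before, $after) = @_;
    my $hits = grep { server_for($_, @$before) eq server_for($_, @$after) } @$keys;
    return $hits / @$keys;
}

my @keys  = map { "key$_" } 1 .. 10_000;
my @three = qw(node1 node2 node3);
my @four  = (@three, 'node4');

printf "hit rate after adding one server: %.1f%%\n",
    100 * hit_rate(\@keys, \@three, \@four);
```

For an evenly spread hash, a key keeps its server only when its hash leaves the same remainder modulo 3 and modulo 4, which holds for 3 of every 12 consecutive hash values, so the measured rate should come out at roughly a quarter, in line with the 26-key experiment.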

mixi's web application ran into the same problem, which made it impossible to add memcached servers.
With a newer distribution method, however, memcached servers can now be added easily.
This distribution method is called consistent hashing.

Consistent hashing

The idea of consistent hashing has already been introduced in many places, including mixi's engineering blog,
so only a brief description is given here.

mixi Engineers' Blog
ConsistentHashing

A brief description of consistent hashing

Consistent hashing works as follows. First, compute the hash value of each memcached server (node)
and place it on a circle (continuum) that runs from 0 to 2^32.
Then, compute the hash value of the key for the data to be stored in the same way, and map it onto the circle.
Search clockwise from the position the key maps to, and save the data on the first server found.
If no server is found before passing 2^32, the data is saved on the first memcached server on the circle.
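
The lookup described above can be sketched in a few lines of Perl. This is a minimal illustration with one point per server, not any particular library's implementation: MD5 from the core Digest::MD5 module is an assumed hash function, and build_ring, lookup_pos and lookup are hypothetical helper names.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Position on the 0 .. 2^32 - 1 continuum (MD5 is an assumed stand-in hash).
sub hash32 {
    my ($str) = @_;
    return unpack 'N', md5($str);
}

# Build the continuum: [position, node] pairs sorted by position.
sub build_ring {
    my (@nodes) = @_;
    return [ sort { $a->[0] <=> $b->[0] } map { [ hash32($_), $_ ] } @nodes ];
}

# Walk clockwise from a position to the first server at or after it;
# wrap around to the first server if the end of the continuum is passed.
sub lookup_pos {
    my ($ring, $pos) = @_;
    for my $point (@$ring) {
        return $point->[1] if $point->[0] >= $pos;
    }
    return $ring->[0][1];
}

# Map a key to its server.
sub lookup {
    my ($ring, $key) = @_;
    return lookup_pos($ring, hash32($key));
}

my $ring = build_ring(qw(node1 node2 node3));
printf "%10u  %s\n", @$_ for @$ring;    # server positions on the continuum

for my $key (qw(tokyo kanagawa chiba saitama gunma)) {
    printf "%-8s (%10u) => %s\n", $key, hash32($key), lookup($ring, $key);
}
```

A linear scan keeps the sketch short; with many points on the continuum a binary search over the sorted positions would be the more usual choice.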

Figure 4: Consistent hashing, basic principle

Now add one memcached server to the state shown in Figure 4. With remainder-based distribution, the server that holds each key changes drastically,
which hurts the cache hit rate. With consistent hashing, only the keys that lie between the point where the new server is added on the continuum
and the first server counter-clockwise from that point are affected.

Figure 5: Consistent hashing, adding a server

Consistent hashing therefore keeps key redistribution to a minimum.
In addition, some consistent hashing implementations adopt the idea of virtual nodes.
With an ordinary hash function, the positions that servers map to are spread very unevenly.
Virtual nodes address this by allocating 100 to 200 points on the continuum
to each physical node (server). This suppresses uneven distribution
and minimizes cache redistribution when servers are added or removed.
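
The effect of virtual nodes can be checked with a small experiment. The sketch below is an illustration under stated assumptions, not the ketama algorithm itself: MD5 from the core Digest::MD5 module is the assumed hash, virtual node labels of the form "name-index" are an arbitrary choice, and build_ring, lookup and distribution are hypothetical helper names.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Position on the continuum (MD5 is an assumed stand-in hash function).
sub hash32 {
    my ($str) = @_;
    return unpack 'N', md5($str);
}

# Place $points virtual nodes per server on the continuum.
sub build_ring {
    my ($points, @nodes) = @_;
    my @ring;
    for my $node (@nodes) {
        push @ring, [ hash32("$node-$_"), $node ] for 1 .. $points;
    }
    return [ sort { $a->[0] <=> $b->[0] } @ring ];
}

# Binary search for the first point at or after the key's position,
# wrapping around to the first point when none is found.
sub lookup {
    my ($ring, $key) = @_;
    my $pos = hash32($key);
    my ($lo, $hi) = (0, scalar @$ring);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($ring->[$mid][0] < $pos) { $lo = $mid + 1 } else { $hi = $mid }
    }
    return $ring->[ $lo % @$ring ][1];
}

# Count how many keys each server receives.
sub distribution {
    my ($ring, $keys) = @_;
    my %count;
    $count{ lookup($ring, $_) }++ for @$keys;
    return \%count;
}

my @keys = map { "key$_" } 1 .. 10_000;
for my $points (1, 150) {
    my $before = build_ring($points, qw(node1 node2 node3));
    my $after  = build_ring($points, qw(node1 node2 node3 node4));
    my $hits   = grep { lookup($before, $_) eq lookup($after, $_) } @keys;
    my $dist   = distribution($after, \@keys);
    printf "%3d points/server: hit rate %.1f%%, keys per server: %s\n",
        $points, 100 * $hits / @keys,
        join ', ', map { "$_=" . ($dist->{$_} || 0) } qw(node1 node2 node3 node4);
}
```

With one point per server the key counts per server are typically lopsided, while with 150 points they should be much closer to equal. In both cases every key that changes server moves to the newly added node4, which is what keeps redistribution small.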

In tests with the memcached client libraries that use consistent hashing, described below,
the hit rate after adding m servers to n existing servers is given by the following formula:

(1 - m / (n + m)) * 100

For example, going from 3 servers to 4 (n = 3, m = 1) gives an expected hit rate of 75%, compared with 23% for the remainder method above.

Function libraries supporting consistent hashing

Cache::Memcached does not support consistent hashing,
but several client libraries already support this newer distribution algorithm.
The first memcached client library to support consistent hashing and virtual nodes was
a PHP library named libketama, developed by Last.fm.

libketama - a consistent hashing algo for memcache clients - RJ's blog - Users at Last.fm

As for Perl clients, Cache::Memcached::Fast and Cache::Memcached::libmemcached,
introduced in the first installment of this series, support
consistent hashing.

Cache::Memcached::Fast - search.cpan.org
Cache::Memcached::libmemcached - search.cpan.org

Both have interfaces almost identical to Cache::Memcached, so if you are using Cache::Memcached
you can switch over easily. Cache::Memcached::Fast reimplements libketama independently;
to use consistent hashing, specify the ketama_points option when creating the object.

my $memcached = Cache::Memcached::Fast->new({
    servers => ["192.168.0.1:11211","192.168.0.2:11211"],
    ketama_points => 150
});

Cache::Memcached::libmemcached, for its part, is a Perl module that uses libmemcached, the C library developed by Brian Aker.
libmemcached itself supports several distribution algorithms, including consistent hashing,
and its Perl binding supports consistent hashing as well.

Tangent Software: libmemcached


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
