Memcached for Perl


Some notes from trying out memcached.

Data saved in memcached is stored in memcached's built-in in-memory storage. Because the data exists only in memory, restarting memcached or restarting the operating system causes all data to be lost. In addition, once memory usage reaches the configured limit, unused cache entries are automatically deleted according to the LRU (Least Recently Used) algorithm. memcached is a server designed for caching, so it gives little consideration to data persistence.
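
To make the eviction behavior concrete, the following is a minimal sketch of LRU eviction in Perl. ToyLRU is a hypothetical class written only for this illustration; it models the ordering idea and says nothing about how memcached manages memory internally.

```perl
use strict;
use warnings;

# Toy LRU cache: a hypothetical model of eviction, not memcached's internals.
package ToyLRU;

sub new {
    my ($class, $capacity) = @_;
    return bless { capacity => $capacity, data => {}, order => [] }, $class;
}

# Move $key to the "most recently used" end of the order list.
sub _touch {
    my ($self, $key) = @_;
    @{ $self->{order} } = grep { $_ ne $key } @{ $self->{order} };
    push @{ $self->{order} }, $key;
}

sub set {
    my ($self, $key, $value) = @_;
    $self->{data}{$key} = $value;
    $self->_touch($key);
    # Evict the least recently used entries once capacity is exceeded.
    while (@{ $self->{order} } > $self->{capacity}) {
        my $victim = shift @{ $self->{order} };
        delete $self->{data}{$victim};
    }
    return 1;
}

sub get {
    my ($self, $key) = @_;
    return undef unless exists $self->{data}{$key};
    $self->_touch($key);
    return $self->{data}{$key};
}

package main;

my $cache = ToyLRU->new(2);
$cache->set(a => 1);
$cache->set(b => 2);
$cache->get('a');       # "a" is now more recently used than "b"
$cache->set(c => 3);    # capacity exceeded, so "b" is evicted
for my $key (qw(a b c)) {
    printf "%s is %s\n", $key, defined($cache->get($key)) ? 'cached' : 'evicted';
}
```

Running the sketch reports "a" and "c" as cached and "b" as evicted, because "b" was the entry that had gone longest without being used.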

Although memcached is called a "distributed" cache server, the server itself has no distributed functionality. memcached instances do not communicate with one another to share information. Distribution depends entirely on the client implementation.

memcached requires the libevent library:

# yum install libevent libevent-devel

Download, build, and install memcached:

$ wget http://www.danga.com/memcached/dist/memcached-1.x.x.tar.gz
$ tar zxf memcached-1.x.x.tar.gz
$ cd memcached-1.x.x
$ ./configure
$ make
$ sudo make install

Start memcached:

$ /usr/local/bin/memcached -p 11211 -m 64m -d

Option  Description
-p      TCP port to use. The default is 11211
-m      Maximum memory size. The default is 64 MB
-vv     Start in very verbose mode, printing debug information and errors to the console
-d      Start as a daemon in the background

memcached client APIs: http://www.danga.com/memcached/apis.bml

Perl memcached clients are available as several CPAN modules, including Cache::Memcached, Cache::Memcached::Fast, and Cache::Memcached::libmemcached.

http://search.cpan.org/dist/Cache-Memcached/

#!/usr/bin/perl

use strict;
use warnings;
use Cache::Memcached;

my $key     = "foo";
my $value   = "bar";
my $expires = 3600; # 1 hour

# Connect to a memcached server on localhost
my $memcached = Cache::Memcached->new({
    servers            => ["127.0.0.1:11211"],
    compress_threshold => 10_000,
});

# Store the value only if the key does not exist yet, then read it back
$memcached->add($key, $value, $expires);
my $ret = $memcached->get($key);
print "$ret\n";

Here, the IP address of the memcached server and an option are passed to Cache::Memcached to create an instance. Commonly used Cache::Memcached options are shown below.

Option              Description
servers             memcached servers and ports, specified as an array
compress_threshold  Size threshold above which values are compressed
namespace           Prefix to add to keys

In addition, Cache::Memcached can serialize complex Perl data with the Storable module before saving it, so hashes, arrays, objects, and so on can be saved directly to memcached.

Save Data
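
As a rough illustration of the serialization round trip described above, the sketch below freezes a nested structure into a byte string with Storable and thaws it back. The sample data is made up, and the sketch does not contact a memcached server; it only shows the kind of conversion a client performs before storing a value and after fetching it.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# A nested structure that a plain string value could not hold directly.
my $user = {
    name  => 'alice',              # hypothetical sample data
    roles => [ 'admin', 'dev' ],
};

# Serialize to a byte string (what would be stored under a key),
# then deserialize it again (what would happen after a fetch).
my $frozen = nfreeze($user);
my $copy   = thaw($frozen);

printf "%s has %d roles\n", $copy->{name}, scalar @{ $copy->{roles} };
```

The thawed value is an independent copy, so changing it does not affect the original structure.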

There are three methods for saving data to memcached: add, replace, and set. All three are called in the same way:

my $add     = $memcached->add('key', 'value', 'expiration time');
my $replace = $memcached->replace('key', 'value', 'expiration time');
my $set     = $memcached->set('key', 'value', 'expiration time');

You can specify an expiration time (in seconds) when saving data to memcached. When no expiration time is specified, memcached keeps the data until it is evicted by the LRU algorithm. The three methods differ as follows:

Method   Description
add      Save only if no data with the same key exists in the storage
replace  Save only if data with the same key already exists in the storage
set      Unlike add and replace, always save

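
The difference between the three methods can be modeled without a server. The sketch below uses a plain Perl hash as a hypothetical stand-in for memcached's storage; toy_add, toy_replace, and toy_set are names invented for this example and are not part of Cache::Memcached.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for a memcached server: a plain hash.
my %store;

# Each sub returns 1 if the value was stored and 0 if it was refused.
sub toy_add {
    my ($key, $value) = @_;
    return 0 if exists $store{$key};        # add refuses existing keys
    $store{$key} = $value;
    return 1;
}

sub toy_replace {
    my ($key, $value) = @_;
    return 0 unless exists $store{$key};    # replace refuses missing keys
    $store{$key} = $value;
    return 1;
}

sub toy_set {
    my ($key, $value) = @_;
    $store{$key} = $value;                  # set always stores
    return 1;
}

printf "replace on empty store: %d\n", toy_replace('k', 'v1');  # 0
printf "add on empty store:     %d\n", toy_add('k', 'v2');      # 1
printf "add on existing key:    %d\n", toy_add('k', 'v3');      # 0
printf "replace existing key:   %d\n", toy_replace('k', 'v4');  # 1
printf "set existing key:       %d\n", toy_set('k', 'v5');      # 1
print  "final value: $store{k}\n";                              # v5
```
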
Get Data

Data can be retrieved with the get and get_multi methods.

my $val  = $memcached->get('key');
my $vals = $memcached->get_multi('key1', 'key2', 'key3', 'key4', 'key5');

Use get_multi to fetch several items at once. get_multi retrieves the values of multiple keys in a single call, which can be tens of times faster than calling get in a loop.

Delete Data

Data is deleted with the delete method, which has one distinctive feature.

$memcached->delete('key', 'blocking time (seconds)');

This deletes the data for the key given as the first argument. The second argument specifies a period during which new data cannot be saved under the same key. This feature can be used to prevent incomplete data from being cached. Note, however, that set ignores the block and saves data as usual.

Increment and Decrement Operations

A specific key's value on memcached can be used as a counter.

my $ret = $memcached->incr('key');
$memcached->add('key', 0) unless defined $ret;

Increment and decrement are atomic operations, but if no initial value has been set, the value is not automatically initialized to 0. You should therefore check for errors and add an initialization step where necessary.

Cache::Memcached's Distribution Method

Cache::Memcached, a Perl memcached client library, was written by Brad Fitzpatrick, the author of memcached, and can be considered the original client library.

Cache::Memcached - search.cpan.org

This library implements distribution on the client side, and its approach is the standard memcached distribution method.

Distribution Based on the Remainder

Put simply, Cache::Memcached distributes keys "according to the remainder of division by the number of servers." It computes an integer hash value of the key, divides it by the number of servers, and selects the server based on the remainder.

The following Perl script is a simplified version of what Cache::Memcached does.

use strict;
use warnings;
use String::CRC32;

my @nodes = ('node1', 'node2', 'node3');
my @keys  = ('tokyo', 'kanagawa', 'chiba', 'saitama', 'gunma');

foreach my $key (@keys) {
    my $crc    = crc32($key);             # CRC value of the key
    my $mod    = $crc % ($#nodes + 1);    # remainder by the number of servers
    my $server = $nodes[$mod];            # select server based on remainder
    printf "%s => %s\n", $key, $server;
}

Cache::Memcached uses CRC to compute the hash value.

String::CRC32 - search.cpan.org

The script first computes the CRC value of the string, then determines the server from the remainder of dividing that value by the number of server nodes. Running the code above prints the following:

tokyo => node2
kanagawa => node3
chiba => node2
saitama => node1
gunma => node1

According to the results, "tokyo" is distributed to node2, "kanagawa" to node3, and so on. As an aside, when a connection to the selected server fails, Cache::Memcached adds the number of connection attempts to the key, computes the hash value again, and tries to connect to the resulting server. This behavior is called rehashing. If you do not want rehashing, specify the no_rehash => 1 option when creating the Cache::Memcached object.

Disadvantages of Remainder-Based Distribution

Remainder-based distribution is simple and spreads data well, but it also has a drawback: when servers are added or removed, the cost of reorganizing the cache is very high. Adding a server changes the remainders drastically, so a key can no longer be fetched from the server it was saved to, which lowers the cache hit rate. A short Perl script can verify this cost.

use strict;
use warnings;
use String::CRC32;

my @nodes = @ARGV;          # server names from the command line
my @keys  = ('a' .. 'z');
my %nodes;                  # server name => list of keys assigned to it

foreach my $key (@keys) {
    my $hash   = crc32($key);
    my $mod    = $hash % ($#nodes + 1);
    my $server = $nodes[$mod];
    push @{ $nodes{$server} }, $key;
}

foreach my $node (sort keys %nodes) {
    printf "%s: %s\n", $node, join ",", @{ $nodes{$node} };
}

This Perl script simulates saving the keys "a" through "z" to memcached and accessing them. Save it as mod.pl and run it.

First, with only three servers:

$ mod.pl node1 node2 node3
node1: a,c,d,e,h,j,n,u,w,x
node2: g,i,k,l,p,r,s,y
node3: b,f,m,o,q,t,v,z

As a result, node1 holds a, c, d, e, ..., node2 holds g, i, k, ..., and each server holds 8 to 10 keys.

Next, add a memcached server.

$ mod.pl node1 node2 node3 node4
node1: d,f,m,o,t,v
node2: b,i,k,p,r,y
node3: e,g,l,n,u,w
node4: a,c,h,j,q,s,x,z

node4 has been added. As you can see, only d, i, k, p, r, and y are hits. In this way, the servers that keys are distributed to change drastically after a node is added. Only 6 of the 26 keys are still on their original server; all the others have moved to a different server. The hit rate drops to 23%. When memcached is used in a web application, cache efficiency drops sharply the moment a memcached server is added, load concentrates on the database server, and there is a risk that normal service cannot be provided.
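
The 23% figure can be rechecked from the two listings. The sketch below copies the key placements from the mod.pl output and counts the keys that stay on the same server; owner_of is a helper name invented for this example. The last lines add a small piece of arithmetic as a cross-check, assuming uniformly distributed hash values: a key stays put only when its hash leaves the same remainder modulo 3 and modulo 4.

```perl
use strict;
use warnings;

# Key placements copied from the mod.pl output for three and four servers.
my %before = (
    node1 => [qw(a c d e h j n u w x)],
    node2 => [qw(g i k l p r s y)],
    node3 => [qw(b f m o q t v z)],
);
my %after = (
    node1 => [qw(d f m o t v)],
    node2 => [qw(b i k p r y)],
    node3 => [qw(e g l n u w)],
    node4 => [qw(a c h j q s x z)],
);

# Invert "node => keys" into "key => node" for easy comparison.
sub owner_of {
    my ($layout) = @_;
    my %owner;
    for my $node (keys %$layout) {
        $owner{$_} = $node for @{ $layout->{$node} };
    }
    return \%owner;
}

my ($old, $new) = (owner_of(\%before), owner_of(\%after));
my @hits   = grep { $old->{$_} eq $new->{$_} } sort keys %$old;
my $n_hits = @hits;
my $total  = keys %$old;

printf "still on the same server: %s\n", join ',', @hits;            # d,i,k,p,r,y
printf "hit rate: %d/%d = %.0f%%\n", $n_hits, $total, 100 * $n_hits / $total;  # 6/26 = 23%

# h % 3 == h % 4 holds only when h % 12 is 0, 1, or 2.
my $same = grep { $_ % 3 == $_ % 4 } 0 .. 11;
printf "expected share that stays: %d/12\n", $same;                  # 3/12
```

The measured 6 of 26 is close to the 3 in 12 (25%) that the remainder arithmetic predicts for going from three servers to four.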

mixi's web applications had this problem as well, which made it impossible to add memcached servers. With a new distribution method, however, memcached servers can now be added easily. This distribution method is called consistent hashing.

Consistent Hashing

The idea behind consistent hashing has already been introduced in many places, including the mixi development blog, so it is explained only briefly here.

mixi Engineers' Blog - A Comfortable Cache Life with Smart Distribution
ConsistentHashing - The Consistent Hashing Method

A Brief Explanation of Consistent Hashing

Consistent hashing works as follows. First, compute the hash value of each memcached server (node) and place it on a circle (continuum) running from 0 to 2^32. Next, compute the hash value of the key under which data is stored, using the same method, and map it onto the circle. Then search clockwise from the point the key maps to, and save the data to the first server found. If no server is found before passing 2^32, the data is saved to the first memcached server on the circle.

Figure 4 Consistent hashing: basic principle
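
The lookup just described can be sketched in a few lines of Perl. This is an illustration under stated assumptions rather than the algorithm of any particular client: the point on the circle is taken from the first four bytes of an MD5 digest, and point_for, build_continuum, node_for_point, and node_for are names invented for this example.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Map a string to a point on the 0 .. 2^32-1 circle. Taking the first four
# bytes of MD5 is an assumption of this sketch, not something memcached requires.
sub point_for {
    my ($string) = @_;
    return unpack 'N', md5($string);
}

# Build the continuum: [point, node] pairs sorted by point.
sub build_continuum {
    my (@nodes) = @_;
    my @pairs = map { [ point_for($_), $_ ] } @nodes;
    return sort { $a->[0] <=> $b->[0] } @pairs;
}

# Walk clockwise from $point to the first server at or after it. If the end
# of the circle is passed without finding one, wrap to the first server.
sub node_for_point {
    my ($point, @continuum) = @_;
    for my $entry (@continuum) {
        return $entry->[1] if $entry->[0] >= $point;
    }
    return $continuum[0][1];
}

# Hash the key with the same function as the servers, then look it up.
sub node_for {
    my ($key, @continuum) = @_;
    return node_for_point(point_for($key), @continuum);
}

my @continuum = build_continuum(qw(node1 node2 node3));
for my $key (qw(tokyo kanagawa chiba saitama gunma)) {
    printf "%s => %s\n", $key, node_for($key, @continuum);
}
```

With only one point per server, the arcs between servers can differ greatly in length, so the keys are not necessarily spread evenly in this basic form.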

Now add a memcached server to the state shown in the figure above. With remainder-based distribution, the server that holds each key changes drastically, which hurts the cache hit rate. With consistent hashing, only the keys between the point where the server is added on the continuum and the first server counterclockwise from it are affected.

Figure 5 Consistent hashing: adding a server

Consistent hashing therefore keeps the redistribution of keys to a minimum. In addition, some implementations of consistent hashing adopt the idea of virtual nodes. With an ordinary hash function, the positions that servers map to are distributed very unevenly. Virtual nodes address this by assigning 100 to 200 points on the continuum to each physical node (server). This suppresses the uneven distribution and minimizes cache redistribution when servers are added or removed.
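
The effect of virtual nodes can be tried out with a short simulation. The sketch below places 150 points per server on the continuum, assigns 2000 made-up keys, and then adds node4 to see which keys move. The MD5-based point function and the "name-index" labels for virtual nodes are assumptions of this example and do not reproduce libketama's exact scheme.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# First four bytes of MD5 as a 32-bit point (an assumption of this sketch).
sub point_for { return unpack 'N', md5($_[0]) }

# Place $points virtual nodes per server on the continuum. The "name-index"
# labels are a hypothetical naming scheme chosen for this example.
sub build_continuum {
    my ($points, @nodes) = @_;
    my @continuum;
    for my $node (@nodes) {
        push @continuum, [ point_for("$node-$_"), $node ] for 1 .. $points;
    }
    return sort { $a->[0] <=> $b->[0] } @continuum;
}

# Binary search for the first virtual node at or after the key's point.
sub node_for {
    my ($key, $continuum) = @_;
    my $point = point_for($key);
    my ($lo, $hi) = (0, scalar @$continuum);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($continuum->[$mid][0] < $point) { $lo = $mid + 1 }
        else                                { $hi = $mid }
    }
    return $continuum->[ $lo % @$continuum ][1];   # wrap past the last point
}

my @keys  = map { "key$_" } 1 .. 2000;             # made-up sample keys
my @three = build_continuum(150, qw(node1 node2 node3));
my @four  = build_continuum(150, qw(node1 node2 node3 node4));

my %count;
my ($moved, $moved_elsewhere) = (0, 0);
for my $key (@keys) {
    my $old = node_for($key, \@three);
    my $new = node_for($key, \@four);
    $count{$old}++;
    next if $old eq $new;
    $moved++;
    $moved_elsewhere++ if $new ne 'node4';         # should never happen
}

printf "%s holds %d keys\n", $_, $count{$_} for sort keys %count;
printf "moved after adding node4: %d of %d\n", $moved, scalar @keys;
printf "moved to a server other than node4: %d\n", $moved_elsewhere;
```

Because the points of the existing servers do not change, every key that moves goes to node4 and no key moves between the original three servers.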

According to tests with a memcached client library that uses the consistent hashing algorithm, described later in this article, the hit rate after adding servers can be calculated from the number of existing servers (n) and the number of added servers (m). The share of keys that move to a different server is

(1 - n/(n+m)) * 100

percent, so the hit rate after the addition is n/(n+m) * 100 percent.

Libraries That Support Consistent Hashing

Although Cache::Memcached, which has been covered so far in this series, does not support consistent hashing, several client libraries already support this new distribution algorithm. The first memcached client library to support consistent hashing and virtual nodes was libketama, a library developed by Last.fm for use from PHP.

libketama - a consistent hashing algo for memcache clients - RJ's blog - Users at Last.fm

As for Perl clients, Cache::Memcached::Fast and Cache::Memcached::libmemcached, introduced in the first installment of this series, support consistent hashing.

Cache::Memcached::Fast - search.cpan.org
Cache::Memcached::libmemcached - search.cpan.org

Both have interfaces almost identical to Cache::Memcached, so if you are already using Cache::Memcached, switching is easy. Cache::Memcached::Fast provides its own implementation of libketama-style consistent hashing; to use it, specify the ketama_points option when creating the object.

my $memcached = Cache::Memcached::Fast->new({
    servers       => ["192.168.0.1:11211", "192.168.0.2:11211"],
    ketama_points => 150,
});

In addition, Cache::memcached::libmemcached is a Perl module that uses the C function library libmemcached developed by Brain Aker. The libmemcached itself supports several distributed algorithms, as well as consistent hashing, whose Perl bindings also support consistent hashing. Tangent software:libmemcached