Memcached: the problem of storing large amounts of data

Source: Internet
Author: User
Tags: cas, memcached, php, virtual private server

Memcached and the problem of storing big data, by Huangguisu

Memcached limits a single stored item to 1 MB. If the data exceeds 1 MB, both set and get return false, and the attempt causes performance problems.

We used to cache our leaderboard data. The leaderboard accounted for 30% of all our SQL SELECT queries, and since our leaderboards are updated only once an hour, caching the data was essential.

To make clearing the cache convenient, we put all the users' data under the same key. Because Memcached::set did not compress the data at that time, no problem showed up in testing. Once we went live, however, the results were different: with only 490 people online, the server load average climbed to 7.9. After we removed the cache, it dropped to 0.59.

So memcached is not suitable for caching big data: avoid keys whose data exceeds 1 MB.

The largest object memcached supports is 1 MB. This value is determined by its memory allocation mechanism.

By default, memcached uses a mechanism called the slab allocator to allocate and manage memory. Before this mechanism appeared, memory was allocated simply by calling malloc and free.

However, that approach leads to memory fragmentation and aggravates the burden on the operating system's memory manager; in the worst case, it makes the operating system slower than the memcached process itself. The slab allocator was born to solve this problem.

The basic principle of the slab allocator is to cut the allocated memory into blocks of predetermined, specific lengths, which completely resolves the memory fragmentation problem.

Today (2012-03-16) we tested the Memcached::set data size limit again. Because we use the latest version of the PHP memcached extension, set compresses data by default. Setting the data:

$ac = new Memcached();
$data = str_repeat('a', 1024 * 1024);        // 1 MB of data
$r = $ac->set('key', $data, 9999);
// or
$data = str_repeat('a', 1024 * 1024 * 100);  // 100 MB of data
$r = $ac->set('key', $data, 9999);

Both the 1 MB and the 100 MB data could be set successfully.

Later I found that the memcached extension compresses data by default when setting it.

Because this payload is a repeating string, the compression ratio is as high as roughly 1000:1, so the 100 MB of data is actually only about 100 KB after compression.

$ac->setOption(Memcached::OPT_COMPRESSION, 0); // store data without compression
$data = str_repeat('a', 1024 * 1024);          // 1 MB of data
$r = $ac->set('key', $data, 9999);             // the 1 MB set now fails
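The effect of compression on a highly repetitive payload can be illustrated outside PHP as well. This is a minimal Python sketch using zlib (an assumption for illustration; the PHP extension's actual compressor may differ), showing that 1 MB of repeated characters shrinks far below the 1 MB item limit:

```python
import zlib

data = b"a" * (1024 * 1024)  # 1 MB of highly repetitive data
compressed = zlib.compress(data)

print(len(data))                     # 1048576
print(len(compressed))               # roughly a kilobyte: far below the 1 MB limit
print(len(data) // len(compressed))  # compression ratio on the order of 1000x
```

This is why the earlier 100 MB set "succeeded": the server never saw anything close to 100 MB.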

This means that the memcached server itself cannot store an item larger than 1 MB; the client-side compression merely hid the limit.

Related memcached knowledge:

1. Basic memcached settings
1) Starting the memcached server

# /usr/local/bin/memcached -d -m 10 -u root -l 192.168.0.200 -p 12000 -c 256 -P /tmp/memcached.pid

-d starts memcached as a daemon.
-m is the amount of memory allocated to memcached, in megabytes; here it is 10 MB.
-u is the user that runs memcached; here it is root.
-l is the server IP address to listen on; if the machine has multiple addresses, specify one. Here it is 192.168.0.200.
-p is the port memcached listens on; here it is 12000. Preferably use a port above 1024.
-c is the maximum number of concurrent connections; the default is 1024. Here it is 256; set it according to your server's load.
-P sets the file in which to save the memcached PID; here it is /tmp/memcached.pid.

2) To stop the memcached process, run:

# kill `cat /tmp/memcached.pid`

A hashing algorithm maps a binary value of arbitrary length to a small, fixed-length binary value, called a hash value. A hash value is a unique and extremely compact numeric representation of a piece of data. If you hash a piece of plain text and change even a single letter of it, the subsequent hash will produce a different value. It is computationally infeasible to find two different inputs that hash to the same value.
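The single-letter-change property described above is easy to demonstrate (a minimal Python sketch using the standard hashlib module; SHA-256 is chosen for illustration):

```python
import hashlib

h1 = hashlib.sha256(b"memcached stores items").hexdigest()
h2 = hashlib.sha256(b"memcached stores itemz").hexdigest()  # one letter changed

print(h1)
print(h2)
print(h1 != h2)  # True: a single-character change yields a completely different hash
```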

2. Which business scenarios suit memcached?

1) If the site serves dynamic web pages to a very large number of visitors, the load on the database will be high. Because most database requests are reads, memcached can significantly reduce the database load.

2) If the database server is under relatively low load but CPU usage is high, you can cache computed results (computed objects) and rendered page templates (rendered templates).

3) Use memcached to cache session data and temporary data, reducing write operations against the database.

4) Cache small files that are visited frequently.

5) Cache the results of web services (lowercase, not the capital-letter Web Services advertised by IBM; translator's note) or RSS feeds.

3. Which business scenarios do not suit memcached?

1) The cached object is larger than 1 MB

Memcached itself is not designed to handle huge multimedia files (large media) or streaming huge binary blobs.

2) The key is longer than 250 characters

3) The hosting environment does not allow running a memcached service

If the application itself is hosted on a low-end virtual private server, virtualization technologies such as VMware or Xen are not suitable for running memcached. Memcached needs to take over and control large chunks of memory; if the memory managed by memcached is swapped out by the OS or hypervisor, its performance will suffer.

4) The application runs in an insecure environment

Memcached provides no security mechanism: anyone who can reach the port, for example via telnet, has full access to the data.

If the application runs on a shared system, you need to pay attention to security issues.

5) The business needs persistent data, or what it really needs is a database

4. You cannot traverse all the items in memcached

Traversal is relatively slow and blocks other operations (it is slower than memcached's other commands). All of memcached's non-debug commands, such as add, set, get, and flush, run in constant time regardless of how much data is stored.

A command that traverses all items, by contrast, takes longer as the amount of data in memcached grows. While it runs, the other commands cannot execute because they must wait for the traversal to finish, so blocking occurs.

5. The maximum length of a key that memcached accepts is 250 characters

Note that 250 is an internal limit of the memcached server. If the memcached client you use supports "key prefixes" or a similar feature, the combined length of the key (prefix + original key) can exceed 250 characters on the client side. Shorter keys are recommended, since they save memory and bandwidth.
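A client-side guard for this limit can be sketched as follows (the prefix scheme and the helper name are illustrative, not part of any real client API):

```python
MAX_KEY_LEN = 250  # memcached server's internal key-length limit

def make_key(prefix: str, key: str) -> str:
    """Join a namespace prefix and a key, refusing anything the server would reject."""
    full = prefix + key
    if len(full) > MAX_KEY_LEN:
        raise ValueError(f"key too long for memcached: {len(full)} > {MAX_KEY_LEN}")
    return full

print(make_key("leaderboard:", "user:42"))  # leaderboard:user:42
```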

6. The size of a single item is limited to 1 MB

This is a consequence of the memory allocator's algorithm.

In detail:

1) Memcached's memory storage engine uses slabs to manage memory. Memory is first divided into slabs of equal size, and each slab is then divided into chunks of equal size; chunks in different slabs have different sizes. Chunk sizes start from a minimum and grow by a factor until the maximum possible value is reached. Suppose the minimum is 400 B, the maximum is 1 MB, and the factor is 1.20; the chunk sizes of the slabs are then:

slab1: 400 B; slab2: 480 B; slab3: 576 B; and so on. The larger a slab's chunks, the greater the gap between it and the previous slab, so the larger the maximum value, the lower the memory utilization. Memcached must pre-allocate memory for each slab, so setting a smaller factor and a larger maximum value requires memcached to set aside more memory.
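The chunk-size progression above can be reproduced with a short calculation (a sketch of the growth rule only, not memcached's actual slabs.c code, which also rounds sizes for alignment):

```python
def chunk_sizes(minimum=400, maximum=1024 * 1024, factor=1.20):
    """List slab chunk sizes: start at `minimum` and multiply by `factor` up to `maximum`."""
    sizes = []
    size = float(minimum)
    while size <= maximum:
        sizes.append(round(size))
        size *= factor
    return sizes

sizes = chunk_sizes()
print(sizes[:3])  # [400, 480, 576], matching slab1..slab3 above
```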

2) Do not try to store very large data in memcached, such as putting huge web pages into it. Loading and unpacking big data into memory takes a long time, so the system performs poorly. If you really do need to store more than 1 MB of data, you can change the value of POWER_BLOCK in slabs.c and recompile memcached, or use the inefficient malloc/free. Alternatively, replace memcached with a database, MogileFS, or another solution.

7. How does the memcached memory allocator work? Why not just use malloc/free? Why use slabs?

Actually, this is a compile-time option. The internal slab allocator is used by default, and the built-in slab allocator is indeed the right choice. In its earliest versions, memcached used only malloc/free to manage memory, but this approach does not cooperate well with the OS's memory management. Repeated malloc/free causes memory fragmentation, and the OS ends up spending a lot of time searching for contiguous memory blocks to satisfy malloc requests instead of running the memcached process. The slab allocator was born to solve this problem. Memory is allocated once and divided into chunks, which are reused repeatedly. Because memory is divided into slabs of different sizes, if an item's size does not fit the slab chosen to store it particularly well, some memory is wasted.
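The per-item waste from best-fit chunk selection can be sketched like this (the chunk sizes are the illustrative 400 B / 480 B / 576 B progression from section 6, not a real server's configuration):

```python
def best_fit_waste(item_size, chunk_sizes):
    """Pick the smallest chunk that holds the item and report the bytes left unused."""
    for chunk in sorted(chunk_sizes):
        if chunk >= item_size:
            return chunk - item_size
    raise ValueError("item larger than the biggest chunk")

chunks = [400, 480, 576]
print(best_fit_waste(410, chunks))  # 70: a 410-byte item occupies a 480-byte chunk
```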

8. What restrictions apply to an item's expiration time?

An item's relative expiration time can be at most 30 days; a larger value is interpreted by memcached as an absolute point in time (a Unix timestamp). Once that point in time arrives, memcached marks the item as expired. This is a simple but sometimes obscure mechanism.
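The 30-day rule can be made concrete with a small sketch of how a client helper might normalize expiry values (the helper is illustrative; the server applies the equivalent rule internally):

```python
import time

THIRTY_DAYS = 30 * 24 * 60 * 60  # 2,592,000 seconds

def normalize_expiry(expire, now=None):
    """Values up to 30 days are relative offsets; larger values are absolute Unix timestamps."""
    now = int(time.time()) if now is None else now
    if expire == 0:
        return 0                # never expires (until evicted)
    if expire <= THIRTY_DAYS:
        return now + expire     # relative: seconds from now
    return expire               # absolute: already a Unix timestamp

print(normalize_expiry(3600, now=1_000_000))           # 1003600: one hour from "now"
print(normalize_expiry(1_331_856_000, now=1_000_000))  # unchanged: treated as a timestamp
```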

9. What is the binary protocol, and what deserves attention?

The binary protocol attempts to provide a more efficient and reliable protocol, reducing the CPU time the client and server spend parsing the protocol.

According to Facebook's tests, parsing the ASCII protocol is where memcached spends the most CPU time.


10. Is memcached atomic?

Every individual command sent to memcached is completely atomic. If you send a set and a get for the same data at the same time, they do not affect each other: they are serialized and executed one after the other.

Even in multithreaded mode, all commands are atomic. Command sequences, however, are not. If you first fetch an item with get, modify it, and then set it back, the system does not guarantee that the item has not been touched by another process in the meantime (process in the loose sense, not necessarily an operating-system process). Memcached 1.2.5 and higher provide the gets and cas commands, which solve this problem.

If you fetch an item with gets, memcached returns a unique identifier for the item's current value. If the client program has modified the item and wants to write it back, it sends that unique identifier together with the cas command. If the identifier stored in memcached still matches the one you provide, the write succeeds; if another process has changed the item in the meantime, the identifier stored in memcached will have changed, and the write fails.
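The check-and-set handshake can be simulated with an in-memory toy store (a behavioral sketch of the gets/cas semantics, not a client for a real memcached server):

```python
class ToyCache:
    """In-memory sketch of memcached's gets/cas semantics using a version counter."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def set(self, key, value):
        _, version = self._data.get(key, (None, 0))
        self._data[key] = (value, version + 1)

    def gets(self, key):
        return self._data[key]  # (value, unique identifier)

    def cas(self, key, value, version):
        if self._data.get(key, (None, 0))[1] != version:
            return False  # someone else changed the item: write rejected
        self._data[key] = (value, version + 1)
        return True

cache = ToyCache()
cache.set("counter", 1)
value, token = cache.gets("counter")
cache.set("counter", 99)                       # a competing writer sneaks in
print(cache.cas("counter", value + 1, token))  # False: the token is stale
```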


To learn more about memcached's memory allocation mechanism, see:

http://cjjwzs.javaeye.com/blog/762453
