Memcached: the problem of storing large data (greater than 1MB)


Memcached stores a single item of at most 1MB. If the data exceeds 1MB, both set and get return false, and this can cause performance problems.

We had previously cached our chart data, because charts account for 30% of all our SQL SELECT queries and are updated only hourly, so caching them made sense. For convenience when clearing the cache, all of a user's data was placed under a single key, and Memcached::set did not compress the data. During testing no problem was found, but once online, with only 490 people connected, the server load average drifted up to 7.9. After we removed the cache it dropped straight down to 0.59.

So memcached is not suitable for caching large data. For data over 1MB, consider compressing it on the client or splitting it across multiple keys. Large data takes a long time to load and unpack into memory, which degrades server performance.
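A minimal sketch of the split-into-multiple-keys idea, in Python with a plain dict standing in for a real memcached client (the chunk_set/chunk_get helpers and the CHUNK_SIZE constant are our own illustrative names, not a real client API):

```python
# Sketch: splitting a value larger than 1MB across multiple keys.
# A dict stands in for a real memcached client; in production you
# would call set/get on your memcached library instead.

CHUNK_SIZE = 1024 * 1024  # stay under memcached's 1MB item limit

def chunk_set(cache, key, value):
    """Store `value` as N chunks plus a chunk count under the base key."""
    chunks = [value[i:i + CHUNK_SIZE] for i in range(0, len(value), CHUNK_SIZE)]
    cache[key] = len(chunks)                  # how many chunks to read back
    for i, chunk in enumerate(chunks):
        cache[f"{key}:{i}"] = chunk

def chunk_get(cache, key):
    """Reassemble the original value from its chunks."""
    count = cache.get(key)
    if count is None:
        return None
    return b"".join(cache[f"{key}:{i}"] for i in range(count))

cache = {}
big = b"x" * (3 * 1024 * 1024 + 5)            # ~3MB, over the 1MB limit
chunk_set(cache, "report", big)
assert chunk_get(cache, "report") == big
```

Each stored piece stays under the 1MB item limit, at the cost of one extra round trip for the chunk count.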

The largest object that memcached can store is 1MB. This limit is determined by its memory allocation mechanism.

By default, memcached uses a mechanism called the slab allocator to allocate and manage memory. Before this mechanism existed, memory was allocated simply by calling malloc and free for every record. That approach, however, leads to memory fragmentation, burdens the operating system's memory manager, and in the worst case makes the operating system slower than the memcached process itself. The slab allocator was born to solve this problem: it splits the allocated memory into blocks of predetermined, specific lengths, completely resolving the fragmentation problem.

Today (2012-03-16) we re-tested the data size limit of Memcached::set. Perhaps because we use the newest version of PHP's memcached extension, set compresses data by default.

Setting the data:

$ac = new Memcached();
$data = str_repeat('a', 1024 * 1024);        // 1MB of data
$r = $ac->set('key', $data, 9999);

or:

$data = str_repeat('a', 1024 * 1024 * 100);  // 100MB of data
$r = $ac->set('key', $data, 9999);

Both the 1MB data and the 100MB data can be set successfully. I later found that Memcached::set compresses data by default. Because this is a repeated string, the compression ratio is as high as 1000x, so the 100MB of data is actually about 100KB after compression.
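The compression ratio claimed above is easy to reproduce with Python's standard zlib module (an illustration only, not the PHP extension's exact code path, though it uses the same family of algorithms):

```python
import zlib

# 1MB of a single repeated byte, like str_repeat('a', 1024*1024) in PHP.
data = b"a" * (1024 * 1024)
compressed = zlib.compress(data)

ratio = len(data) / len(compressed)
# Highly repetitive data compresses drastically; the compressed
# payload is on the order of a kilobyte, a ratio in the hundreds.
assert len(compressed) < 10_000
assert ratio > 100
```

This is why the 100MB test above slipped under the 1MB server-side limit: the client sent the compressed bytes, not the original string.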

When I set:

$ac->setOption(Memcached::OPT_COMPRESSION, 0);

the data is stored uncompressed. Then:

$data = str_repeat('a', 1024 * 1024);  // 1MB of data
$r = $ac->set('key', $data, 9999);     // fails

the 1MB set is unsuccessful.

That is, the memcached server cannot store an item larger than 1MB, but data that the client compresses down to under 1MB can be stored successfully.

Related memcached knowledge:

1. Basic memcached setup
1) Start the memcached server:

# /usr/local/bin/memcached -d -m 10 -u root -l 192.168.0.200 -p 12000 -c 256 -P /tmp/memcached.pid

-d starts memcached as a daemon;
-m sets the amount of memory allocated to memcached, in megabytes (10MB here);
-u sets the user that runs memcached (root here);
-l sets the IP address to listen on; since the server has more than one address, the server's IP 192.168.0.200 is specified here;
-p sets the port memcached listens on (12000 here; preferably a port above 1024);
-c sets the maximum number of concurrent connections (default 1024; 256 here; tune it to your server's load);
-P sets where the memcached PID file is saved (/tmp/memcached.pid here).

2) To stop the memcached process, execute:

# kill `cat /tmp/memcached.pid`

A hash algorithm maps a binary value of arbitrary length to a smaller, fixed-length binary value, called a hash value. A hash value is a unique and extremely compact numeric representation of a piece of data. If you hash a piece of plaintext and change even a single letter of it, a subsequent hash will produce a different value. It is computationally infeasible to find two different inputs that hash to the same value.
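This avalanche behavior is easy to demonstrate with any cryptographic hash, for example Python's hashlib (an illustration only; memcached clients typically use cheaper non-cryptographic hashes for key distribution):

```python
import hashlib

# Two inputs differing in exactly one letter.
h1 = hashlib.sha256(b"The quick brown fox").hexdigest()
h2 = hashlib.sha256(b"The quick brown fux").hexdigest()

assert h1 != h2          # one changed letter yields a completely different digest
assert len(h1) == 64     # the output length is fixed regardless of input length
```

The fixed-length, collision-resistant output is what makes hashing useful both for integrity checks and for distributing keys across servers.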

2. Business scenarios where memcached applies

1) If the site contains a large number of dynamic pages, the load on the database will be high. Since most database requests are reads, memcached can significantly reduce the database load.

2) If the database server's load is low but its CPU usage is high, you can cache computed results (computed objects) and rendered page templates.

3) Memcached can cache session data and temporary data to reduce write operations against the database.

4) Cache files that are small but frequently accessed.

5) Cache the results of web services (not the Web Services promoted by IBM; translator's note) or RSS feeds.

3. Business scenarios where memcached does not apply

1) The cached object is larger than 1MB

Memcached itself is not designed to handle large multimedia files (large media) or huge binary blobs (streaming huge blobs).

2) The key is longer than 250 characters

3) The hosting environment does not allow you to run a memcached service

If the application is hosted on a low-end virtual private server, virtualization technologies such as VMware and Xen are not ideal for running memcached. Memcached needs to take over and control large chunks of memory; its performance suffers if the memory it manages is swapped out by the OS or hypervisor.

4) The application runs in an insecure environment

Memcached does not provide any security mechanism; anyone with network access can reach it simply via telnet. If your application runs on a shared system, pay attention to security.

5) The business needs persistent data, or really needs a database

4. You cannot traverse all items in memcached

This operation is relatively slow and blocks other operations (it is slower than memcached's other commands). All of memcached's non-debug commands, such as add, set, get, and flush, execute in constant time no matter how much data is stored in memcached. Any command that traverses all items takes longer as the amount of data in memcached grows, and other commands block because they must wait for the traversal to complete.

5. The maximum key length memcached accepts is 250 characters

The maximum key length memcached accepts is 250 characters. Note that 250 is an internal limit of the memcached server. If your memcached client supports a "key prefix" or a similar feature, the total key length (prefix + original key) may exceed 250 characters. Shorter keys are recommended, since they save memory and bandwidth.

6. The size of a single item is limited to 1MB

Because of how the memory allocator's algorithm works.

A detailed answer:

1) Memcached's memory storage engine uses slabs to manage memory. Memory is divided into slabs containing chunks of unequal size (memory is first divided into equal-size slabs, then each slab is divided into equal-size chunks; the chunk sizes of different slabs are unequal). Chunk sizes start at a minimum value and grow by a factor until the maximum possible value is reached. If the minimum is 400B, the maximum is 1MB, and the factor is 1.20, the chunk sizes of the successive slabs are:

slab1: 400B; slab2: 480B; slab3: 576B ... The larger the chunks in a slab, the larger the gap between it and the preceding slab. Therefore, the larger the maximum value, the lower the memory utilization. Memcached must allocate memory for every slab, so if you set a smaller factor and a larger maximum, memcached will need more memory.
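The progression above can be computed directly; a small Python sketch using the article's example numbers (400B minimum, 1MB maximum, factor 1.20 — real memcached growth factors and size rounding differ slightly):

```python
def slab_chunk_sizes(minimum=400, maximum=1024 * 1024, factor=1.20):
    """Return the chunk size of each successive slab class.

    Sizes start at `minimum` and are multiplied by `factor`
    (truncated to whole bytes) until `maximum` is exceeded.
    """
    sizes = []
    size = minimum
    while size <= maximum:
        sizes.append(size)
        size = int(size * factor)
    return sizes

sizes = slab_chunk_sizes()
assert sizes[:3] == [400, 480, 576]   # matches the slab1/slab2/slab3 example
assert sizes[-1] <= 1024 * 1024       # no slab chunk exceeds the 1MB cap
```

Each stored item is placed in the smallest slab class whose chunk size fits it, which is exactly where the wasted-space trade-off described above comes from.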

2) Do not try to store large data in memcached, such as putting huge web pages in it. Loading and unpacking large data into memory takes a long time, so system performance suffers. If you really need to store more than 1MB of data, you can modify the value of POWER_BLOCK in slabs.c and recompile memcached, or fall back to the inefficient malloc/free allocator. You can also consider alternatives such as a database or MogileFS instead of memcached.

7. How does the memcached memory allocator work? Why not use malloc/free? Why use slabs?

In fact, this is a compile-time option. The internal slab allocator is used by default, and it is indeed what you should use. In the earliest versions, memcached managed memory with malloc/free alone, but that approach does not play well with the OS memory manager: repeated malloc/free calls create memory fragmentation, and the OS ends up spending more time searching for contiguous blocks of memory to satisfy malloc requests than running the memcached process itself. The slab allocator was born to solve this problem: memory is allocated once, divided into chunks, and reused. Because memory is divided into slabs of varying chunk sizes, some memory is wasted whenever an item's size does not exactly fit the slab chosen to store it.

8. Are there restrictions on an item's expiration time?

An item's relative expiration time can be at most 30 days. Memcached interprets any larger incoming expiration value as an absolute point in time (a Unix timestamp); once that point is reached, memcached marks the item invalid. This is a simple but somewhat obscure mechanism.
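The interpretation rule can be sketched as follows (the absolute_expiry helper is our own illustration of the documented behavior, not memcached code):

```python
import time

THIRTY_DAYS = 60 * 60 * 24 * 30  # memcached's cutoff for relative expirations

def absolute_expiry(exptime: int, now: float) -> float:
    """Translate a memcached exptime into an absolute point in time.

    Values up to 30 days are treated as an offset from now; anything
    larger is interpreted as an absolute Unix timestamp. 0 means
    "never expires".
    """
    if exptime == 0:
        return float("inf")
    if exptime <= THIRTY_DAYS:
        return now + exptime
    return float(exptime)

now = time.time()
assert absolute_expiry(3600, now) == now + 3600             # relative: one hour
assert absolute_expiry(int(now) + 86400, now) == int(now) + 86400  # absolute
```

The obscure part is the boundary: passing a "duration" of 31 days silently becomes a timestamp in 1970 and the item expires immediately.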

9. What is the binary protocol, and does it need attention?

The binary protocol attempts to provide a more efficient and reliable protocol for both ends, reducing the CPU time that client and server spend parsing the protocol. According to Facebook's tests, parsing the ASCII protocol is the most CPU-intensive operation in memcached.


10. Is memcached atomic?


All single commands sent to memcached are completely atomic. If you issue a set command and a get command on the same data concurrently, they do not affect each other; they are serialized and executed one after the other. Even in multithreaded mode, every individual command is atomic. However, a sequence of commands is not atomic: if you first fetch an item with get, modify it, and then set it back, memcached does not guarantee that the item was not manipulated by another process in the meantime (process here does not necessarily mean an operating-system process). Memcached 1.2.5 and later provide the gets and cas commands to solve this problem. If you query an item with gets, memcached returns a unique identifier for the item's current value. When the client has rewritten the item and wants to write it back, it sends that unique identifier to memcached with the cas command. If the item's unique identifier in memcached still matches the one you supplied, the write succeeds. If another process modified the item in the meantime, the item's unique identifier in memcached will have changed, and the write fails.
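The gets/cas interaction can be simulated with a tiny in-memory store (a sketch only; MiniCache is our own toy class, and a real client obtains the CAS token from the gets response):

```python
class MiniCache:
    """Toy store mimicking memcached's gets/cas semantics."""

    def __init__(self):
        self._data = {}    # key -> (value, cas_token)
        self._counter = 0  # monotonically increasing token source

    def set(self, key, value):
        self._counter += 1
        self._data[key] = (value, self._counter)

    def gets(self, key):
        """Return (value, cas_token), like the gets command."""
        return self._data.get(key)

    def cas(self, key, value, token):
        """Write only if the item's token has not changed since gets."""
        current = self._data.get(key)
        if current is None or current[1] != token:
            return False   # someone else modified the item in between
        self.set(key, value)
        return True

c = MiniCache()
c.set("n", 1)
value, token = c.gets("n")
c.set("n", 99)                                 # a competing writer sneaks in
assert c.cas("n", value + 1, token) is False   # our stale write is rejected
value, token = c.gets("n")                     # re-read, then retry
assert c.cas("n", value + 1, token) is True
```

The retry loop at the end is the usual client pattern: on a cas failure, re-issue gets and reapply the modification.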
