MySQL High performance memcached (2)

Last Update:2014-12-19 Source: Internet

Author: User

Tags memcached unique id


This article describes the issues to be aware of when deploying memcached, and the memcached distribution algorithms.

Whether your system is new or has been online for a long time, memcached is simple to configure. Before you configure it, however, be aware of the following issues:

1. memcached is only a caching mechanism. It should not be used to store information that you cannot otherwise afford to lose and then reload from a different location.
2. There is no security built into the memcached protocol. At a minimum, make sure that the servers running memcached are accessible only from inside your network, and that the network ports being used are blocked (using a firewall or similar). If any of the information stored on the memcached servers is sensitive, encrypt it before storing it in memcached.
3. memcached does not provide any sort of failover, because there is no communication between different memcached instances. If an instance fails, your application must be capable of removing it from the list, reloading the data, and then writing the data to another memcached instance.
4. Latency between the clients and memcached can be a problem if you are using different physical machines for these tasks. If you find that latency is a problem, move the memcached instances onto the client machines.
5. Key length is determined by the memcached server. The default maximum key size is 250 bytes.
6. Try to use at least two memcached instances, especially for multiple clients, to avoid having a single point of failure. Ideally, create as many memcached nodes as possible. When adding and removing memcached instances from a pool, the hashing and distribution of key/value pairs could be affected.
7. Use namespaces. The memcached cache is a very simple, massive key/value storage system, and as such there is no way of compartmentalizing data automatically into different sections. For example, if you are storing information by the unique ID returned from a MySQL database, then storing the data from two different tables could run into issues because the same ID might be valid in both tables.
   Some interfaces provide an automated mechanism for creating namespaces when storing information into the cache. In practice, these namespaces are merely a prefix before a given ID that is applied every time a value is stored in or retrieved from the cache.
   You can implement the same basic principle by using keys that describe the object and include the unique identifier that you supply when the object is stored. For example, when storing user data, prefix the ID of the user with user: or user-.
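
The namespace prefix idea can be sketched in a few lines of Python. This is only an illustration: the helper name make_key and the "order" namespace are hypothetical and do not belong to any memcached client library. The 250-byte check reflects memcached's default key limit.

```python
def make_key(namespace, object_id):
    """Build a cache key such as 'user:42' from a namespace prefix and an ID.

    Hypothetical helper for illustration; not part of any client library.
    """
    key = f"{namespace}:{object_id}"
    # memcached keys must not contain whitespace, and the default
    # maximum key length is 250 bytes.
    if any(ch.isspace() for ch in key):
        raise ValueError("key must not contain whitespace")
    if len(key.encode("utf-8")) > 250:
        raise ValueError("key exceeds the default 250-byte limit")
    return key


# The same numeric ID from two different tables no longer collides.
user_key = make_key("user", 42)
order_key = make_key("order", 42)
```

Because every read and write goes through the same helper, the prefix is applied consistently, which is the property the automated namespace mechanisms provide.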

Memcached distribution algorithms:
The memcached client interface supports a number of different distribution algorithms that are used in multi-server configurations to determine which host should be used when setting or getting data from a given memcached instance. When you get or set a value, a hash is constructed from the supplied key and then used to select a host from the list of configured servers. Because the hashing mechanism uses the supplied key as the basis for the hash, the same server is selected during both set and get operations.

You can think of this process as follows. Given an array of servers (a, b, and c), the client uses a hashing algorithm that returns an integer based on the key being stored or retrieved. The resulting value is then used to select a server from the list of servers configured in the client. Most standard client hashing within memcached clients uses a simple modulus calculation on the value against the number of configured memcached servers. You can summarize the process in pseudocode as:

    @memcservers = ['a.memc', 'b.memc', 'c.memc'];
    $value = hash($key);
    $chosen = $value % length(@memcservers);

Replacing the above with values:

    @memcservers = ['a.memc', 'b.memc', 'c.memc'];
    $value = hash('myID');
    $chosen = 7009 % 3;

In the above example, the client hashing algorithm chooses the server at index 1 (7009 % 3 = 1), and stores or retrieves the key and value with that server.
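
The same selection can be written as a small runnable Python sketch. It assumes CRC32 as the hash function purely for illustration. Real clients choose their own hash, so the index picked for 'myID' here will not necessarily match the 7009 used in the text.

```python
import zlib
from typing import Sequence


def choose_server(key: str, servers: Sequence[str]) -> str:
    """Pick a server by hashing the key and taking the modulus."""
    # zlib.crc32 returns a stable unsigned 32-bit integer in Python 3.
    hash_value = zlib.crc32(key.encode("utf-8"))
    return servers[hash_value % len(servers)]


servers = ["a.memc", "b.memc", "c.memc"]
chosen = choose_server("myID", servers)
print(f"myID is stored on {chosen}")
```

Calling choose_server again with the same key and the same list always returns the same host, which is why set and get operations agree on where a value lives.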

Using this method provides a number of advantages:
• The hashing and selection of the server to contact is handled entirely within the client. This eliminates the need to perform network communication to determine the right machine to contact.
• Because the determination of the memcached server occurs entirely within the client, the server can be selected automatically regardless of the operation being executed (set, get, increment, etc.).
• Because the determination is handled within the client, the hashing algorithm returns the same value for a given key; values are not affected or reset by differences in the server environment.
• Selection is very fast. The hashing algorithm on the key value is quick, and the resulting selection of the server is from a simple array of available machines.
• Using client-side hashing simplifies the distribution of data over each memcached server. Natural distribution of the values returned by the hashing algorithm means that keys are automatically spread over the available servers.

Provided that the list of servers configured within the client remains the same, the same stored key returns the same hash value, and therefore selects the same server.

However, if you do not use the same hashing mechanism on every client, then the same data could be recorded on different servers by different interfaces, both wasting space on your memcached servers and leading to potential differences in the information.

The problem with client-side selection of the server is that the list of servers (including their sequential order) must remain consistent on each client using the memcached servers, and the servers must be available. Problems arise if you try to perform an operation on a key when:
• A new memcached instance has been added to the list of available instances
• A memcached instance has been removed from the list of available instances
• The order of the memcached instances has changed

When the hashing algorithm is used on the given key but with a different list of servers, the hash calculation may choose a different server from the list.

If a new memcached instance is added into the list of servers, for example new.memc, then a GET operation using the same key, myID, can result in a cache miss. The same hash value is computed from the key, but it is applied to a different list: the selected index (index 2, say) may now point to the new server rather than to c.memc, where the data was originally stored. The result is a cache miss, even though the key exists within the cache on another memcached instance.

This means that servers c.memc and new.memc can both contain information for key myID, but the information stored against the key may be different on each server. A more significant problem is a much higher number of cache misses when retrieving data, as the addition of a new server changes the distribution of keys, and this in turn requires rebuilding the cached data on the memcached instances, causing an increase in database reads.
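
To get a feel for how many keys are affected, the following Python sketch counts the keys that change server when new.memc is inserted into a three-server list under modulus hashing. The MD5-based hash and the generated user:N keys are assumptions made for the illustration, not the behavior of any particular client.

```python
import hashlib


def choose_server(key, servers):
    """Modulus selection using the first 4 bytes of an MD5 digest."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


old_servers = ["a.memc", "b.memc", "c.memc"]
new_servers = ["a.memc", "b.memc", "new.memc", "c.memc"]

# Hypothetical sample of keys; any large set of distinct keys would do.
keys = [f"user:{i}" for i in range(10_000)]
moved = sum(
    1 for key in keys
    if choose_server(key, old_servers) != choose_server(key, new_servers)
)
moved_fraction = moved / len(keys)
print(f"{moved_fraction:.0%} of keys now map to a different server")
```

With a uniformly distributed hash, a key keeps its server in this configuration only when the hash value modulo 12 is 0, 1, or 11, so about three quarters of the keys move even though only one server was added.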

The same effect can occur if you actively manage the list of servers configured in your clients, adding and removing the configured memcached instances as each instance is identified as being available. For example, removing a memcached instance when the client notices that the instance can no longer be contacted can cause the server selection to fail as described here.

To prevent this causing significant problems and invalidating your cache, you can select the hashing algorithm used to select the server. There are two common types of hashing algorithm, consistent and modula.

With consistent hashing algorithms, the same key when applied to a list of servers uses the same server to store or retrieve the keys, even if the list of configured servers changes. This means that you can add and remove servers from the configured list and, subject to the limitations described next, keep using the same server for a given key. There are two types of consistent hashing algorithms available, Ketama and Wheel. Both types are supported by libmemcached, and implementations are available for PHP and Java.

Any consistent hashing algorithm has some limitations. When you add servers to an existing list of configured servers, keys are distributed to the new servers as part of the normal distribution. When you remove servers from the list, the keys are re-allocated to another server within the list, meaning that the cache needs to be re-populated with the information. Also, a consistent hashing algorithm does not resolve the issue where you want consistent selection of a server across multiple clients, but where each client contains a different list of servers. The consistency is enforced only within a single client.

With a modula hashing algorithm, the client selects a server by first computing the hash and then choosing a server from the list of configured servers. As the list of servers changes, so the server selected when using a modula hashing algorithm also changes. The result is the behavior described above: changes to the list of servers mean that different servers are selected when retrieving data, leading to cache misses and an increase in database load as the cache is re-seeded with information.

If you use only a single memcached instance for each client, or your list of memcached servers configured for a client never changes, then the selection of a hashing algorithm is irrelevant, as it has no noticeable effect.

If you change your servers regularly, or you use a common set of servers that is shared among a large number of clients, then using a consistent hashing algorithm should help to ensure that your cache data is not duplicated and that the data is evenly distributed.
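
The difference between the two approaches can be illustrated with a minimal hash ring in Python. This is a generic consistent-hashing sketch, not the Ketama or Wheel implementation: the HashRing class, the 100 points per server, and the MD5-based hash are all assumptions chosen for the example.

```python
import bisect
import hashlib


def ring_hash(value):
    """Map a string to a 32-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Toy consistent-hash ring; each server owns many points on the ring."""

    def __init__(self, servers, points_per_server=100):
        self._ring = sorted(
            (ring_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._points = [point for point, _ in self._ring]

    def choose_server(self, key):
        # Walk clockwise to the first point after the key's position,
        # wrapping around at the end of the ring.
        index = bisect.bisect(self._points, ring_hash(key)) % len(self._ring)
        return self._ring[index][1]


keys = [f"user:{i}" for i in range(10_000)]
old_ring = HashRing(["a.memc", "b.memc", "c.memc"])
new_ring = HashRing(["a.memc", "b.memc", "new.memc", "c.memc"])

moved_keys = [
    key for key in keys
    if old_ring.choose_server(key) != new_ring.choose_server(key)
]
moved_fraction = len(moved_keys) / len(keys)
print(f"{moved_fraction:.0%} of keys moved after adding new.memc")
```

In this sketch, the only keys that change server are the ones taken over by new.memc; every other key stays where it was. That is the property that keeps most of the cache valid when the server list changes.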

Memory Allocation within memcached:
When you first start memcached, the memory that you have configured is not automatically allocated. Instead, memcached only starts allocating and reserving physical memory once you start saving information into the cache.

When you start to store data into the cache, memcached does not allocate the memory for the data on an item-by-item basis. Instead, slab allocation is used to optimize memory usage and prevent memory fragmentation when information expires from the cache.

With slab allocation, memory is reserved in blocks of 1MB. Each slab is divided into a number of blocks of equal size. When you try to store a value into the cache, memcached checks the size of the value being added and determines which slab contains the right size allocation for the item. If a slab with the item size already exists, the item is written to a block within the slab.

If the new item is bigger than the size of any existing blocks, then a new slab is created, divided up into blocks of a suitable size. If an existing slab with the right block size already exists, but there are no free blocks, a new slab is created. If you update an existing item with data that is larger than the existing block allocation for that key, then the key is re-allocated into a suitable slab.
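
The allocation rules above can be modeled with a toy Python bookkeeping class. It only counts slabs and free blocks and is not how memcached manages memory internally. The SlabClass and store names and the three block sizes are hypothetical values chosen for the example.

```python
PAGE_SIZE = 1024 * 1024  # memory is reserved in 1MB slabs


class SlabClass:
    """Tracks the slabs reserved for one block size."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.slabs = 0
        self.free_blocks = 0

    def allocate(self):
        # Reserve a new 1MB slab only when no free block is left.
        if self.free_blocks == 0:
            self.slabs += 1
            self.free_blocks += PAGE_SIZE // self.block_size
        self.free_blocks -= 1


def store(slab_classes, item_size):
    """Place an item in the smallest block size that can hold it."""
    for slab_class in slab_classes:  # ordered from smallest to largest
        if item_size <= slab_class.block_size:
            slab_class.allocate()
            return slab_class
    raise ValueError("item is larger than the largest block size")


slab_classes = [SlabClass(size) for size in (88, 112, 144)]
first = store(slab_classes, 60)    # fits an 88-byte block
second = store(slab_classes, 100)  # needs a 112-byte block
```

Note that no memory is reserved for the 144-byte class until an item of that size arrives, which mirrors the lazy allocation described at the start of this section.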
For example, the default size for the smallest block is 88 bytes (40 bytes of value, and the default 48 bytes for the key and flag data). If the size of the first item you store into the cache is less than 40 bytes, then a slab with a block size of 88 bytes is created and the value stored.

If the size of the data that you intend to store is larger than this value, then the block size is increased by the chunk size factor until a block size large enough to hold the value is determined. The block size is always a function of the scale factor, rounded up to a block size that is exactly divisible into the chunk size.
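
The growth of block sizes can be sketched in Python as follows. The starting size of 88 bytes comes from the example above. The growth factor of 1.25 is memcached's default, while the 8-byte alignment and the function names are assumptions made for this sketch.

```python
import bisect

PAGE_SIZE = 1024 * 1024  # one slab


def size_classes(smallest=88, factor=1.25, align=8, page_size=PAGE_SIZE):
    """Return the list of block sizes, from smallest to one whole slab."""
    classes = []
    size = smallest
    while size <= page_size / factor:
        # Round up to the alignment boundary (assumed to be 8 bytes).
        if size % align:
            size += align - size % align
        classes.append(size)
        size = int(size * factor)
    classes.append(page_size)  # the largest block fills a whole slab
    return classes


def block_size_for(item_size, classes):
    """Smallest block size that can hold an item of the given total size."""
    index = bisect.bisect_left(classes, item_size)
    if index == len(classes):
        raise ValueError("item does not fit in a single slab")
    return classes[index]


classes = size_classes()
print(classes[:5])
print(block_size_for(100, classes), PAGE_SIZE // classes[0])
```

Under these assumptions an item needing 100 bytes in total is placed in the second block size, and the space between the item size and the block size is the per-item overhead that slab allocation trades for freedom from fragmentation.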

This article is an English version of an article originally published in Chinese on aliyun.com.
