Memcached and Redis are both in-memory key-value stores. They share a great deal in design and philosophy, and in many cases their features and applications (for example, use as a distributed cache server) are also very similar, so this article introduces the two together and compares them.
Basic architecture and ideas
First, a brief introduction to the architecture and design ideas of each.
Memcached
Memcached uses a client-server architecture. Client and server communicate over a custom protocol, and a client library can be implemented in any language as long as it follows the protocol format.
From the user's point of view, each server maintains a table of key-value pairs. The servers are independent of one another: they share no data and do not communicate with each other. The client needs to know about all of the servers and is responsible for distributing data across them.
On the server side, internal data storage uses slab-based memory management, which helps reduce memory fragmentation and the overhead of frequent allocation and deallocation. Each slab class allocates pages of memory dynamically on demand (this is not the usual 4 KB page; the default page here is 1 MB). Each page is then divided into chunks of the size defined by its slab class, and the server stores key-value pairs in those chunks.
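As a rough illustration of slab classes, the sketch below builds a table of chunk sizes that grow geometrically and picks the smallest class that fits an item. Only the 1 MB page size comes from the description above; the 96-byte starting size, the 1.25 growth factor, the 8-byte rounding, and the function names are assumptions chosen for illustration, not Memcached's actual internals.

```python
PAGE_SIZE = 1024 * 1024  # 1 MB page, as described above


def build_slab_classes(min_chunk=96, factor=1.25, page_size=PAGE_SIZE):
    """Return chunk sizes that grow by `factor` (hypothetical parameters)."""
    sizes = []
    size = min_chunk
    while size < page_size:
        sizes.append(size)
        # Grow geometrically, then round up to a multiple of 8 bytes.
        size = (int(size * factor) + 7) // 8 * 8
    sizes.append(page_size)  # the largest class holds one item per page
    return sizes


def pick_class(sizes, item_size):
    """Return the index of the smallest slab class whose chunk fits the item."""
    for index, chunk in enumerate(sizes):
        if item_size <= chunk:
            return index
    raise ValueError("item is larger than a page")


classes = build_slab_classes()
chunk = classes[pick_class(classes, 100)]
# Chunk size used, chunks per page, and bytes wasted inside the chunk.
print(chunk, PAGE_SIZE // chunk, chunk - 100)
```

The design trade-off is visible in the last line: fixed-size chunks avoid fragmentation between items, at the cost of some unused space inside each chunk.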
[Figure: the basic application model for Memcached]
Redis
The basic application pattern of Redis is similar to that of Memcached, and discussions of whether Redis can completely replace Memcached are easy to find online.
All Redis data structures are ultimately stored as some form of key-value, but what is exposed to the user is richer than in Memcached. Besides key-value pairs in the ordinary sense, Redis also supports data structures such as lists, sets, hashes, and sorted sets.
Basic commands
The Memcached command set, or communication protocol, is very simple. The commands the server supports are essentially add, delete, replace, atomic update, and read operations on a specific key, including set, get, add, replace, append, and incr/decr.
Memcached has two protocol formats, text and binary. The text format suits simple network client tools (such as telnet), and the binary format suits clients with higher performance requirements.
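To show how simple the text format is, here is a minimal sketch that encodes `set` and `get` requests and parses a `get` reply. The wire layout follows the Memcached text protocol; the helper names are hypothetical, and the sketch assumes ASCII keys and does no error handling.

```python
def build_set_command(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a text-protocol `set`: a header line followed by the data block."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get_command(*keys: str) -> bytes:
    """Encode a `get`; several keys may be requested in one command."""
    return ("get " + " ".join(keys) + "\r\n").encode("ascii")


def parse_get_response(data: bytes) -> dict:
    """Parse `VALUE <key> <flags> <bytes>` blocks terminated by `END`."""
    result = {}
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        line = data[pos:line_end]
        if line == b"END":
            return result
        _, key, _flags, length = line.split(b" ")[:4]
        start = line_end + 2
        result[key.decode("ascii")] = data[start:start + int(length)]
        pos = start + int(length) + 2  # skip the data block and its trailing \r\n


print(build_set_command("greeting", b"hello"))
print(parse_get_response(b"VALUE a 0 1\r\nx\r\nVALUE b 0 2\r\nyz\r\nEND\r\n"))
```

Because every request is a readable line, the same commands can be typed by hand into a telnet session, which is the use case the text format is meant for.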
For plain key-value data (the String type), Redis provides basic operations similar to those of Memcached. It supports comparable basic operations on its other data structures too, along with operations specific to each structure, such as set union or popping from a list. Supporting more data structures also means, to some extent, a wider range of applications.
Beyond its data structures, Redis offers a number of features that Memcached lacks. One example is the subscribe/publish commands, which support the publish/subscribe pattern for things like notification mechanisms. These extra features also broaden the scenarios in which Redis can be used.
The Redis client-server communication protocol is entirely text-based (server-to-server communication may move to a binary format in the future).
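As an illustration of that text-based framing, the sketch below encodes a command the way a client might, using the Redis serialization protocol (RESP): an array header `*<count>`, then each argument as a length-prefixed bulk string. The function name is hypothetical.

```python
def encode_command(*args) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        # Non-bytes arguments (str, int, ...) are sent as their UTF-8 text.
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


print(encode_command("SET", "key", "value"))
```

The length prefixes are readable decimal text, yet they let the payload itself carry arbitrary bytes, since the server never has to scan the value for a delimiter.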
Transactions
Redis supports transactions through commands such as MULTI, WATCH, and EXEC, which execute a batch of commands atomically. Since version 2.6, Redis has also supported scripting. A script is inherently executed as a transaction and is easier to use, so it cannot be ruled out that MULTI and the related command interfaces will be removed in the future.
In the Memcached application model there is no transaction support, apart from atomic operation commands such as increment/decrement.
Data backup, validity, and persistence
Memcached does not guarantee that stored data stays valid. The LRU mechanism inside each slab automatically evicts old data, so the client cannot assume anything about the current state of data on the server. This is best seen as a deliberate feature of Memcached: users do not have to care much about, or manage themselves, the eviction and refreshing of data. Whether that suits your application depends on your needs. If you need precise control over the cache life cycle, it may be an obstacle.
Memcached does not handle data persistence either, but many projects based on the Memcached protocol do. For example, memcachedb uses BerkeleyDB for data storage. In essence, though, it is no longer a cache server, just a key-value data store that is protocol-compatible with Memcached.
Redis servers can be configured in a master-slave arrangement. Slave nodes keep replica backups of the data, and they can also act as read-only nodes to share the read load.
Redis has built-in support for two persistence schemes: snapshots and an AOF incremental log. A snapshot, as the name implies, dumps the complete dataset to a file at intervals. The AOF incremental log records the operations that modify data (in fact, it records the modifying commands themselves). The two schemes can coexist, and each has pros and cons; see http://redis.io/topics/persistence
If they are not needed, the Redis backup and persistence schemes above can also be disabled entirely to improve performance.
Performance
In terms of performance, each has its own considerations and implementation choices.
Memcached
Memcached does not proactively and periodically check for and flag data that should be evicted. It checks an item's timestamp only when that item is read again, and it goes further and examines the LRU data to actively evict items only when memory runs short.
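The behavior described above, expiry noticed lazily on read plus LRU eviction under memory pressure, can be sketched with a small in-process cache. This is a simplified model and not Memcached's implementation: the class name, the item-count capacity (standing in for a memory limit), and the injectable `clock` are all assumptions for illustration.

```python
import time
from collections import OrderedDict


class LazyLRUCache:
    """Toy cache: no background sweeper, expiry is checked only on access."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity    # max number of items (stand-in for memory)
        self.clock = clock          # injectable so behavior can be tested
        self.items = OrderedDict()  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self.clock() + ttl
        if key in self.items:
            del self.items[key]
        elif len(self.items) >= self.capacity:
            self._evict()
        self.items[key] = (value, expires_at)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self.clock() >= expires_at:
            del self.items[key]     # expired, but only noticed now, on read
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return value

    def _evict(self):
        now = self.clock()
        # Prefer reclaiming an already-expired item; otherwise drop the LRU one.
        expired = next((k for k, (_, exp) in self.items.items()
                        if exp is not None and now >= exp), None)
        if expired is not None:
            del self.items[expired]
        else:
            self.items.popitem(last=False)
```

Note that an expired item keeps occupying space until something touches it or an eviction reclaims it, which is exactly why the client cannot assume what the server currently holds.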
Redis
Redis supports pipelining and scripting in order to reduce the network round-trip overhead of issuing a large number of small commands.
- Pipelining means sending multiple commands to the server in a single communication for batch execution. The cost is that the server needs more memory to buffer the results.
- Redis embeds a Lua interpreter to execute Lua scripts. A script can be executed directly with commands such as EVAL, or uploaded to the server with SCRIPT LOAD and then reused from the script cache.
Both methods effectively reduce network communication overhead and increase data throughput.
For KV operations, both Memcached and Redis support multi-key get and set commands (in Memcached, the multi-key set appears to be supported only in the binary protocol), which also helps performance.
As for actual performance, there are many benchmark comparisons on the web, and their results differ. This is undoubtedly related to the test cases, the test environments, and the client library implementations used. Taken as a whole, the more reliable conclusion is that for KV operations the two perform similarly. Memcached has a simpler structure, so in theory it should be slightly faster.
Clustering
Memcached servers are completely independent of one another. The client usually decides how data is partitioned by applying a hash algorithm to the key. Adding or removing a server changes the hash results and can invalidate large parts of the cache, so to limit that effect most clients implement a consistent hashing algorithm.
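A minimal sketch of a consistent hash ring follows, to make the idea concrete. It is one common way to implement the technique, not the algorithm of any particular Memcached client; the class name, the use of SHA-256, the 100 virtual nodes per server, and the server addresses are all assumptions for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative parameters)."""

    def __init__(self, servers=(), replicas=100):
        self.replicas = replicas  # virtual nodes per server
        self._points = []         # sorted hash positions on the ring
        self._owners = {}         # hash position -> server name
        for server in servers:
            self.add(server)

    @staticmethod
    def _hash(text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, server):
        for i in range(self.replicas):
            point = self._hash(f"{server}#{i}")
            if point not in self._owners:
                self._owners[point] = server
                bisect.insort(self._points, point)

    def remove(self, server):
        for i in range(self.replicas):
            point = self._hash(f"{server}#{i}")
            if self._owners.get(point) == server:
                del self._owners[point]
                self._points.remove(point)

    def lookup(self, key):
        """Return the server owning the first ring point clockwise from the key."""
        if not self._points:
            raise LookupError("no servers in ring")
        index = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owners[self._points[index]]


ring = HashRing(["cache-a:11211", "cache-b:11211", "cache-c:11211"])
keys = [f"user:{n}" for n in range(1000)]
before = {key: ring.lookup(key) for key in keys}
ring.remove("cache-b:11211")
moved = sum(1 for key in keys if ring.lookup(key) != before[key])
print(f"{moved} of {len(keys)} keys changed server")
```

With a plain `hash(key) % server_count` scheme, removing one server would remap most keys. On the ring, only the keys that belonged to the removed server move, and every other key keeps its owner.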
Redis plans to build cluster support into the server, but that code is still in the alpha phase (and seems to have been for two or three years). Until it is ready, each Redis server instance can likewise be treated as completely independent, with the client handling the partitioning algorithm and managing the list of available servers.
The sharding solution officially recommended by Redis is twemproxy, an open-source project from Twitter. twemproxy supports the text protocols of both Memcached and Redis.
It is important to note that many Redis commands do not work correctly in a clustered environment, such as set intersection or transactions that span nodes. The fundamental goal of the current Redis cluster design is for servers to report their liveness to one another and to handle data replication, backup, and load balancing. It provides no additional support for operations on data across nodes, so at the level of data services the individual servers are still completely independent.
If such operations are required, they can of course be implemented in client-side code (whether that is efficient is another matter). A Memcached cluster runs into the same kind of problem, but Memcached never supported complex operations and data types in the first place, so much of this logic was already handled by client code or by the application itself.
MapReduce-style batch processing applications
Traversal over a specified range is one of the keys to supporting batch application logic such as MapReduce. On top of a hash-based storage structure, however, such support is not easy to provide, or at least an efficient range or traversal operation is not easy to achieve.
Redis supports the SCAN operation for traversing a dataset. Because of the constraints of its internal data structures and implementation, SCAN guarantees that every element present in the dataset for the whole duration of the scan is returned, but it does not guarantee that no element is returned more than once. Duplicates must be checked for by the client, unless the client does not care. SCAN also supports a MATCH condition to filter keys, with some limitations. One is that the match is applied after the data has been retrieved, so efficiency is a concern. The more obvious one is that each SCAN iteration is not guaranteed to return the same number of matching elements.
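The client-side duplicate check mentioned above can be sketched as a small generator. Here `scan_page` is a hypothetical callable that stands in for one SCAN call: it takes a cursor and a count hint and returns `(next_cursor, keys)`, mirroring the shape of SCAN's reply. `fake_scan_page` is a made-up stub so the example runs without a server.

```python
def scan_all(scan_page, count=10):
    """Follow a SCAN-style cursor until it returns to 0, yielding each key once."""
    seen = set()
    cursor = 0
    while True:
        cursor, keys = scan_page(cursor, count)
        for key in keys:
            if key not in seen:  # the server may return a key more than once
                seen.add(key)
                yield key
        if cursor == 0:          # a returned cursor of 0 ends the iteration
            break


def fake_scan_page(cursor, count):
    """Stub reply table: uneven page sizes, an empty page, and duplicates."""
    pages = {
        0: (3, ["a", "b"]),
        3: (7, []),
        7: (5, ["b", "c"]),
        5: (0, ["c", "d"]),
    }
    return pages[cursor]


print(list(scan_all(fake_scan_page)))  # prints ['a', 'b', 'c', 'd']
```

The `seen` set grows with the number of distinct keys, so for a very large keyspace it may be cheaper to make the downstream processing idempotent and skip the check, which is the "client does not care" option.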
For range operations, the Redis sorted set orders each element by a score supplied at insert time, and then supports various operations over a specified score range. Range operations based on strings or on a custom comparison are not supported, so this kind of range operation is quite limited (or requires a particular application pattern), but it is better than nothing.
The Memcached core protocol itself supports no range operations and no traversal, and it does not even offer an officially supported way to enumerate all keys. This is largely a consequence of its design philosophy and streamlined architecture.
However, some servers that are compatible with the Memcached protocol do implement range operations. For the specific format, see the proposed standard at https://code.google.com/p/memcached/wiki/RangeOps
In addition, the Redis hashes data structure can to some extent satisfy application logic that needs to fetch a specific subset of data.
In summary, neither Redis nor Memcached is a good fit for implementing a scan operation like the one HBase supports. Whether Redis can be used for batch applications, though, cannot be answered in general; it depends on the specific data format, logic, and usage. By adjusting how the application uses its data, it is still possible to support MapReduce-style batch logic or range queries to some extent. Data that is spread over a large contiguous space, is of uncertain size, and does not map well to numeric scores that a sorted set can handle will probably remain hard to partition and traverse efficiently.