Problems encountered in the traditional MySQL + memcached architecture
MySQL is well suited to storing massive amounts of data, and memcached is used to load hot data into a cache to speed up access. Many companies have used this architecture, but as business data volume and traffic keep growing, a number of problems appear:
1. MySQL has to be split (sharded) again and again, and memcached has to be expanded along with it; the expansion and maintenance work takes up a lot of development time.
2. Data consistency between memcached and the MySQL database.
3. When the memcached hit rate is low or a memcached node goes down, large numbers of requests pass straight through to the database, and MySQL cannot sustain the load.
4. Cache synchronization across data centers.
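To make problem 3 concrete, here is a minimal sketch of the read path in such an architecture. Plain dicts stand in for memcached and MySQL, and the class and method names (`CacheAside`, `flush_cache`, `db_reads`) are made up for illustration; this is not any real client API.

```python
class CacheAside:
    """Toy cache-aside reader: dicts stand in for memcached and MySQL."""

    def __init__(self, db):
        self.db = db        # stand-in for MySQL
        self.cache = {}     # stand-in for memcached
        self.db_reads = 0   # number of reads that fell through to the database

    def get(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read the database and populate the cache.
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def update(self, key, value):
        # Write the database first, then invalidate the cached copy.
        self.db[key] = value
        self.cache.pop(key, None)

    def flush_cache(self):
        # Simulates a cache node going down and coming back empty.
        self.cache.clear()


store = CacheAside({"user:1": "alice", "user:2": "bob"})
store.get("user:1")
store.get("user:1")
reads_while_warm = store.db_reads   # only the first read hit the database
store.flush_cache()
store.get("user:1")
reads_after_flush = store.db_reads  # the next read falls through again
print(reads_while_warm, reads_after_flush)
```

With a warm cache only the first read reaches the database; after the cache is lost, every hot key has to be reloaded from the database at once, which is the pass-through load described above.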
NoSQL products are flourishing: how to choose
In recent years many kinds of NoSQL products have emerged in the industry. How to use them properly and get the most out of their strengths is something we need to study and think about. The most important thing is to understand the positioning of each product and the tradeoffs it makes, so that in practice we play to its strengths and avoid its weaknesses. In general, these NoSQL products are mainly used to solve the following problems:
1. Small amounts of data with high-speed read and write access. Products of this kind keep all data in memory to guarantee fast access and also provide persistence to disk. This is in fact the main application scenario for Redis.
2. Massive data storage with distributed system support, data consistency guarantees, and convenient addition and removal of cluster nodes. The most representative ideas here are those set out in the two papers on Dynamo and BigTable. The former is a fully decentralized design: nodes pass cluster information to each other by gossip, and data is eventually consistent. The latter is a centralized design that relies on a distributed-lock-like service to ensure strong consistency; writes go first to memory and a redo log and are then merged onto disk by periodic compaction, which turns random writes into sequential writes and improves write performance.
3. Schema-free storage, auto-sharding, and so on. For example, common document databases such as MongoDB are schema-free, store JSON-format data directly, and support features such as auto-sharding.
Faced with these different types of NoSQL products, we need to choose the most appropriate one for our business scenario.
Redis application scenarios and how to use it correctly
As analyzed above, Redis is best suited to scenarios where all data is held in memory. Redis also provides persistence, but it is really more of a disk-backed function and differs considerably from persistence in the traditional sense. You may then wonder: Redis looks like an enhanced version of memcached, so when should you use memcached and when Redis?
If you simply compare Redis and memcached, most people arrive at the following points:
1. Redis supports not only simple k/v data but also data structures such as list, set, zset, and hash.
2. Redis supports data backup, that is, backup in master-slave mode.
3. Redis supports data persistence: it can keep in-memory data on disk and load it again on restart.
Beyond these points, digging into the internals of Redis reveals more fundamental differences and helps in understanding its design.
In Redis, not all data has to stay in memory all the time. This is one of the biggest differences from memcached. (The mechanism described here is the virtual memory, or VM, feature of early Redis versions, which later releases deprecated and removed.) Redis always keeps all keys in memory. When it finds that memory usage exceeds a certain threshold, it triggers swap operations and uses "swappability = age*log(size_in_memory)" to work out which keys' values should be swapped to disk. The values of those keys are then written to disk and removed from memory. This feature allows Redis to hold more data than the memory of its machine. Of course, the machine's memory must still be able to hold all the keys, because keys themselves are never swapped. Also, while Redis is swapping in-memory data to disk, the main thread that serves requests and the child thread that performs the swap share that memory, so if you update data that is being swapped, Redis blocks the operation until the child thread has finished the swap, and only then can the data be modified.
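The swappability formula can be tried out with a few lines of code. The sketch below only illustrates the formula quoted above; it assumes a natural logarithm and made-up ages and sizes, and the function names are hypothetical rather than taken from the Redis source.

```python
import math


def swappability(age_seconds, size_in_memory):
    """Score from the article's formula: age * log(size_in_memory).

    Assumes the natural logarithm and a size of at least 1 byte.
    """
    return age_seconds * math.log(size_in_memory)


def pick_swap_candidates(values, count):
    """Return the `count` keys with the highest swappability.

    `values` maps key -> (age_seconds, size_in_memory).
    """
    ranked = sorted(values, key=lambda k: swappability(*values[k]), reverse=True)
    return ranked[:count]


# Illustrative data: (seconds since last access, bytes held in memory).
values = {
    "recent_big": (10, 4096),
    "old_small": (600, 64),
    "old_big": (600, 4096),
}
candidates = pick_swap_candidates(values, 2)
print(candidates)
```

Old and large values score highest, so they are swapped out first, while a recently used value stays in memory even if it is large.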
A before-and-after comparison of memory use with this Redis-specific memory model:
VM off: 300k keys, 4096 bytes values: 1.3G used
VM on: 300k keys, 4096 bytes values: 73M used
VM off: 1 million keys, bytes values: 430.12M used
VM on: 1 million keys, bytes values: 160.09M used
VM on: 1 million keys, values as large as you want, still: 160.09M used
When reading data from Redis, if the value for the requested key is not in memory, Redis must load it from the swap file before returning it to the requester. This is where the I/O thread pool comes in. By default Redis blocks: it responds only after all the required data has been loaded from the swap file. This strategy is suitable when there are few clients and for batch operations, but it clearly cannot cope with the high concurrency of a large website. Redis therefore lets us set the size of the I/O thread pool so that read requests that need data from the swap file are served concurrently, reducing blocking time.
If you want to use Redis with massive data sets, I believe it is essential to understand its memory design and the situations in which it blocks.
Supplementary notes:
Comparison of Memcached and Redis
1. Network IO model
Memcached uses a multi-threaded, non-blocking IO multiplexing network model with a listener (main) thread and worker threads. The listener thread accepts network connections and hands each connection descriptor to a worker thread through a pipe; the worker thread then performs the read/write IO. The network layer uses the libevent event library. The multi-threaded model can take advantage of multiple cores, but it introduces cache coherency and locking problems. For example, to support stats, the most commonly used memcached command, every memcached operation has to lock global variables and update counters, which costs performance.
(memcached network IO model)
Redis uses a single-threaded IO multiplexing model. It wraps its own simple AeEvent event-processing framework, which supports epoll, kqueue, and select. For pure IO operations, a single thread maximizes the speed advantage. However, Redis also provides some simple computing functions, such as sorting and aggregation. For these operations the single-threaded model can seriously hurt overall throughput, because all IO scheduling is blocked while the CPU is busy computing.
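The effect of one expensive command on a single-threaded server can be shown with a tiny simulation. This is only a sketch: the command names and millisecond costs are invented for illustration, and all commands are assumed to arrive at time 0.

```python
def run_single_threaded(commands):
    """Process commands one after another on a single thread.

    `commands` is a list of (name, cost_ms) in arrival order.
    Returns a dict mapping each command name to its completion time in ms.
    """
    clock = 0
    finished = {}
    for name, cost_ms in commands:
        clock += cost_ms        # nothing else runs while this command executes
        finished[name] = clock
    return finished


# Hypothetical costs: a GET takes 1 ms, one large SORT takes 200 ms.
fast_only = run_single_threaded([("GET a", 1), ("GET b", 1), ("GET c", 1)])
with_sort = run_single_threaded(
    [("GET a", 1), ("SORT big", 200), ("GET b", 1), ("GET c", 1)]
)
print(fast_only["GET c"], with_sort["GET c"])
```

Every request queued behind the slow command waits for it to finish, even though the requests themselves are cheap.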
2. Memory management aspects
Memcached manages memory with a pre-allocated memory pool made up of slabs and chunks of different sizes; each item is stored in a chunk of suitable size. The memory pool saves the cost of requesting and freeing memory and reduces memory fragmentation, but it also wastes some space, and new data may be evicted even while plenty of memory remains. For the reasons, see Timyang's article: http://timyang.net/data/Memcached-lru-evictions/
Redis allocates memory on demand to store data and rarely uses a free list to optimize allocation, so some memory fragmentation occurs. Based on the parameters of the store command, Redis keeps data that carries an expiration time separately and calls it temporary data. Non-temporary data is never evicted, even when physical memory is insufficient, so swap will not evict any non-temporary data (though it will try to evict some temporary data). In this respect Redis is better suited to serving as storage than as a cache.
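The space wasted by fixed chunk sizes is easy to see in a small sketch. The numbers below (smallest chunk of 96 bytes, growth factor of 1.25, 1 MB upper bound, 8-byte alignment) are assumptions chosen for illustration, and the function names are hypothetical; real memcached settings depend on version and startup options.

```python
import bisect


def build_chunk_sizes(smallest=96, factor=1.25, largest=1024 * 1024):
    """Build an increasing list of chunk sizes, one per slab class."""
    sizes = []
    size = smallest
    while size < largest:
        sizes.append(size)
        size = int(size * factor)
        size = (size + 7) // 8 * 8   # round up to a multiple of 8 bytes
    sizes.append(largest)
    return sizes


def pick_chunk(sizes, item_size):
    """Return the smallest chunk size that fits the item, or None if too big."""
    index = bisect.bisect_left(sizes, item_size)
    return sizes[index] if index < len(sizes) else None


sizes = build_chunk_sizes()
chunk = pick_chunk(sizes, 100)
wasted = chunk - 100
print(sizes[:5], chunk, wasted)
```

An item is always stored in the next chunk size up, so the difference between the item size and the chunk size is lost space. That is the price paid for avoiding per-item allocation and fragmentation.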
3. Data consistency issues
Memcached provides a cas command that keeps a value consistent when multiple clients access and modify it concurrently. Redis has no single cas command; instead it provides transactions, which guarantee that a sequence of commands executes atomically without being interrupted by any other command. Combined with WATCH, a transaction can provide optimistic, check-and-set style behavior.
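The check-and-set idea can be sketched with a toy in-memory store. The class below is only modelled on the gets/cas semantics of memcached; `CasStore` and its version counter are hypothetical and not a real client library.

```python
class CasStore:
    """Toy key-value store with check-and-set, modelled on memcached gets/cas."""

    def __init__(self):
        self._data = {}          # key -> (value, version)
        self._next_version = 1   # incremented on every successful write

    def set(self, key, value):
        self._data[key] = (value, self._next_version)
        self._next_version += 1

    def gets(self, key):
        """Return (value, version) for the key."""
        return self._data[key]

    def cas(self, key, value, version):
        """Write only if nobody has changed the key since `version` was read."""
        current = self._data.get(key)
        if current is None or current[1] != version:
            return False
        self.set(key, value)
        return True


store = CasStore()
store.set("counter", 10)

# Two clients read the same value and version.
value_a, version_a = store.gets("counter")
value_b, version_b = store.gets("counter")

first_write = store.cas("counter", value_a + 1, version_a)    # succeeds
second_write = store.cas("counter", value_b + 1, version_b)   # stale version

# The losing client re-reads and retries.
value_b, version_b = store.gets("counter")
retry_write = store.cas("counter", value_b + 1, version_b)
final_value = store.gets("counter")[0]
print(first_write, second_write, retry_write, final_value)
```

Without the version check, both clients would have written 11 and one increment would have been lost; with it, the second client notices the conflict and retries.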
4. Storage methods and other aspects
Memcached basically only supports simple key-value storage, does not support enumeration, and does not support persistence and replication functions.
In addition to key/value, Redis supports data structures such as list, set, sorted set, and hash. It provides the KEYS command for enumeration, but KEYS should not be used online; if you need to enumerate online data, Redis provides tools that scan its dump file and enumerate all the data. Redis also provides persistence and replication.
5. Client support for different languages
Both memcached and Redis have a rich choice of third-party clients for different languages. Because memcached has been around longer, many of its clients are currently more mature and stable. The Redis protocol is more complex than memcached's, and the author keeps adding new features, so third-party clients may lag behind, and you may sometimes need to modify a third-party client to use Redis well.
From the comparison above, it is clear that Redis is a better fit than memcached when we do not want data to be evicted, when we need data types beyond key/value, or when we need persistence.
Some peripheral features of Redis
Besides serving as storage, Redis provides other features such as aggregation, pubsub, and scripting. To use them correctly, you need to understand how they are implemented and be clear about their limitations. Pubsub, for example, has no persistence behind it at all: every message published while a consumer's connection drops or is reconnecting is lost. Features such as aggregation and scripting are constrained by the single-threaded model of Redis and cannot achieve high throughput, so use them with caution.
In general, the Redis author is a very diligent developer who is often seen experimenting with new ideas and approaches. Features in these areas need to be understood in depth before we use them.
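The message-loss behavior of pubsub can be illustrated with a toy fire-and-forget broker. This is a sketch of the delivery semantics only, not of how Redis implements them; `ToyPubSub` and its methods are hypothetical names.

```python
class ToyPubSub:
    """Fire-and-forget broker: messages go only to currently connected subscribers."""

    def __init__(self):
        self.subscribers = {}   # channel -> {subscriber name: inbox list}

    def subscribe(self, channel, name, inbox):
        self.subscribers.setdefault(channel, {})[name] = inbox

    def unsubscribe(self, channel, name):
        self.subscribers.get(channel, {}).pop(name, None)

    def publish(self, channel, message):
        """Deliver to connected subscribers and return how many received it."""
        receivers = self.subscribers.get(channel, {})
        for inbox in receivers.values():
            inbox.append(message)
        return len(receivers)   # nothing is stored for later delivery


broker = ToyPubSub()
inbox = []

broker.subscribe("news", "consumer-1", inbox)
delivered_m1 = broker.publish("news", "m1")

broker.unsubscribe("news", "consumer-1")     # the connection drops
delivered_m2 = broker.publish("news", "m2")  # nobody is listening

broker.subscribe("news", "consumer-1", inbox)   # the consumer reconnects
delivered_m3 = broker.publish("news", "m3")
print(inbox)
```

The message published while the consumer was disconnected never reaches it. If the application cannot tolerate that, it needs a different mechanism than plain pubsub.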
Summary:
1. The best way to use Redis is to keep all data in memory.
2. In most scenarios Redis is used as a replacement for memcached.
3. Redis is the more appropriate choice when data types beyond key/value are needed.
4. Redis is the more appropriate choice when the stored data must not be evicted.
Reposted from http://gnucto.blog.51cto.com/3391516/998509