1 Problems with the MySQL+Memcached architecture
Memcached uses a client-server architecture. Client and server communicate over a custom protocol, so a client library can be implemented in any language as long as it follows the protocol format.
The memcached server uses slab-based memory management to reduce memory fragmentation and the overhead of frequent allocation and deallocation. Each slab class requests a page of memory on demand (not the 4 KB operating-system page; a memcached page is 1 MB by default). A page is divided into chunks whose size depends on the slab class, and the chunks are used to store key-value items. The slab mechanism is effectively a memory pool: memcached obtains large blocks of memory from the operating system and then manages allocation and recycling of that memory itself.
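The way items map to slab classes can be sketched in a few lines of Python. This is a simplified model, not memcached's actual code: it assumes a smallest chunk of 96 bytes and a growth factor of 1.25 between classes, and the function names are made up for illustration. Real chunk sizes depend on the memcached version and startup options.

```python
PAGE_SIZE = 1024 * 1024  # 1 MB page, as described above


def slab_chunk_sizes(min_chunk=96, growth_factor=1.25, page_size=PAGE_SIZE):
    """Return the chunk size of each slab class (simplified model)."""
    sizes = []
    size = min_chunk
    while size < page_size:
        sizes.append(size)
        # The next class is growth_factor times larger, rounded up to 8 bytes.
        size = int(size * growth_factor)
        size = (size + 7) // 8 * 8
    sizes.append(page_size)  # the largest class holds one chunk per page
    return sizes


def pick_slab_class(item_size, sizes):
    """Index of the smallest class whose chunk fits the item, or None."""
    for index, chunk in enumerate(sizes):
        if item_size <= chunk:
            return index
    return None  # larger than a page: cannot be stored


sizes = slab_chunk_sizes()
cls = pick_slab_class(100, sizes)
chunk = sizes[cls]
print(f"100-byte item -> class {cls}, chunk {chunk} bytes, "
      f"{chunk - 100} bytes unused, {PAGE_SIZE // chunk} chunks per page")
```

The unused bytes inside each chunk are the price paid for avoiding fragmentation: every item in a class occupies a full chunk regardless of its real size.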
In practice, MySQL is suited to storing massive data sets, and memcached loads the hot data into a cache to speed up access. Many companies have used this architecture, but as business data volume and traffic keep growing, a number of problems appear:
1. MySQL has to be split again and again, and memcached has to be scaled out along with it; expansion and maintenance take up a lot of development time.
2. Data consistency between memcached and the MySQL database.
3. When the memcached hit rate is low or a node goes down, a large number of requests fall straight through to the database, and MySQL cannot sustain the load.
4. Cache synchronization across data centers.
Memcached is also not suited to storing large values. A single item is limited to 1 MB by default; if the data exceeds that, set fails and a subsequent get returns false, and performance suffers. A relative expiration time can be at most 30 days; a larger value is interpreted as an absolute Unix timestamp. Memcached converts the expiration period it is given into a point in time, and once that time is reached it treats the item as expired. The mechanism is simple but obscure.
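The expiration rule can be modeled roughly as below. This is a sketch of the behavior described above, not memcached source code; `to_absolute_expiry` and `is_expired` are hypothetical helper names.

```python
import time

THIRTY_DAYS = 60 * 60 * 24 * 30  # seconds


def to_absolute_expiry(exptime, now=None):
    """Turn an expiration value into a point in time (simplified model).

    0 means "never expire"; values up to 30 days are relative to now;
    larger values are taken as an absolute Unix timestamp.
    """
    if now is None:
        now = time.time()
    if exptime == 0:
        return None
    if exptime <= THIRTY_DAYS:
        return now + exptime
    return exptime


def is_expired(expires_at, now):
    """An item with no expiry never expires; otherwise compare with now."""
    return expires_at is not None and now >= expires_at


now = 1_700_000_000                                    # a fixed "current time"
one_minute = to_absolute_expiry(60, now)               # relative: now + 60
too_long = to_absolute_expiry(THIRTY_DAYS + 1, now)    # read as a 1970 timestamp
```

The last call shows why the mechanism is easy to trip over: asking for "31 days" as a relative value yields a timestamp far in the past, so the item is expired as soon as it is stored.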
Is memcached atomic?
Every individual command sent to memcached is completely atomic. If you send a set command and a get command for the same data at the same time, they do not affect each other; they are serialized and executed one after the other. Even in multithreaded mode, all commands are atomic. A sequence of commands, however, is not atomic. If you fetch an item with get, modify it, and then set it back to memcached, the system does not guarantee that the item has not been changed by another process in the meantime (a "process" here is not necessarily an operating-system process). Memcached 1.2.5 and later provide the gets and cas commands, which solve this problem. If you query a key with gets, memcached returns a unique identifier for the item's current value. If the client modifies the item and wants to write it back, it sends that identifier to memcached along with the new value using the cas command. If the identifier matches the one stored in memcached, the write succeeds. If another process has modified the item in the meantime, the stored identifier will have changed, and the write fails.
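The gets/cas cycle can be illustrated with a small in-memory stand-in for the server. `TinyCache` and `increment_with_retry` are hypothetical names used only for this sketch; a real client library would send the gets and cas commands over the network.

```python
class TinyCache:
    """In-memory stand-in for a memcached server (illustration only)."""

    def __init__(self):
        self._data = {}      # key -> (value, cas_id)
        self._next_cas = 1

    def set(self, key, value):
        # Every write gives the item a new unique identifier.
        self._data[key] = (value, self._next_cas)
        self._next_cas += 1

    def gets(self, key):
        """Return (value, cas_id), like the gets command."""
        return self._data.get(key, (None, None))

    def cas(self, key, value, cas_id):
        """Store value only if the item is unchanged since gets."""
        current = self._data.get(key)
        if current is None or current[1] != cas_id:
            return False
        self.set(key, value)
        return True


def increment_with_retry(cache, key, max_attempts=5):
    """Read-modify-write loop that retries when another writer got in first."""
    for _ in range(max_attempts):
        value, cas_id = cache.gets(key)
        if cache.cas(key, value + 1, cas_id):
            return True
    return False


cache = TinyCache()
cache.set("counter", 10)
value, cas_id = cache.gets("counter")
cache.set("counter", 99)                                # another client writes first
stale_write = cache.cas("counter", value + 1, cas_id)   # rejected: identifier changed
retried = increment_with_retry(cache, "counter")        # succeeds on fresh data
```

The stale write is refused instead of silently overwriting the other client's value, which is exactly the guarantee a plain get followed by set cannot give.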
2 About NoSQL:
With so many NoSQL products, how to choose
In recent years many kinds of NoSQL products have emerged in the industry. How to use them properly and make the most of their strengths is something we need to study and think through. The most important thing is to understand how each product is positioned and what tradeoffs it makes, so that in practice we play to its strengths and avoid its weaknesses. In general, these NoSQL products are mainly used to solve the following problems:
1. Small data volumes with high-speed read and write access. These products keep all data in memory to guarantee fast access, while also providing persistence to disk. This is in fact the main application scenario for Redis.
2. Massive data storage, with distributed system support, data consistency guarantees, and easy addition and removal of cluster nodes.
3. Schema-free storage, auto-sharding, and so on. For example, common document databases such as MongoDB are schema-free, store JSON-format data directly, and support features such as auto-sharding.
Faced with these different types of NoSQL products, we need to choose the one that best fits our business scenario.
Redis is best suited to scenarios where all data fits in memory. Redis does provide persistence, but it is really more of a disk-backed feature and differs considerably from persistence in the traditional sense. You may then ask: Redis looks like an enhanced version of memcached, so when should you use memcached and when should you use Redis?
Comparison of Memcached and Redis:
- Performance: There is no need to worry much about performance, because both are fast enough. Since Redis uses only a single core while memcached can use multiple cores, Redis on average outperforms memcached per core when storing small values. For values over 100 KB, memcached is faster; Redis has recently been optimized for large values but still trails memcached slightly. The conclusion is that whichever you use, requests per second will not be the bottleneck (the bottleneck is more likely elsewhere, for example the network card).
- Memory efficiency: For simple key-value storage, memcached has higher memory utilization. If Redis stores the key-values in a hash structure, its combined compression gives it higher memory utilization than memcached. This depends, of course, on your application scenario and data characteristics.
- Data persistence: If you need data persistence or data synchronization, choose Redis, because memcached has neither feature. Redis is also the wise choice even if you only want cached data to survive a system upgrade or reboot.
- Data structures: In the end it comes down to your specific application needs. Redis has more data structures and supports richer operations than memcached. With memcached you usually have to fetch the data to the client, modify it there, and set it back, which greatly increases the number of network round trips and the volume of data transferred. In Redis these complex operations are usually as efficient as an ordinary get/set. So if you need a cache that supports more complex structures and operations, Redis is a good choice.
- Network I/O model: Memcached is multi-threaded, with a listener thread and worker threads, and it introduces locks, which costs some performance. Redis uses a single-threaded I/O multiplexing model, which makes the most of its speed advantage and also lets it offer some simple computation.
- Memory management: Memcached uses a pre-allocated memory pool, which wastes some space, and new data may be rejected even when plenty of memory remains. Redis allocates memory on demand to store data and does not evict any non-temporary data, so Redis is better suited as storage than as a cache.
- Data consistency: Memcached provides the cas command to guarantee consistency. Redis provides transactions, which guarantee that a sequence of commands executes atomically without being interrupted by any other operation.
If you simply compare Redis and memcached, most people come away with the following points:
1. Redis supports not only simple k/v data but also data structures such as list, set, zset, and hash.
2. Redis supports data backup, that is, backup in master-slave mode.
3. Redis supports data persistence: in-memory data can be kept on disk and loaded again after a restart.
4. Redis supports master-slave replication, which enables failure recovery.
5. Redis sharding: it is easy to distribute data across multiple Redis instances.
3 Common Redis data types
String
Hash
List
Set
Sorted Set
Pub/sub
Transactions
1. String
The String data structure is a simple key-value type; the value is not limited to a string and can also be a number.
Common commands: set, get, decr, incr, mget, and so on.
Application scenarios: String is the most commonly used data type, and ordinary key/value storage falls into this category. It can fully cover what memcached does today, and more efficiently. You also get Redis's scheduled persistence, operation log, and replication. Besides get, set, incr, decr, and the other operations that memcached also offers, Redis provides the following:
- Get string length
- Append content to a string
- Set and get a section of a string
- Set and get a single bit of a string
- Set the contents of a series of strings in bulk
Implementation: Internally, Redis stores a String as a string by default, referenced by a redisObject. When it encounters INCR, DECR, or similar operations, the value is converted to a numeric type for the calculation, and the redisObject's encoding field is then int.
2. Hash
Common commands: hget, hset, hgetall, and so on.
Application scenarios: With memcached, we often pack structured information such as a user's nickname, age, gender, and points into a HashMap, which the client serializes and stores as a string value. When one of these items needs to change, the usual approach is to fetch the whole value, deserialize it, modify the one item, then serialize it and store it back. This adds overhead, and it is unsuitable where concurrent operations are possible (for example, two concurrent operations that both modify the points). The Redis hash structure lets you modify a single attribute value, just as you would update one property in a database.
A simple example illustrates the hash use case. Suppose we store a user information object: the user ID is the lookup key, and the stored user object contains the name, age, birthday, and other information. With an ordinary key/value structure there are mainly two ways to store it:
- The first uses the user ID as the lookup key and stores the remaining information as a serialized object. The drawbacks are the cost of serialization and deserialization, the need to retrieve the entire object to modify one piece of information, and the need to protect modifications against concurrency, which brings in complications such as CAS.
- With a Redis hash, the key is still the user ID, and the value is a map whose keys are the attribute names and whose values are the attribute values. Data can be modified and read directly through the internal map key (Redis calls this key a field). In other words, key (user ID) + field (attribute name) is enough to operate on the corresponding attribute, with no need to store data repeatedly and no serialization or concurrent-modification problems. This solves the problem well.
Note also that Redis provides an interface (hgetall) that fetches all of the attribute data at once. If the internal map has a large number of members, this means traversing the entire map, which can be time-consuming. Because Redis is single-threaded, other clients' requests get no response in the meantime, so this needs particular care.
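The difference between the serialized-object approach and the hash approach can be sketched with plain Python dictionaries standing in for the two stores. The helper names are hypothetical, and `hincrby` only models the idea of a single-field update (compare the Redis HINCRBY command); no Redis server is involved.

```python
import json

# Serialized approach: the whole user object is one string value.
kv_store = {"user:1": json.dumps({"name": "alice", "age": 30, "points": 100})}


def add_points_serialized(store, key, delta):
    user = json.loads(store[key])      # fetch and deserialize everything
    user["points"] += delta            # change one attribute
    store[key] = json.dumps(user)      # serialize and write everything back


# Hash approach: key -> {field: value}, one field can be touched on its own.
hash_store = {"user:1": {"name": "alice", "age": 30, "points": 100}}


def hincrby(store, key, field, delta):
    """Model of updating a single field in place."""
    fields = store.setdefault(key, {})
    fields[field] = fields.get(field, 0) + delta
    return fields[field]


def hget(store, key, field):
    """Model of reading a single field."""
    return store.get(key, {}).get(field)


add_points_serialized(kv_store, "user:1", 5)
new_points = hincrby(hash_store, "user:1", "points", 5)
```

In the serialized version, two clients that both read the old object and write back their own copy can lose one update; the field-level update has no read-modify-write gap on the client side.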
Implementation:
As noted above, the value of a Redis hash is internally a HashMap, but there are actually two implementations. When the hash has few members, Redis saves memory by storing it compactly in something like a one-dimensional array rather than in a real HashMap, and the encoding of the value's redisObject is zipmap. When the number of members grows, it is automatically converted to a real HashMap, and the encoding becomes ht.
3. List
Common commands: lpush, rpush, lpop, rpop, lrange, etc.
Application scenarios: The Redis list has many uses and is one of the most important Redis data structures. For example, Twitter's following list and follower list can be implemented with the Redis list structure.
Lists are linked lists; anyone with some knowledge of data structures should understand the structure. With lists we can easily implement features such as a latest-messages ranking. Another use is a message queue: push operations put tasks into the list, and worker threads take them out with pop operations and execute them. Redis also provides an API for operating on a range of a list, so you can directly query or delete the elements in a range.
Implementation: The Redis list is implemented as a doubly linked list, which supports reverse lookup and traversal and is more convenient to operate on, at the cost of some extra memory overhead. Many internals of Redis, including the send buffer queue, also use this data structure.
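The queue usage described above can be modeled with Python's `collections.deque`, which is also a double-ended structure. The functions below only imitate the behavior of lpush, rpop, and lrange on a local object; they are illustrative helpers, not a Redis client, and `lrange` here handles non-negative indexes only.

```python
from collections import deque

task_queue = deque()


def lpush(queue, *values):
    """Insert values at the head, one by one, and return the new length."""
    for value in values:
        queue.appendleft(value)
    return len(queue)


def rpop(queue):
    """Remove and return the tail element, or None if the queue is empty."""
    return queue.pop() if queue else None


def lrange(queue, start, stop):
    """Elements from index start to stop inclusive (non-negative indexes)."""
    return list(queue)[start:stop + 1]


lpush(task_queue, "task-a", "task-b", "task-c")   # producer side
latest_two = lrange(task_queue, 0, 1)             # newest items sit at the head
first_job = rpop(task_queue)                      # worker takes the oldest task
```

Pushing at one end and popping at the other gives first-in, first-out order for workers, while reading a range from the head gives a "latest N" view of the same data.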
4. Set
Common commands: sadd, spop, smembers, sunion, and so on.
Application scenarios:
Externally, a Redis set behaves much like a list, except that a set removes duplicates automatically. When you need to store a list of data and do not want duplicates, set is a good choice. Set also provides an important interface for checking whether a member is in the set, which list does not offer.
A set is a collection of distinct values. With the set data structure provided by Redis you can store aggregated information; in a microblog application, for example, you can keep all the accounts a user follows in one set and all of the user's followers in another. Redis also provides intersection, union, and difference operations on sets, which make it easy to implement features such as mutual follows, shared interests, and second-degree friends. For all of these set operations, different commands let you choose whether to return the result to the client or store it in a new set.
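Python's built-in `set` makes it easy to see what these operations compute. The user names below are made up, and the comments name the Redis commands each line corresponds to conceptually; nothing here talks to a Redis server.

```python
following = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"carol", "dave", "erin"},
}

# Adding an existing member changes nothing: duplicates are dropped.
following["alice"].add("carol")

mutual_follows = following["alice"] & following["bob"]   # compare SINTER
either_follows = following["alice"] | following["bob"]   # compare SUNION
only_alice = following["alice"] - following["bob"]       # compare SDIFF
alice_follows_carol = "carol" in following["alice"]      # compare SISMEMBER

# Keeping the result as a new set instead of returning it (compare SINTERSTORE).
following["alice_and_bob"] = set(mutual_follows)
```

The membership test is the operation a list cannot answer cheaply, which is why a set is preferred when the question is "is X already in here?".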
Implementation:
Internally, a set is a HashMap whose values are always null. Hashing is what makes duplicate removal fast, and it is also the reason set can tell whether a member is in the set.
5. Sorted Set
Common commands: zadd, zrange, zrem, zcard, etc.
Usage scenarios:
The usage scenario for a Redis sorted set is similar to set. The difference is that a set is not automatically ordered, whereas a sorted set orders its members by an additional priority (score) parameter supplied by the user and keeps them sorted automatically on insertion. When you need an ordered list without duplicates, choose the sorted set. For example, Twitter's public timeline can be stored with the publication time as the score, so that it is automatically sorted by time.
Sorted sets can also serve as a weighted queue: ordinary messages get a score of 1 and important messages a score of 2, and workers fetch tasks in descending score order so that important tasks take precedence.
Implementation:
Internally, the Redis sorted set uses a HashMap and a skip list (skiplist) to store the data and keep it ordered. The HashMap holds the mapping from member to score, and the skip list holds all the members, sorted by the scores stored in the HashMap. The skip list structure gives fairly high lookup efficiency and is relatively simple to implement.
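The two-structure design can be sketched as follows. To keep the example short, a sorted Python list maintained with `bisect` stands in for the skip list, so this shows the division of labor (a map for member to score, an ordered structure for ranking) rather than Redis's actual implementation. `TinySortedSet` is a hypothetical name, and the range methods accept non-negative ranks only.

```python
import bisect


class TinySortedSet:
    """Member -> score map plus an ordered list of (score, member) pairs."""

    def __init__(self):
        self._scores = {}    # member -> score (the HashMap role)
        self._ordered = []   # sorted (score, member) pairs (the skip list role)

    def zadd(self, member, score):
        old = self._scores.get(member)
        if old is not None:
            # Member already present: remove its old position first.
            index = bisect.bisect_left(self._ordered, (old, member))
            del self._ordered[index]
        self._scores[member] = score
        bisect.insort(self._ordered, (score, member))

    def zscore(self, member):
        return self._scores.get(member)

    def zrange(self, start, stop):
        """Members from rank start to stop inclusive, lowest score first."""
        return [m for _, m in self._ordered[start:stop + 1]]

    def zrevrange(self, start, stop):
        """Members from rank start to stop inclusive, highest score first."""
        return [m for _, m in self._ordered[::-1][start:stop + 1]]

    def zcard(self):
        return len(self._scores)


jobs = TinySortedSet()
jobs.zadd("normal-1", 1)
jobs.zadd("urgent-1", 2)
jobs.zadd("normal-2", 1)
next_job = jobs.zrevrange(0, 0)   # highest score first: the weighted queue
```

The map answers "what is this member's score?" directly, while the ordered structure answers "who comes first?" without sorting on every read, which is why both are kept.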
REDIS Data Structures