Application Scenario: The application scenarios of the Redis sorted set are similar to those of the set. The difference is that a set is not automatically ordered, while a sorted set lets the user attach a priority (score) to each member, which is used to sort the members and insert them in order, that is, automatic sorting. When you need an ordered, non-repeating list, the sorted set data structure is a good choice. For example, Twitter's public timeline can be stored with the posting time as the score, so that queries return results automatically sorted by time.
Implementation Method: Internally, the Redis sorted set uses a HashMap and a skip list (SkipList) together to store the data in order. The HashMap maps members to scores, while the skip list stores all the members, sorted by the scores saved in the HashMap. This structure yields high search efficiency with a relatively simple implementation.
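As a rough illustration of the idea, here is a minimal Python sketch of that dual structure: a dict plays the HashMap role (member to score), and a sorted list of (score, member) pairs stands in for the skip list (the real skip list gives the same ordered access with O(log N) inserts; the class and method names here are invented for illustration).

```python
import bisect

class MiniSortedSet:
    """Toy sketch of Redis's sorted set: a dict maps member -> score
    (the HashMap role), while a sorted list of (score, member) pairs
    stands in for the skip list, keeping members ordered by score."""

    def __init__(self):
        self.scores = {}    # member -> score
        self.ordered = []   # sorted list of (score, member) pairs

    def zadd(self, member, score):
        if member in self.scores:   # updating: remove the old entry first
            old = (self.scores[member], member)
            self.ordered.pop(bisect.bisect_left(self.ordered, old))
        self.scores[member] = score
        bisect.insort(self.ordered, (score, member))   # insert in sorted position

    def zrange(self, start, stop):
        """Return members by ascending score, like ZRANGE (stop inclusive)."""
        return [m for _, m in self.ordered[start:stop + 1]]

z = MiniSortedSet()
z.zadd("tweet:1", 1000)   # score = posting timestamp, as in the timeline example
z.zadd("tweet:3", 3000)
z.zadd("tweet:2", 2000)
print(z.zrange(0, 2))     # ['tweet:1', 'tweet:2', 'tweet:3']
```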
2. Different Memory Management Mechanisms

In Redis, not all data is stored in memory, and this is the biggest difference from Memcached. When physical memory runs out, Redis can swap values that have not been used for a long time out to disk. Redis always keeps all keys cached in memory; if it finds that memory usage exceeds a threshold, it triggers a swap operation, using the formula swappability = age * log(size_in_memory) to calculate which values should be swapped to disk. The values corresponding to those keys are then persisted to disk and cleared from memory. This feature allows Redis to hold more data than the machine's physical memory, as long as memory can hold all the keys, since the keys themselves are never swapped out. Meanwhile, because the main thread that serves requests and the sub-threads that perform the swap share this memory, if a value being swapped out is updated, Redis blocks that operation until the sub-thread finishes the swap.

When reading data from Redis, if the value for the requested key is not in memory, Redis must load it from the swap file before returning it to the requester, which raises an I/O thread pool problem. By default, Redis blocks: it responds only after all the needed swap-file data has been loaded. This policy is suitable for batch operations with a small number of clients, but it clearly cannot meet the needs of a large, high-concurrency web application. Therefore, we can set the size of the I/O thread pool when running Redis and handle read requests that must load data from the swap file concurrently, reducing the blocking time. (Note that this virtual-memory feature existed only in older Redis versions and was later removed.)
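The swappability formula can be made concrete with a small example; the log base is an assumption here (any base preserves the ordering), and the parameter values are invented for illustration:

```python
import math

def swappability(age_seconds, size_in_memory_bytes):
    """Swap score from the formula in the text:
    swappability = age * log(size_in_memory).
    Older and larger values score higher and are swapped to disk first."""
    return age_seconds * math.log(size_in_memory_bytes)

# A small, recently used value vs. a large value idle for an hour:
hot = swappability(age_seconds=5, size_in_memory_bytes=64)
cold = swappability(age_seconds=3600, size_in_memory_bytes=1 << 20)
print(cold > hot)  # the cold value is the better swap candidate -> True
```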
For memory-based database systems such as Redis and Memcached, the efficiency of memory management is a key factor affecting system performance. The malloc/free functions of traditional C are the most common way to allocate and release memory, but this approach has significant drawbacks. First, mismatched malloc and free calls easily cause memory leaks. Second, frequent calls produce large numbers of memory fragments that are hard to recycle and reuse, reducing memory utilization. Finally, as system calls, their overhead is far greater than that of ordinary function calls. Therefore, to improve memory management efficiency, well-designed memory managers avoid calling malloc/free directly. Redis and Memcached both use their own memory management mechanisms, but the implementations differ greatly; both are described below.
Memcached manages memory with the Slab Allocation mechanism by default. Its main idea is to split allocated memory into blocks of predefined lengths to store key-value records of the corresponding sizes, which completely avoids memory fragmentation. The Slab Allocation mechanism is designed only to store item data; that is, all key-value data lives in the Slab Allocation system, while Memcached's other memory requests go through ordinary malloc/free, because their number and frequency are too low to affect overall system performance. The Slab Allocation principle is quite simple: first it requests a large block of memory from the operating system, splits it into chunks of various sizes, and groups chunks of the same size into Slab Classes. The Chunk is the minimum unit for storing key-value data. The size progression of the Slab Classes is controlled by the Growth Factor set when Memcached starts. For example, with a Growth Factor of 1.25, if the first Chunk is 88 bytes, the second Chunk is 112 bytes.
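The chunk-size progression is easy to sketch; the 8-byte alignment step below is an assumption matching Memcached's defaults, and the function name is invented for illustration:

```python
def chunk_sizes(first_chunk=88, growth_factor=1.25, n=5, align=8):
    """Sketch of Memcached slab-class sizing: each class's chunk size is
    the previous size times the growth factor, rounded up to an 8-byte
    boundary (alignment is an assumed detail matching Memcached defaults)."""
    sizes = [first_chunk]
    for _ in range(n - 1):
        nxt = int(sizes[-1] * growth_factor)
        nxt = (nxt + align - 1) // align * align   # round up to the alignment
        sizes.append(nxt)
    return sizes

print(chunk_sizes())  # [88, 112, 144, 184, 232]
```

The first step reproduces the 88 -> 112 example from the text: 88 * 1.25 = 110, rounded up to the next 8-byte boundary gives 112.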
When Memcached receives data from the client, it first selects the most appropriate Slab Class based on the size of the data, then finds a usable Chunk by querying the list of idle chunks that Memcached keeps for that Slab Class. When a data record expires or is discarded, the Chunk it occupied can be recycled and returned to the idle list.
From this process we can see that Memcached's memory management is highly efficient and does not cause memory fragmentation, but its biggest drawback is wasted space. Because each Chunk is allocated with a fixed length, data of other sizes cannot make full use of it: for example, caching 100 bytes of data in a 128-byte Chunk wastes the remaining 28 bytes.
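The selection-and-waste behavior can be sketched in a few lines; the slab-class sizes below are hypothetical values continuing the Growth Factor example above:

```python
import bisect

# Hypothetical slab-class chunk sizes, purely for illustration.
SLAB_CLASSES = [88, 112, 144, 184, 232]

def pick_chunk(item_size):
    """Pick the smallest chunk that fits the item, as Memcached's slab
    allocator does; returns (chunk_size, wasted_bytes)."""
    i = bisect.bisect_left(SLAB_CLASSES, item_size)
    if i == len(SLAB_CLASSES):
        raise ValueError("item too large for any slab class")
    chunk = SLAB_CLASSES[i]
    return chunk, chunk - item_size

# 100 bytes of data lands in the 112-byte class, wasting 12 bytes here.
print(pick_chunk(100))  # (112, 12)
```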
Redis's memory management is implemented mainly in the zmalloc.h and zmalloc.c files of the source code. To simplify management, after allocating a block of memory Redis stores the block's size in the block's header. real_ptr is the pointer returned by malloc; Redis writes the size of the block (a value of type size_t, whose width is known) into the header and returns ret_ptr, which points just past the header. When the memory needs to be released, ret_ptr is passed to the memory manager, which can easily compute real_ptr from it and then pass real_ptr to free.
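The size-prefix trick can be simulated in Python with a fake heap; the function names mirror zmalloc/zfree but everything else (the dict-based heap, the address counter) is invented for illustration:

```python
import struct

PREFIX = struct.calcsize("N")   # width of a size_t header, as zmalloc uses

heap = {}  # simulated heap: real_ptr -> block bytes

def zmalloc(size, _next_addr=[0]):
    """Allocate size bytes plus a header; store the size in the header
    and return ret_ptr = real_ptr + PREFIX, mimicking Redis's zmalloc."""
    real_ptr = _next_addr[0]
    block = bytearray(PREFIX + size)
    block[:PREFIX] = struct.pack("N", size)   # write the size into the header
    heap[real_ptr] = block
    _next_addr[0] += PREFIX + size
    return real_ptr + PREFIX

def zfree(ret_ptr):
    """Recover real_ptr from ret_ptr, read the stored size, free the block."""
    real_ptr = ret_ptr - PREFIX               # step back over the header
    size = struct.unpack("N", bytes(heap[real_ptr][:PREFIX]))[0]
    del heap[real_ptr]
    return size   # the real zfree uses this to decrement used_memory

p = zmalloc(64)
print(zfree(p))  # 64
```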
Redis records all memory allocations by defining an array whose length is ZMALLOC_MAX_ALLOC_STAT. Each element of the array records the number of memory blocks of the corresponding size allocated by the current program, with the block size serving as the element's index. In the source code this array is zmalloc_allocations; zmalloc_allocations[16] is the number of 16-byte memory blocks that have been allocated. The static variable used_memory in zmalloc.c records the total amount of memory allocated. In general, then, Redis uses a thin wrapper around malloc/free, which is much simpler than Memcached's memory management scheme.
3. Data Persistence Support

Although Redis is a memory-based storage system, it supports persistence of in-memory data and provides two main persistence policies: RDB snapshots and the AOF log. Memcached does not support data persistence.
1) RDB Snapshot

Redis supports persisting a snapshot of the current data set into a data file, the RDB snapshot. But how do you generate a snapshot of a database that is continuously being written to? Redis relies on the copy-on-write mechanism of the fork system call: to generate a snapshot, the current process forks a child process, and the child process loops over all the data and writes it into an RDB file. The save directive of Redis configures when RDB snapshots are generated; for example, you can configure a snapshot every 10 minutes, or after every 1000 writes, and multiple rules can be combined. These rules are defined in the Redis configuration file, and you can also set them with the CONFIG SET command while Redis is running, without restarting.
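In redis.conf, such rules might look like the following fragment (the specific thresholds are illustrative, not Redis defaults); each line means "snapshot after <seconds> if at least <changes> writes occurred", and the rules combine:

```
save 600 1      # snapshot after 10 minutes if at least 1 key changed
save 60 1000    # snapshot after 60 seconds if at least 1000 keys changed
```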
The RDB file of Redis will not be corrupted, because the write operation is performed in a new process: when a new RDB file is generated, the child process first writes the data to a temporary file and then renames the temporary file to the RDB file via an atomic rename system call, so an existing RDB file remains valid no matter when a failure occurs. The RDB file is also part of the internal implementation of Redis master-slave synchronization. RDB does have a shortcoming: once the database has a problem, the data in the RDB file is not up to date, and everything written between the last snapshot generation and the Redis downtime is lost. For some businesses, this is tolerable.
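The write-temp-then-rename pattern is easy to demonstrate; the function name and file contents below are invented for illustration, and the atomicity of os.rename assumes POSIX semantics with source and target on the same filesystem:

```python
import os
import tempfile

def save_snapshot(data, path="dump.rdb"):
    """Sketch of the pattern Redis uses for RDB files: write the full
    snapshot to a temporary file, fsync it, then atomically rename it
    over the old file, so readers only ever see a complete snapshot."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # ensure the bytes hit disk before the rename
    os.rename(tmp, path)       # atomic replacement of the old snapshot

save_snapshot(b"snapshot-bytes")
print(open("dump.rdb", "rb").read())  # b'snapshot-bytes'
```

A crash before the rename leaves the old dump.rdb untouched; a crash after it leaves the complete new one.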
2) AOF Log

AOF stands for append-only file; it is an append-only log file. Unlike the binlog of a typical database, the AOF file is human-readable plain text whose content consists of standard Redis commands. Only commands that may modify data are appended to the AOF file. Since every data-modifying command produces a log entry, the AOF file grows larger and larger, so Redis provides another feature called AOF rewrite, which regenerates the AOF file. In the new file, each record is produced by a single operation, whereas the old file may record many operations on the same value. The generation process is similar to RDB's: a forked child process traverses the data and writes it to a new temporary AOF file; during this process, all write-operation logs are still written to the old AOF file and also recorded in an in-memory buffer. When the traversal completes, all logs in the buffer are written to the temporary file at once, and then an atomic rename replaces the old AOF file with the new one.
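The append-then-replay idea, and why a rewrite shrinks the log, can be sketched with a toy class (all names here are invented for illustration, and only a SET command is modeled):

```python
class MiniAOF:
    """Toy sketch of an append-only command log: every write command is
    appended as plain text, and replaying the log rebuilds the data set."""

    def __init__(self):
        self.log = []          # stands in for the append-only file

    def set(self, key, value):
        self.log.append(f"SET {key} {value}")   # append, never overwrite

    def replay(self):
        """Rebuild the data set by re-running every logged command."""
        data = {}
        for line in self.log:
            _, key, value = line.split(" ", 2)
            data[key] = value
        return data

    def rewrite(self):
        """AOF rewrite: regenerate a log with one command per final value."""
        self.log = [f"SET {k} {v}" for k, v in self.replay().items()]

aof = MiniAOF()
aof.set("a", "1")
aof.set("a", "2")   # the same key is logged twice in the old file
aof.rewrite()
print(aof.log)      # ['SET a 2'] -- one command per key after the rewrite
```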
AOF writing is a file write operation: its purpose is to put the operation log onto disk, so it goes through the same write process discussed above. After Redis calls write on the AOF, the appendfsync option controls when fsync is called to flush the data to disk. The following three appendfsync settings increase progressively in safety.
- appendfsync no: When appendfsync is set to no, Redis does not actively call fsync to flush the AOF log to disk; everything depends on the operating system's scheduling. On most Linux systems, the buffer is flushed to disk about every 30 seconds.
- appendfsync everysec: When appendfsync is set to everysec, Redis by default calls fsync once per second to flush the buffer to disk. However, if an fsync call takes longer than one second, Redis adopts a delayed-fsync policy and waits another second: it performs the fsync two seconds later, and that fsync runs no matter how long it takes. Because the file descriptor is blocked during fsync, the current write operation blocks as well. The conclusion: in the vast majority of cases Redis fsyncs every second; in the worst case, once every two seconds. This behavior, known in most database systems as group commit, combines the data of multiple write operations and flushes the log to disk in one pass.
- appendfsync always: When appendfsync is set to always, fsync is called for every write operation. This is the safest setting for data, but because fsync runs every time, performance suffers accordingly.
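The difference between these settings is when fsync happens relative to the write; a minimal sketch of the "always" policy (function name and file path are illustrative):

```python
import os

def append_command(path, command, always_fsync=True):
    """Sketch of AOF appending under the 'always' policy: each command is
    appended and fsync'd before the call returns, trading throughput for
    durability. Under 'everysec' the fsync would run on a timer instead,
    and under 'no' it would be left entirely to the operating system."""
    with open(path, "a") as f:
        f.write(command + "\n")
        f.flush()                    # push Python's user-space buffer to the OS
        if always_fsync:
            os.fsync(f.fileno())     # force the OS buffer onto the disk

append_command("appendonly.aof", "SET key value")
print(open("appendonly.aof").read())
```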
For general business requirements, we recommend RDB for persistence, because its overhead is much lower than that of AOF logs. For applications that cannot tolerate any data loss, we recommend the AOF log.
4. Differences in Cluster Management

Memcached is a full-memory data caching system. Redis, although it supports data persistence, keeps all data in memory, which is the essence of its high performance. For a memory-based storage system, the machine's physical memory is the maximum amount of data the system can hold. If the data to be processed exceeds the physical memory of a single machine, a distributed cluster must be built to expand storage capacity.
Memcached itself does not support distribution, so distributed storage for Memcached can only be implemented on the client side with distributed algorithms such as consistent hashing. Before sending data to the Memcached cluster, the client computes the data's target node with its built-in distributed algorithm and sends the data directly to that node for storage. Likewise, when querying, the client computes the node holding the data and sends the query request directly to that node to fetch it.
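A minimal consistent-hash ring of the kind such clients use can be sketched as follows; the node names and replica count are illustrative, and real clients (e.g. libketama-style ones) use many virtual nodes per server to smooth the distribution:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring as used by Memcached clients:
    nodes are hashed onto a ring many times (virtual nodes), and a key
    is stored on the first node found clockwise from the key's hash."""

    def __init__(self, nodes, replicas=100):
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)   # virtual nodes smooth the distribution
        )

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key):
        """Walk clockwise from the key's hash to the next virtual node."""
        i = bisect.bisect(self.ring, (self._hash(key),))
        return self.ring[i % len(self.ring)][1]   # wrap around the ring

ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.get_node("user:42"))   # the same key always maps to the same node
```

The advantage over a plain `hash(key) % N` scheme is that adding or removing a server remaps only the keys adjacent to its ring positions, not the entire key space.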
Unlike Memcached, which can only implement distribution on the client side, Redis prefers to build distributed storage on the server side, and the latest versions of Redis support it. Redis Cluster is a distributed Redis implementation that tolerates single points of failure (SPOF); it has no central node and scales linearly. In Redis Cluster, nodes communicate with each other over a binary protocol, while nodes and clients communicate over an ASCII protocol. In terms of data placement, Redis Cluster divides the entire key space into 16384 hash slots, and each node can hold one or more hash slots, which means the cluster can in theory contain at most 16384 nodes. The distributed algorithm used by Redis Cluster is simple: crc16(key) % HASH_SLOTS_NUMBER.
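The slot computation can be reproduced directly; the CRC16 variant below is the CCITT/XModem one named in the Redis Cluster specification, which also gives 0x31C3 as the reference checksum of "123456789" (hash-tag extraction with `{...}` is omitted here for brevity):

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XModem), as used by Redis Cluster for key slots:
    polynomial 0x1021, initial value 0x0000, no bit reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

HASH_SLOTS_NUMBER = 16384

def key_slot(key: str) -> int:
    """Redis Cluster's placement rule: crc16(key) % HASH_SLOTS_NUMBER."""
    return crc16(key.encode()) % HASH_SLOTS_NUMBER

# Reference value from the cluster spec: CRC16 of "123456789" is 0x31C3.
print(hex(crc16(b"123456789")))  # 0x31c3
print(key_slot("foo"))           # the slot, hence the node, for this key
```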
To keep data available despite single points of failure, Redis Cluster introduces Master nodes and Slave nodes. In Redis Cluster, each Master node has two corresponding Slave nodes for redundancy, so the downtime of any two nodes in the cluster does not make data unavailable. When a Master node goes down, the cluster automatically elects one of its Slave nodes as the new Master.
From: http://h2ex.com/1223
Address: http://www.linuxprobe.com/redisVSmemcached.html