Analysis of Redis Architecture Design

I. Preface

We use Redis in recent projects, so we need a deep understanding of its design and implementation in order to apply it to suitable business scenarios.

The analysis starts from main and follows the Redis startup process step by step. Along the way, following the initialization order, it covers the function and principle of each Redis module.
II. Redis startup process

1. Initialize the server variable and set the Redis default values.
2. Read the configuration file and the parameters passed on the command line, and replace the default values set on the server.
3. Initialize the server function modules. This step initializes processes, client linked lists, shared objects, initialized data, and network connections.
4. Reload data from RDB or AOF.
5. Make preparations before the network listener service starts.
6. Enable event listening and start accepting client requests.
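
Steps 1 and 2 of the startup process amount to layering configuration: defaults first, then the configuration file, then command-line parameters. The Python sketch below illustrates that order only; it is not Redis code. The function names (parse_config_text, parse_argv, build_config) and the three sample settings are assumptions chosen for the example, and multi-valued directives such as save are not handled.

```python
# Hypothetical defaults for the example; real Redis has many more settings.
DEFAULTS = {"port": "6379", "appendonly": "no", "appendfsync": "everysec"}


def parse_config_text(text):
    """Parse 'keyword value' lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        settings[parts[0].lower()] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def parse_argv(argv):
    """Parse '--keyword value' pairs passed on the command line."""
    settings = {}
    i = 0
    while i < len(argv):
        if argv[i].startswith("--") and i + 1 < len(argv):
            settings[argv[i][2:].lower()] = argv[i + 1]
            i += 2
        else:
            i += 1
    return settings


def build_config(config_text="", argv=()):
    """Apply the layers in order: defaults, config file, command line."""
    config = dict(DEFAULTS)
    config.update(parse_config_text(config_text))
    config.update(parse_argv(list(argv)))
    return config
```

For example, build_config("port 6380\n", ["--port", "7000"]) keeps the command-line value for port, because the command line is applied last.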


The following is a detailed understanding of each module during the startup process. (Currently, only the background thread system and slow query log system are analyzed)


III. Redis data persistence

A question many people raise when using Redis is: what happens if Redis goes down? Will data be lost? Let's look at the data persistence solutions Redis provides and analyze their advantages and disadvantages from their principles. From that we can find the application scenarios Redis is suited to.

1. RDB Persistence Solution

While Redis is running, the RDB program saves a snapshot of the current in-memory database to disk. When Redis restarts, the RDB program restores the database by reloading the RDB file. From this description, RDB mainly includes two functions: save and load. For the RDB implementation, see src/rdb.c.

A) Save (rdbSave)
rdbSave writes the in-memory database to disk in RDB format. If an RDB file already exists, it is replaced. The main process is blocked while the RDB file is being saved, so new client requests cannot be processed until the save completes. To avoid blocking the main process, Redis provides the rdbSaveBackground function, which calls rdbSave in a newly created child process. The main process continues to process new client requests in the meantime, and when the child process finishes saving, it sends a signal to the main process.
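
The difference between the blocking save and the background save can be sketched in Python as follows. This is only an illustration under several assumptions: it needs a POSIX system (os.fork), JSON stands in for the RDB format, the temp-file-then-rename step is a choice made for this sketch, and the function names merely echo rdbSave and rdbSaveBackground rather than reproduce them. The parent reaps the child with waitpid instead of handling a signal.

```python
import json
import os
import tempfile


def rdb_save(db, path):
    """Blocking save: write a temp file, then replace the old file in one step."""
    tmp = "%s.tmp-%d" % (path, os.getpid())
    with open(tmp, "w") as f:
        json.dump(db, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # an existing snapshot file is replaced


def rdb_save_background(db, path):
    """Fork a child that saves its own copy of memory; return the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child: it sees the data as it was at fork time.
        try:
            rdb_save(db, path)
        except Exception:
            os._exit(1)
        os._exit(0)
    return pid  # Parent: free to keep serving requests.


def rdb_load(path):
    with open(path) as f:
        return json.load(f)


def demo():
    db = {"a": "1", "b": "2"}
    path = os.path.join(tempfile.mkdtemp(), "dump.json")
    child = rdb_save_background(db, path)
    db["c"] = "3"  # the parent keeps modifying data while the child saves
    _, status = os.waitpid(child, 0)
    return os.WEXITSTATUS(status), rdb_load(path), db
```

In demo(), the loaded snapshot contains only the keys that existed at fork time, while the parent's live data also contains the key written afterwards.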

B) Load (rdbLoad)
When Redis starts, the configured persistence mode determines whether the RDB file is read and its objects are loaded into memory. During RDB loading, one pending client request is processed after every 1000 keys loaded. However, only the publish/subscribe commands (PUBLISH, SUBSCRIBE, PSUBSCRIBE, UNSUBSCRIBE, and PUNSUBSCRIBE) are handled at this point; all other commands return an error. These commands can be handled because publish/subscribe data is not written to the database, that is, it is not stored in the Redis database.
Disadvantages of RDB: to discuss the disadvantages of RDB, we first need the concept of save points. The default redis.conf file contains the following default configuration:
# save <seconds> <changes>
save 900 1      # if at least 1 key is modified within 15 minutes
save 300 10     # if at least 10 keys are modified within 5 minutes
save 60 10000   # if at least 10000 keys are modified within 60 seconds
If any of the above conditions is met, a snapshot is saved. To keep I/O read/write performance from becoming the bottleneck of Redis, save points are generally set to large values. This leads to two disadvantages:
1. If the save point is set too large, too much data is lost on downtime. If it is set too small, I/O may become a bottleneck.
2. Saving may take a long time when the dataset is large, which may leave Redis unable to process client requests for a short time.
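
The save-point rule (take a snapshot as soon as any one condition is met) can be sketched in Python as follows. The function name should_snapshot and its parameters are assumptions for this example; the strict comparison on elapsed seconds is also a choice made here, not a statement about the exact check in Redis.

```python
# (seconds, changes) pairs taken from the default configuration shown above.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save,
                    save_points=SAVE_POINTS):
    """Return True if at least one save point is satisfied."""
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

A larger seconds value means fewer snapshots and less I/O, but also a longer window of writes that can be lost in a crash, which is the trade-off described above.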
2. AOF Persistence Solution

AOF records every write command executed against the database to the AOF file as protocol text, and in this way records the database state.

A) Save
1. Convert the client's command to the network protocol format.
2. Append the protocol string to the variable server.aof_buf.
3. When the AOF system reaches the configured condition, aof_fsync(file descriptor) is called to write the data to disk.
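
The first two save steps can be sketched in Python as follows: a command is encoded in the Redis protocol format (an array of bulk strings) and appended to an in-memory buffer. The helper names encode_command and append_to_buffer are assumptions for this example, and the bytearray only stands in for server.aof_buf.

```python
def encode_command(*args):
    """Encode a command as a protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def append_to_buffer(aof_buf, *args):
    """Append the encoded command to the buffer (stand-in for server.aof_buf)."""
    aof_buf.extend(encode_command(*args))
    return aof_buf
```

For example, encode_command("SET", "key", "value") produces the bytes *3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n.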

The condition mentioned in step 3 is the key to AOF performance. Redis currently supports three save conditions:

1. AOF_FSYNC_NO: do not save

In this mode, each time a client command is executed, the protocol string is appended to server.aof_buf but is not written to disk.

A write occurs only when:

1. Redis is shut down normally.

2. The AOF function is disabled.

3. The system write cache is full, or the periodic background save operation runs.

In all three cases the main process is blocked, so client requests cannot be served until the write completes.

2. AOF_FSYNC_EVERYSEC: save every second

In this mode the write and save are handled in the background and do not block the main process. If a crash occurs, at most about 2 seconds of data is lost. This is also the default option.

3. AOF_FSYNC_ALWAYS: save after every command

In this mode every client command is saved, so no data is lost. The disadvantage is that performance drops sharply, because every command triggers its own save and the main process is blocked while it runs.
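
The three save conditions can be compared with a small Python sketch. It is a simplification under stated assumptions: the class AofWriter, its methods, and the helper run_policy are hypothetical, the fsync is done inline rather than in the background, and the clock is passed in as a parameter so the behaviour is easy to follow.

```python
import os
import tempfile


class AofWriter:
    """Toy AOF writer: buffer commands, write them out, fsync per policy."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: %s" % policy)
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy
        self.buf = bytearray()
        self.last_fsync = 0.0
        self.fsync_calls = 0  # counted so the policies can be compared

    def append(self, payload):
        self.buf += payload

    def flush(self, now):
        """Hand buffered data to the OS, then fsync if the policy asks for it."""
        if self.buf:
            os.write(self.fd, bytes(self.buf))  # small writes; no partial-write loop
            self.buf.clear()
        due = self.policy == "everysec" and now - self.last_fsync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.fd)
            self.last_fsync = now
            self.fsync_calls += 1

    def close(self):
        os.close(self.fd)


def run_policy(policy):
    """Append three commands at t = 0.0, 0.4, 1.2 and count the fsync calls."""
    path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")
    writer = AofWriter(path, policy)
    for now in (0.0, 0.4, 1.2):
        writer.append(b"*1\r\n$4\r\nPING\r\n")
        writer.flush(now)
    writer.close()
    return writer.fsync_calls
```

With the always policy every command costs one fsync, with everysec the cost is bounded by elapsed time, and with no the sketch never calls fsync itself and leaves the timing to the operating system.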

B) Read
AOF stores data in the protocol format, so restoring the full database state only requires converting the AOF content back into commands and replaying them through a simulated client. The read process is as follows:
1. Create a simulated client.
2. Read the text stored in the AOF and restore it to the original command and arguments, then send the command request through the simulated client.
3. Repeat step 2 until the whole AOF file has been read.
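
The read process can be sketched in Python as follows. The names read_commands, FakeClient, and replay are assumptions for this example, and the simulated client supports only SET and DEL to keep the sketch short.

```python
import io


def read_commands(stream):
    """Yield each command in the stream as a list of byte-string arguments."""
    while True:
        header = stream.readline()
        if not header:
            return  # end of the AOF data
        if not header.startswith(b"*"):
            raise ValueError("expected an array header")
        argc = int(header[1:].strip())
        args = []
        for _ in range(argc):
            size_line = stream.readline()
            if not size_line.startswith(b"$"):
                raise ValueError("expected a bulk string header")
            size = int(size_line[1:].strip())
            data = stream.read(size + 2)  # payload plus trailing CRLF
            args.append(data[:size])
        yield args


class FakeClient:
    """Simulated client: executes commands directly against a dict."""

    def __init__(self):
        self.db = {}

    def execute(self, args):
        name = args[0].upper()
        if name == b"SET":
            self.db[args[1]] = args[2]
        elif name == b"DEL":
            for key in args[1:]:
                self.db.pop(key, None)
        else:
            raise ValueError("unsupported command in this sketch")


def replay(aof_bytes):
    """Steps 1 to 3: create the client, then replay commands until the end."""
    client = FakeClient()
    for args in read_commands(io.BytesIO(aof_bytes)):
        client.execute(args)
    return client.db
```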
Because AOF saves every command to disk, the file grows larger over time and reading it becomes slower. Redis provides an AOF rewrite mechanism to reduce the file size. The idea is as follows:
LPUSH list 1 2 3 4 5
LPOP list
LPOP list
LPUSH list 1
At first, all four commands are saved to the AOF file. After these commands the list holds 1 3 2 1 from head to tail, so after an AOF rewrite they can be replaced by a single command that produces the same list:
RPUSH list 1 3 2 1
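
The rewrite idea can be sketched in Python as follows: work out the final state of each list, then emit one command per key that rebuilds it. The function names apply_command and rewrite are assumptions for this example, only LPUSH, RPUSH, and LPOP are supported, and the sketch obtains the final state by replaying commands purely for convenience; a real rewrite would start from the data currently in memory.

```python
from collections import deque


def apply_command(db, command):
    """Apply one list command to db, a dict mapping key -> deque of values."""
    name, key, *values = command.split()
    name = name.upper()
    items = db.setdefault(key, deque())
    if name == "LPUSH":
        for value in values:
            items.appendleft(value)  # each value becomes the new head
    elif name == "RPUSH":
        items.extend(values)
    elif name == "LPOP":
        if items:
            items.popleft()
    else:
        raise ValueError("unsupported command in this sketch")
    if not items:
        del db[key]  # an empty list does not need to be rebuilt


def rewrite(commands):
    """Return one RPUSH per key that recreates the final list contents."""
    db = {}
    for command in commands:
        apply_command(db, command)
    return ["RPUSH %s %s" % (key, " ".join(items)) for key, items in db.items()]
```

Calling rewrite with the four commands from the example returns a list containing a single RPUSH command for the key list.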
In addition, so that normal AOF writes are not affected during a rewrite, Redis adds the concept of an AOF rewrite cache. That is, when AOF is enabled and a rewrite is running, Redis writes each command not only to the AOF file but also to the AOF rewrite cache. This isolates AOF writes from the rewrite and ensures that writes are not blocked while rewriting.

C) AOF rewrite process
1. When the AOF rewrite finishes, a signal is sent to the main process.
2. The main process writes all data in the AOF rewrite cache to the new AOF file.
3. The new AOF file overwrites the original AOF file.
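
The role of the rewrite cache in this process can be sketched with a small in-memory Python model. Everything here is an assumption made for illustration: the class AofState, its methods, the use of Python lists in place of files, and a dict copy in place of the rewriting process's view of the data.

```python
class AofState:
    """Toy model: the AOF is a list of commands, not a real file."""

    def __init__(self):
        self.aof = []          # current AOF contents
        self.rewrite_buf = []  # commands received while a rewrite is running
        self.rewriting = False
        self.snapshot = None

    def write(self, db, key, value):
        """Normal write path: always append to the AOF, never blocked."""
        db[key] = value
        command = "SET %s %s" % (key, value)
        self.aof.append(command)
        if self.rewriting:
            self.rewrite_buf.append(command)  # also remembered for the new file

    def start_rewrite(self, db):
        self.rewriting = True
        self.snapshot = dict(db)  # stands in for the rewriter's view of the data

    def finish_rewrite(self):
        """Build the new AOF, append the cached commands, replace the old AOF."""
        new_aof = ["SET %s %s" % (k, v) for k, v in self.snapshot.items()]
        new_aof.extend(self.rewrite_buf)
        self.aof = new_aof
        self.rewrite_buf = []
        self.rewriting = False


def demo_rewrite():
    db, state = {}, AofState()
    state.write(db, "a", "1")
    state.write(db, "a", "2")
    state.write(db, "b", "1")
    state.start_rewrite(db)
    state.write(db, "a", "3")  # arrives while the rewrite is in progress
    state.finish_rewrite()
    return state.aof
```

In demo_rewrite(), the two writes to a before the rewrite collapse into one command, and the write that arrived during the rewrite is preserved because it was also placed in the rewrite cache.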

D) AOF disadvantages
1. AOF files are generally larger than RDB files for the same dataset.
2. How AOF performance compares with RDB depends on the fsync mode AOF uses.
[Figure: persistence operations on the server when a client sends a request to the Redis server]



IV. Implementation of the Redis database
Redis is a key-value pair database, and the database is called the key space. To implement this kind of KV storage, Redis uses two types of data structures:

1. Dictionary
The Redis dictionary is implemented with a hash table. We will not detail the hash table implementation here, but it is worth noting that the Redis hash table provides a good rehash solution. The idea is a good one and can be borrowed by other applications. The solution is called "progressive rehash".

Hash tables are all implemented in similar ways, so why does every open source project develop its own hash data structure?
A study of the hash implementations in the PHP kernel and in Redis shows that the application scenario determines the customization needed for better performance. (For the PHP hash implementation, see "hashtable, an artifact in the PHP kernel".)
A) PHP is mainly used in web scenarios, where the data of a single request is isolated and the number of hash entries is limited, so a rehash completes quickly.
The PHP kernel therefore uses a blocking rehash: no other operation can be performed on the hash table while it is being rehashed.

B) Redis, by contrast, is a resident process that receives client requests and handles all kinds of transactions. The data it operates on is related across requests and the data volume is large. If the PHP kernel's approach were used, the following would happen:
while a hash table is being rehashed, all client requests are blocked and concurrent performance drops sharply.
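
Progressive rehash avoids this by spreading the migration over many operations instead of doing it all at once. The Python sketch below is a minimal illustration, not the Redis implementation: the class ProgressiveDict, its growth rule (double when the entry count reaches the table size), and the one-bucket-per-operation step are assumptions chosen for clarity.

```python
class ProgressiveDict:
    """Toy hash table that migrates one bucket per operation while resizing."""

    def __init__(self, size=4):
        self.tables = [[[] for _ in range(size)], None]  # [old, new]
        self.rehash_idx = -1  # -1 means no rehash is in progress
        self.used = 0

    def _rehashing(self):
        return self.rehash_idx != -1

    def _rehash_step(self):
        """Move one non-empty bucket from the old table to the new one."""
        if not self._rehashing():
            return
        old, new = self.tables
        while self.rehash_idx < len(old) and not old[self.rehash_idx]:
            self.rehash_idx += 1  # skip empty buckets
        if self.rehash_idx < len(old):
            for key, value in old[self.rehash_idx]:
                new[hash(key) % len(new)].append((key, value))
            old[self.rehash_idx] = []
            self.rehash_idx += 1
        if self.rehash_idx >= len(old):
            self.tables = [new, None]  # migration finished
            self.rehash_idx = -1

    def _find(self, key):
        """Look in both tables, since a key may live in either during rehash."""
        for table in self.tables:
            if table is None:
                continue
            bucket = table[hash(key) % len(table)]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    return bucket, i
        return None, -1

    def set(self, key, value):
        self._rehash_step()
        bucket, i = self._find(key)
        if bucket is not None:
            bucket[i] = (key, value)
            return
        if not self._rehashing() and self.used >= len(self.tables[0]):
            # Start a rehash: allocate the bigger table, migrate later.
            self.tables[1] = [[] for _ in range(len(self.tables[0]) * 2)]
            self.rehash_idx = 0
        # New keys go to the new table while a rehash is in progress.
        target = self.tables[1] if self._rehashing() else self.tables[0]
        target[hash(key) % len(target)].append((key, value))
        self.used += 1

    def get(self, key, default=None):
        self._rehash_step()
        bucket, i = self._find(key)
        return bucket[i][1] if bucket is not None else default
```

Each set or get pays for at most one bucket migration, so no single request has to wait for the whole table to be rebuilt.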

[Figure: dictionary after initialization]
[Figure: dictionary after adding a new element]
[Figure: rehash execution process]


Note: Some parts may reflect misunderstandings on my part. If you find any errors, please point them out.
