Analyze Redis Architecture Design

Last Update:2015-11-24 Source: Internet

Author: User

Tags rehash savepoint

I. Preface

Because Redis is used in my recent projects, I need a deeper understanding of its design and implementation in order to apply it in suitable business scenarios. My analysis starts from the main entry point and works step by step through the Redis startup process. Along the way, following the initialization sequence, I look at the function and principle of each Redis module.
II. The Redis Startup Process
1. Initialize the server variable and set the Redis default values.
2. Read the configuration file and accept the parameters passed on the command line, overriding the defaults in the server settings.
3. Initialize the server's functional modules. This step includes setting up process signal handling, the client linked list, shared objects, initial data, the network connections, and so on.
4. Load data from the RDB or AOF file.
5. Perform the preparation work before starting the network listening service.
6. Start the event loop and begin accepting client requests.
The startup process is more intuitive when viewed as a diagram.

The following sections examine each module involved in the startup process in detail. (So far, only the background thread system and the slow query log system have been analyzed.)
III. Redis Data Persistence Schemes
When using Redis, many people raise the same question: what happens if Redis goes down? Will data be lost? Let's look at the persistence schemes Redis provides and weigh their pros and cons based on how they work. From that, we can derive the application scenarios that suit Redis.
1. RDB Persistence Solution
While Redis is running, the RDB program saves a snapshot of the current in-memory database to disk; when Redis restarts, the RDB program restores the database by loading the RDB file. As this description shows, RDB mainly consists of two functions, rdbSave and rdbLoad. The implementation can be found in src/rdb.c.
a) Save (rdbSave)

rdbSave is responsible for saving the in-memory database to disk in RDB format; if an RDB file already exists, it is replaced. The main process is blocked while the RDB file is being saved, and new client requests are not processed until the save completes. To avoid blocking the main process, Redis provides the rdbSaveBackground function, which calls rdbSave in a new child process. The child sends a signal to the main process when the save is complete, and in the meantime the main process can continue to handle new client requests.
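The difference between the blocking save and the background save can be sketched as follows. This is a minimal Python illustration, not Redis code: `save` and `save_background` are hypothetical names, JSON stands in for the RDB format, and the parent learns about completion through `os.waitpid` where the text describes a signal. It assumes a Unix-like system where `os.fork` is available.

```python
import json
import os
import tempfile


def save(db, path):
    """Blocking save: write a snapshot to a temp file, then swap it in."""
    # Writing to a temp file first means an existing snapshot is only
    # replaced once the new one is complete.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(db, f)      # assumption: JSON stands in for the RDB format
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)     # replaces the old file if one exists


def save_background(db, path):
    """Non-blocking save: fork a child that runs save(), return its pid."""
    pid = os.fork()
    if pid == 0:
        # Child process: it sees a copy of the parent's memory at fork time.
        try:
            save(db, path)
        except Exception:
            os._exit(1)
        os._exit(0)
    # Parent process: returns immediately and can keep serving requests.
    return pid
```

The parent can later call `os.waitpid(pid, 0)` (or poll with `os.WNOHANG`) to find out whether the child finished successfully.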

b) Read (rdbLoad)

When Redis starts, it decides, based on the configured persistence mode, whether to read the RDB file and load its objects into memory. While an RDB file is loading, Redis handles pending client requests once for every 1000 keys loaded, but currently only the publish/subscribe commands (PUBLISH, SUBSCRIBE, PSUBSCRIBE, UNSUBSCRIBE, PUNSUBSCRIBE) are processed; all others return an error. This works because publish/subscribe never writes to the database, so its data does not live in the Redis database.
Disadvantages of RDB: to discuss the drawbacks of RDB, we first need the concept of a savepoint. The default redis.conf contains this default configuration:

# save <seconds> <changes>
save 900 1      # if at least 1 key was modified within 15 minutes
save 300 10     # if at least 10 keys were modified within 5 minutes
save 60 10000   # if at least 10000 keys were modified within 60 seconds

This means that a snapshot is saved whenever any one of the conditions above is met. To keep disk I/O from becoming a Redis bottleneck, savepoints are usually set to fairly large values. This leads to two problems:
1. If the savepoint interval is too large, too much data may be lost in a crash; if it is too small, saving becomes an I/O bottleneck.
2. With a large dataset, saving can be time-consuming, which may leave Redis unable to process client requests for a short period.
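The savepoint check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Redis source: `should_snapshot` and its parameters are hypothetical names, `dirty` stands for the number of key modifications since the last save, and the save points are assumed to be the stock defaults (900/1, 300/10, 60/10000).

```python
# (seconds, changes) pairs; assumed to match the default save lines in redis.conf
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(dirty, last_save, now, save_points=SAVE_POINTS):
    """Return True if any savepoint is satisfied.

    dirty     -- modifications since the last successful save
    last_save -- timestamp (seconds) of the last successful save
    now       -- current timestamp (seconds)
    """
    elapsed = now - last_save
    # A savepoint fires when enough keys changed AND enough time has passed.
    return any(dirty >= changes and elapsed > seconds
               for seconds, changes in save_points)
```

For example, a single modified key triggers a snapshot only once more than 900 seconds have passed, while 10000 modifications trigger one after just over 60 seconds. This makes the trade-off visible: larger intervals mean fewer disk writes but a wider window of data that can be lost.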

2. AOF Persistence Solution
AOF records every write command executed against the database to the AOF file in the form of protocol text, thereby recording the database state.
a) Save

1. Convert the command requested by the client into the network protocol format.
2. Append the protocol string to the server.aof_buf variable.
3. When the AOF system reaches the configured condition, call aof_fsync(file descriptor) to write the data to disk.
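Steps 1 and 2 can be illustrated with a short Python sketch. The wire format shown is the Redis protocol for commands (an array of bulk strings); the function name `encode_command` and the use of a `bytearray` as a stand-in for server.aof_buf are assumptions made for this example.

```python
def encode_command(*args):
    """Encode a command as a Redis protocol array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]               # number of arguments
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # length, then payload
    return b"".join(out)


# Hypothetical stand-in for server.aof_buf: commands accumulate here
# until the save policy decides to write them out.
aof_buf = bytearray()
aof_buf += encode_command("SET", "key", "value")
```

After this runs, `aof_buf` holds `*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n`, which is exactly what a client would send over the network for the same command.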

The configured condition in step 3 is the key to AOF performance. Redis currently supports three save policies:

1. AOF_FSYNC_NO: do not save

In this mode, every client command executed appends its protocol string to server.aof_buf, but the save to disk is skipped.

A save happens only when:

1. Redis is shut down normally

2. The AOF feature is turned off

3. The system write cache is full, or the periodic background save runs

All three of these cases block the main process, so client requests cannot be served until the save finishes.

2. AOF_FSYNC_EVERYSEC: save once per second

The save is invoked by a background thread, so it does not block the main process. In the event of a crash, at most about 2 seconds of data is lost. This is also the default setting.

3. AOF_FSYNC_ALWAYS: save after every command executed

In this mode, every client command is guaranteed to be saved, so no data is lost. The downside is greatly reduced performance, because each save is performed synchronously and blocks the main process.
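The three policies can be compared with a small Python sketch. It is only an illustration: `TinyAof`, its policy strings, and the injectable `clock` are hypothetical, and the sync runs inline here rather than in a background thread. `os.write` and `os.fsync` are the real system-call wrappers.

```python
import os
import time


class TinyAof:
    """Toy append-only file with 'no', 'everysec' and 'always' sync policies."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy
        self.clock = clock
        self.buf = bytearray()        # plays the role of server.aof_buf
        self.last_fsync = clock()
        self.fsync_calls = 0          # kept only to make the behavior observable

    def feed(self, data):
        """Buffer the protocol text of one executed command."""
        self.buf += data

    def flush(self):
        """Hand the buffer to the OS, then sync to disk if the policy says so."""
        if self.buf:
            os.write(self.fd, bytes(self.buf))
            self.buf.clear()
        now = self.clock()
        due = self.policy == "everysec" and now - self.last_fsync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.fd)
            self.last_fsync = now
            self.fsync_calls += 1
        # policy "no": leave it to the operating system to write the data out

    def close(self):
        self.flush()
        os.fsync(self.fd)             # a normal shutdown always syncs
        os.close(self.fd)
```

With `policy="always"` every `flush` pays for an `fsync`; with `"everysec"` at most one `fsync` per second is issued; with `"no"` only `close` forces the data to disk.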

b) Read

The AOF file stores data in protocol format, so converting its contents back into commands and re-executing them through a fake client restores the entire database state. The reading process is:
1. Create a fake client.
2. Read the text saved in the AOF file, restore it to the original command and arguments, and then issue the command through the fake client.
3. Repeat step 2 until the whole AOF file has been read.
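The loading loop can be sketched as follows. This is a simplified Python illustration, not the Redis loader: `parse_commands` and `replay` are hypothetical names, a plain dict stands in for the database, and only SET and DEL are supported to keep the example short.

```python
def parse_commands(data):
    """Yield each command in protocol-format bytes as a list of arguments."""
    pos = 0
    while pos < len(data):
        end = data.index(b"\r\n", pos)
        if data[pos:pos + 1] != b"*":
            raise ValueError("expected '*' at offset %d" % pos)
        argc = int(data[pos + 1:end])           # number of arguments
        pos = end + 2
        args = []
        for _ in range(argc):
            end = data.index(b"\r\n", pos)
            if data[pos:pos + 1] != b"$":
                raise ValueError("expected '$' at offset %d" % pos)
            length = int(data[pos + 1:end])     # byte length of this argument
            start = end + 2
            args.append(data[start:start + length])
            pos = start + length + 2            # skip payload and trailing CRLF
        yield args


def replay(data, db=None):
    """Re-execute every command against a dict, as the fake client would."""
    db = {} if db is None else db
    for args in parse_commands(data):
        name = args[0].upper()
        if name == b"SET":
            db[args[1]] = args[2]
        elif name == b"DEL":
            for key in args[1:]:
                db.pop(key, None)
        else:
            raise ValueError("unsupported command in this sketch: %r" % name)
    return db
```

Feeding it the protocol text of `SET k v` followed by `DEL k` yields an empty dict, which is the same state the original command sequence would have produced.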
Because AOF must save every command to disk, the file grows larger and larger over time, and loading it can become very slow. Redis provides an AOF rewrite mechanism to reduce the file size. The idea behind it is as follows:

RPUSH list 1 2 3 4 5
LPOP list
LPOP list
LPUSH list 1

Initially, all four commands are saved to the AOF file. After an AOF rewrite, they are reduced to a single command:

RPUSH list 1 3 4 5
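The rewrite idea can be sketched in Python: rather than compacting the old command log, read the current state of each key and emit one command that recreates it. The helper names `apply_command` and `rewrite` are hypothetical, and the sketch follows standard Redis list semantics, where RPUSH appends to the tail and LPUSH and LPOP operate on the head.

```python
from collections import deque


def apply_command(db, command):
    """Apply one list command (RPUSH, LPUSH, LPOP, RPOP) to the in-memory db."""
    name, key, *args = command.split()
    items = db.setdefault(key, deque())
    name = name.upper()
    if name == "RPUSH":
        items.extend(args)             # append to the tail, in order
    elif name == "LPUSH":
        for arg in args:
            items.appendleft(arg)      # each value becomes the new head
    elif name == "LPOP":
        if items:
            items.popleft()
    elif name == "RPOP":
        if items:
            items.pop()
    else:
        raise ValueError("unsupported command in this sketch: " + name)
    if not items:
        del db[key]                    # an empty list is removed from the db


def rewrite(db):
    """Emit one RPUSH per list that rebuilds its current contents."""
    return ["RPUSH {} {}".format(key, " ".join(items)) for key, items in db.items()]


db = {}
for line in ["RPUSH list 1 2 3 4 5", "LPOP list", "LPOP list", "LPUSH list 1"]:
    apply_command(db, line)
compact = rewrite(db)   # ["RPUSH list 1 3 4 5"]
```

Four logged commands collapse into one because the rewrite only looks at the final state, not at the history that led to it.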

To keep rewriting from interfering with normal AOF writes, the rewrite mechanism adds the concept of an AOF rewrite buffer. That is, when AOF is enabled and a rewrite is in progress, Redis writes each command in protocol format to the AOF file and, in addition, to the AOF rewrite buffer. This isolates AOF writes from the rewrite and ensures that rewriting does not block writing.

c) AOF Rewrite Process

1. When the AOF rewrite finishes, a completion signal is sent to the main process.
2. All data in the AOF rewrite buffer is written to the new file.
3. The new AOF file overwrites the original AOF file.
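Steps 2 and 3 can be sketched in Python as follows. The function `finish_rewrite` and its parameters are hypothetical names for this illustration; the rewrite buffer is modeled as a list of byte strings collected while the rewrite was running, and `os.replace` provides the atomic swap of the two files.

```python
import os


def finish_rewrite(aof_path, rewritten_path, rewrite_buffer):
    """Complete a rewrite once the compact file has been produced.

    aof_path       -- the AOF file currently in use
    rewritten_path -- the new, compact file produced by the rewrite
    rewrite_buffer -- commands (as bytes) that arrived during the rewrite
    """
    # Step 2: append everything that arrived during the rewrite,
    # so the new file is not missing recent writes.
    with open(rewritten_path, "ab") as f:
        for chunk in rewrite_buffer:
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())
    rewrite_buffer.clear()
    # Step 3: atomically replace the old AOF with the new one.
    os.replace(rewritten_path, aof_path)
```

Because the swap is a rename, a reader never observes a half-written AOF file: it sees either the old file or the complete new one.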

d) AOF Disadvantages

1. An AOF file is usually larger than the RDB file for the same dataset.
2. How AOF performance compares with RDB performance depends mainly on the fsync policy chosen for AOF.

The following figure partially illustrates server-side persistence when a client sends a request to the Redis server.

IV. Implementation of the Redis Database
Redis is a key-value database, and its set of key-value pairs is called the key space. To implement this key-value storage, Redis uses two kinds of data structures:
1. Dictionary
The Redis dictionary is implemented with a hash table. I did not originally intend to go into the details of the Redis hash table, but while studying it I found that it provides a very good rehash scheme. The idea is excellent and can even be applied in other software; the scheme is called "progressive rehash".
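The idea of progressive rehash can be sketched in Python. This is a simplified illustration and not the Redis dict code: `ProgressiveDict` and its fields are hypothetical, growth is triggered when the entry count reaches the table size, and exactly one non-empty bucket is migrated per operation. The key point it shows is that the cost of resizing is spread across many small steps instead of one long pause.

```python
class ProgressiveDict:
    """Toy hash table that migrates one bucket per operation while resizing."""

    def __init__(self, size=4):
        self.tables = [self._new_table(size), None]   # old table, new table
        self.used = [0, 0]                            # entries in each table
        self.rehash_idx = -1                          # -1: no rehash in progress

    @staticmethod
    def _new_table(size):
        return [[] for _ in range(size)]

    def _rehashing(self):
        return self.rehash_idx >= 0

    def _step(self):
        """Move one non-empty bucket from the old table to the new one."""
        if not self._rehashing():
            return
        old, new = self.tables
        while self.rehash_idx < len(old) and not old[self.rehash_idx]:
            self.rehash_idx += 1                      # skip empty buckets
        if self.rehash_idx < len(old):
            bucket = old[self.rehash_idx]
            for entry in bucket:
                new[hash(entry[0]) % len(new)].append(entry)
            self.used[0] -= len(bucket)
            self.used[1] += len(bucket)
            old[self.rehash_idx] = []
            self.rehash_idx += 1
        if self.used[0] == 0:                         # migration finished
            self.tables = [new, None]
            self.used = [self.used[1], 0]
            self.rehash_idx = -1

    def _find(self, key):
        # During a rehash a key may live in either table, so check both.
        for table in self.tables:
            if table is None:
                continue
            for entry in table[hash(key) % len(table)]:
                if entry[0] == key:
                    return entry
        return None

    def get(self, key, default=None):
        self._step()
        entry = self._find(key)
        return default if entry is None else entry[1]

    def set(self, key, value):
        self._step()
        entry = self._find(key)
        if entry is not None:
            entry[1] = value
            return
        target = 1 if self._rehashing() else 0        # new keys go to the new table
        table = self.tables[target]
        table[hash(key) % len(table)].append([key, value])
        self.used[target] += 1
        if not self._rehashing() and self.used[0] >= len(self.tables[0]):
            self.tables[1] = self._new_table(len(self.tables[0]) * 2)
            self.rehash_idx = 0                       # start a progressive rehash

    def __len__(self):
        return sum(self.used)
```

Every `get` and `set` first performs one migration step, so no single operation ever has to move the whole table.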

Hash tables are all implemented in similar ways, so why does every open source project develop its own hash data structure?
Studying the hash implementation in the PHP kernel and then the one in Redis shows that the application scenario determines what must be customized to get better performance. (For the PHP hash implementation, see: "HashTable, the Powerful Tool in the PHP Kernel".)
A) PHP is mainly used in web scenarios, where the data of each request is isolated from that of other requests and the number of hash entries is limited, so a single rehash is fast.
The PHP kernel therefore uses a blocking rehash, which means that no other operation can be performed on the hash table while a rehash is in progress.

Dictionary initialization diagram:
Rehash execution flow:

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
