Redis has been used in several recent projects, so a solid understanding of its design and implementation is needed in order to apply it in the business scenarios it actually suits. My analysis starts from the entry point, main, and works step by step through the Redis startup process, using the initialization sequence to understand the function and principle of each Redis module.

II. Redis startup process
1. Initialize the server variable and set the Redis default values.
2. Read the configuration file and the parameters passed on the command line, which replace the default values in server.
3. Initialize the server's functional modules. This step sets up process signal handling, the client linked list, shared objects, the databases, the network connection, and so on.
4. Reload data from the RDB or AOF file.
5. Do the preparatory work before the network listening service starts.
6. Start the event loop and begin accepting client requests.
[Figure: flow chart of the startup process]
The following sections look at each module involved in the startup process in more detail. (At present, only the background threading system and the slow query log system are analyzed.)

III. Redis data persistence schemes
A question many people raise when using Redis is what happens when Redis goes down: is data lost? This section looks at the persistence schemes Redis provides and analyzes their pros and cons from the underlying principles, which in turn shows which scenarios Redis is suited to.

1. RDB persistence scheme
While Redis is running, the RDB routine saves a snapshot of the current in-memory database to disk; when Redis restarts, it restores the database by loading the RDB file. As this description shows, RDB consists of two main functions, save and load. The implementation is in src/rdb.c.
a) Save (rdbSave)
rdbSave saves the in-memory database to disk in RDB format, replacing the existing RDB file if there is one. The main process is blocked while the RDB file is being saved, and no new client requests are processed until the save completes. To avoid blocking the main process, Redis provides rdbSaveBackground, which calls rdbSave in a newly forked child process. The child signals the main process when the save completes, and in the meantime the main process continues to handle client requests.
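The fork-based save described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Redis's code: save_snapshot and save_snapshot_background are hypothetical stand-ins for rdbSave and rdbSaveBackground, JSON stands in for the RDB format, and the parent simply waits on the child rather than being notified asynchronously. It needs a POSIX system because it uses os.fork.

```python
import json
import os
import tempfile


def save_snapshot(db: dict, path: str) -> None:
    """Blocking save: write a temp file, then atomically replace the old snapshot."""
    tmp_path = f"{path}.tmp-{os.getpid()}"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(db, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # readers see either the old file or the new one


def save_snapshot_background(db: dict, path: str) -> int:
    """Fork a child that runs the blocking save; the parent returns immediately."""
    pid = os.fork()
    if pid == 0:
        # Child process: it holds a copy-on-write view of db as of the fork.
        status = 0
        try:
            save_snapshot(db, path)
        except Exception:
            status = 1
        os._exit(status)
    return pid  # parent: free to keep serving requests


def demo() -> dict:
    db = {"counter": 1}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "dump.json")
        pid = save_snapshot_background(db, path)
        db["counter"] = 2  # parent keeps mutating; the child's view is unaffected
        _, status = os.waitpid(pid, 0)
        assert os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
        with open(path, encoding="utf-8") as f:
            return json.load(f)
```

The snapshot holds the data as it was at the moment of the fork, which is why the parent's later write does not appear in the file.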
b) Load (rdbLoad)
When Redis starts, it decides, based on the configured persistence mode, whether to read the RDB file and load its objects into memory. While the RDB file is loading, Redis pauses after every 1000 keys to handle client requests that are waiting, but at this stage only the publish/subscribe commands (PUBLISH, SUBSCRIBE, PSUBSCRIBE, UNSUBSCRIBE, PUNSUBSCRIBE) are processed; every other command returns an error. This is possible because publish/subscribe never writes to the database, so it does not depend on data that has not been loaded yet.

Disadvantages of RDB: to discuss them, the concept of a save point has to be introduced first. The default redis.conf contains the following configuration:
#save <seconds> <changes>
save 900 1       # at least 1 key modified within 15 minutes
save 300 10      # at least 10 keys modified within 5 minutes
save 60 10000    # at least 10000 keys modified within 60 seconds
This means that a snapshot is saved whenever any one of these conditions is met. To keep disk I/O from becoming the Redis bottleneck, save points are usually configured with fairly large intervals. That leads to two problems:
1. If the save-point interval is too large, too much data can be lost in a crash; if it is too small, disk I/O becomes a bottleneck.
2. Saving a large dataset takes time, and Redis may be unable to process client requests for a short period while it does so.
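The save-point rule can be written as a small predicate. The sketch below is an illustration only: should_snapshot and DEFAULT_SAVE_POINTS are hypothetical names, and the values are assumed to be the classic redis.conf defaults (900/1, 300/10, 60/10000).

```python
from typing import Iterable, Tuple

SavePoint = Tuple[int, int]  # (seconds, changes)

# Assumed classic defaults from redis.conf; adjust to match your own config.
DEFAULT_SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(save_points: Iterable[SavePoint],
                    dirty: int,
                    elapsed_seconds: float) -> bool:
    """Return True if any save point is satisfied.

    dirty is the number of modifications since the last save;
    elapsed_seconds is the time since the last save.
    """
    return any(
        dirty >= changes and elapsed_seconds > seconds
        for seconds, changes in save_points
    )
```

For example, 9 changes after 400 seconds satisfy none of the three rules, while a tenth change would satisfy the second one.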
2. AOF persistence scheme
AOF records the state of the database by logging every write command to the AOF file as protocol text.
a) Save
1. Convert the command requested by the client into the network protocol format.
2. Append the protocol string to the variable server.aof_buf.
3. When the configured condition is met, aof_fsync(file descriptor) is called to write the data to disk.
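Steps 1 and 2 can be illustrated with a short Python sketch. It encodes a command as a Redis protocol array of bulk strings and appends it to a buffer; encode_command and the aof_buf bytearray are hypothetical names used only for this illustration.

```python
from typing import Sequence


def encode_command(args: Sequence[str]) -> bytes:
    """Encode a command as a protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]           # array header: number of arguments
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))  # length-prefixed string
    return b"".join(parts)


# Stand-in for server.aof_buf: commands accumulate here until they are flushed.
aof_buf = bytearray()
aof_buf += encode_command(["SET", "key", "value"])
```

Because every argument is length-prefixed, the log can be parsed back into the original command and arguments without ambiguity, even when values contain spaces or newlines.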
The condition mentioned in step 3 is the key to AOF performance. Redis currently supports three save policies:
1. AOF_FSYNC_NO: do not save
In this mode, each executed client command appends its protocol string to server.aof_buf, but the data is not explicitly saved to disk.
A save only happens when:
1. Redis is shut down normally
2. The AOF feature is turned off
3. The system write cache is full, or the periodic background save runs
In all three cases the save blocks the main process, so client requests cannot be served until it finishes.
2. AOF_FSYNC_EVERYSEC: save once per second
The save is performed by a background thread and does not block the main process. In the event of an outage, at most about 2 seconds of data is lost.
This is the default setting.
3. AOF_FSYNC_ALWAYS: save after every executed command
In this mode every client command is guaranteed to be saved, so no data is lost. The downside is a large performance penalty, because every save runs synchronously and blocks the main process.
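The difference between the three policies comes down to when fsync is called. The following Python sketch makes that visible by counting fsync calls; AofWriter is a hypothetical class, the policy names are borrowed from the appendfsync setting, and the every-second sync is done inline here, whereas Redis hands it to a background thread.

```python
import os
import tempfile
import time


class AofWriter:
    """Append-only writer with a configurable fsync policy (illustrative only)."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy          # "always", "everysec" or "no"
        self.clock = clock
        self.last_fsync = clock()
        self.fsync_calls = 0
        self.buf = bytearray()

    def feed(self, payload: bytes) -> None:
        self.buf += payload           # analogous to appending to server.aof_buf

    def flush(self) -> None:
        view = memoryview(bytes(self.buf))
        while view:                   # os.write may write fewer bytes than asked
            view = view[os.write(self.fd, view):]
        self.buf.clear()
        now = self.clock()
        due = self.policy == "everysec" and now - self.last_fsync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.fd)         # force the data out of the OS cache
            self.last_fsync = now
            self.fsync_calls += 1

    def close(self) -> None:
        os.close(self.fd)


def demo() -> dict:
    fake_time = [0.0]
    counts = {}
    with tempfile.TemporaryDirectory() as d:
        for policy in ("always", "everysec", "no"):
            fake_time[0] = 0.0
            w = AofWriter(os.path.join(d, policy + ".aof"), policy,
                          clock=lambda: fake_time[0])
            for _ in range(4):        # four commands, half a second apart
                w.feed(b"*1\r\n$4\r\nPING\r\n")
                fake_time[0] += 0.5
                w.flush()
            counts[policy] = w.fsync_calls
            w.close()
    return counts
```

With four commands spread over two seconds, "always" syncs four times, "everysec" twice, and "no" never, which is the durability versus throughput trade-off in miniature.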
b) Load
AOF stores data in the protocol format, so the full database state can be restored by turning the AOF contents back into commands and re-executing them through a fake client. The load process is:
1. Create a fake client.
2. Read the text stored in the AOF and restore it to the original command and arguments, then issue the command through the fake client.
3. Repeat step 2 until the whole AOF file has been read.

Because AOF has to save every command to disk, the file grows larger and larger over time, and loading it becomes slow. Redis provides an AOF rewrite mechanism to reduce the file size. The idea is as follows:
RPUSH list 1 2 3 4 5
LPOP list
LPOP list
LPUSH list 1
Initially all four commands are saved to the AOF file. After an AOF rewrite, they become a single command:
RPUSH list 1 3 4 5
The rewrite also introduces an AOF rewrite buffer, so that rewriting does not interfere with normal AOF writes. While a rewrite is in progress, Redis writes each command both to the AOF file and to the AOF rewrite buffer. This keeps AOF writing and rewriting isolated and ensures that the rewrite does not block writes.
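The rewrite idea can be checked with a small simulation. The sketch below replays a command log into an in-memory list and then emits one RPUSH per key that rebuilds the same state; it uses RPUSH for the initial push so that element order is unambiguous. replay and rewrite are hypothetical helpers that support only the four list commands shown, and the replay step exists only to build state for the demo.

```python
from collections import deque
from typing import Dict, List


def replay(commands: List[List[str]]) -> Dict[str, deque]:
    """Apply a log of list commands to an empty in-memory database."""
    db: Dict[str, deque] = {}
    for name, key, *args in commands:
        items = db.setdefault(key, deque())
        op = name.upper()
        if op == "RPUSH":
            items.extend(args)            # append each value at the tail
        elif op == "LPUSH":
            for value in args:
                items.appendleft(value)   # push each value at the head, in order
        elif op == "LPOP":
            if items:
                items.popleft()
        elif op == "RPOP":
            if items:
                items.pop()
        else:
            raise ValueError(f"unsupported command: {name}")
    return db


def rewrite(db: Dict[str, deque]) -> List[List[str]]:
    """Emit the shortest log for this state: one RPUSH per non-empty list."""
    return [["RPUSH", key, *values] for key, values in db.items() if values]


log = [
    ["RPUSH", "list", "1", "2", "3", "4", "5"],
    ["LPOP", "list"],
    ["LPOP", "list"],
    ["LPUSH", "list", "1"],
]
compact = rewrite(replay(log))
```

Replaying the compact log yields the same list as replaying the original four commands, which is the property a rewrite has to preserve.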
There are three ways to flush the AOF file, selected by the appendfsync configuration parameter:
appendfsync always: call fsync after every write command to flush it to the AOF file; very slow, but very safe.
appendfsync everysec: call fsync once per second; fast, but up to about a second of data may be lost.
appendfsync no: leave flushing to the OS; Redis never flushes the AOF itself. This is the fastest option but the least safe.
The default and recommended setting is everysec, which balances speed and safety.
The AOF file may become corrupted for system-level reasons, after which Redis can no longer load it. It can be repaired as follows:
1. Make a backup copy of the AOF file somewhere else.
2. Repair the original AOF file by running: $ redis-check-aof --fix <filename>
3. Optionally, use diff -u to see where the file differs before and after the repair.
4. Restart the Redis service.
c) AOF rewrite process
1. When the AOF rewrite finishes, a completion signal is sent to the main process.
2. All data in the AOF rewrite buffer is written to the new file.
3. The new AOF file replaces the original AOF file.
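Steps 2 and 3 can be sketched as follows. finish_rewrite is a hypothetical helper, and the file names in the demo are made up for the illustration; the point is that buffered commands are appended to the new file before it atomically takes the place of the old one.

```python
import os
import tempfile
from typing import List


def finish_rewrite(new_path: str, live_path: str, rewrite_buf: List[bytes]) -> None:
    """Append commands buffered during the rewrite, then swap the files."""
    with open(new_path, "ab") as f:
        for chunk in rewrite_buf:      # commands that arrived while rewriting
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())           # make the new file durable before the swap
    rewrite_buf.clear()
    os.replace(new_path, live_path)    # atomic rename over the old AOF


def demo():
    with tempfile.TemporaryDirectory() as d:
        live = os.path.join(d, "appendonly.aof")
        new = os.path.join(d, "temp-rewrite.aof")
        with open(live, "wb") as f:
            f.write(b"old\n" * 4)      # the long, original log
        with open(new, "wb") as f:
            f.write(b"compact\n")      # output of the rewrite
        buf = [b"cmd-during-rewrite\n"]
        finish_rewrite(new, live, buf)
        with open(live, "rb") as f:
            content = f.read()
        return content, os.path.exists(new), buf
```

Doing the swap with a rename means a crash at any point leaves either the complete old file or the complete new file on disk, never a half-written mixture.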
d) AOF disadvantages
1. For the same dataset, the AOF file is usually larger than the RDB file.
2. How AOF performance compares with RDB depends primarily on the fsync policy selected for AOF.
[Figure: partial diagram of server-side persistence when a client sends a request to the Redis server]
IV. Implementation of the Redis database
Redis is a key-value database, and the database itself is called the key space. To implement this key-value storage, Redis uses two types of data structures:
1. Dictionary
The Redis dictionary is implemented with a hash table. I did not originally intend to go into the details of that hash table, but the Redis implementation offers a very good rehash scheme whose idea could be reused in other applications. The scheme is called "progressive rehash".
Hash table implementations are all broadly similar, so why does so much open source software develop its own hash data structure?
Having studied the hash implementation in the PHP kernel and now the one in Redis, I find that the application scenario determines what customization is needed for better performance. (For the PHP hash implementation, see: Hashtable of the artifact in the PHP kernel)
a) PHP is mainly used in web scenarios, where the data of each request is isolated from other requests and the number of hash entries is limited, so a single rehash is fast. The PHP kernel therefore uses a blocking rehash: nothing else can be done with the hash table while it is being rehashed.
b) Redis, in contrast, is a long-running process that handles client requests; the data it operates on is shared across requests and the volume is large. If it used the PHP kernel's approach, all client requests would be blocked while a hash table is being rehashed, and concurrency would suffer badly.
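To make the contrast concrete, here is a minimal Python sketch of the progressive rehash idea: the dictionary keeps two tables and moves one bucket from the old table to the new one on each operation, so no single request pays for the whole rehash. IncrementalDict and its methods are hypothetical and simplified (no deletion, no shrinking, growth triggered at a load factor of 1); this is not the Redis implementation.

```python
class IncrementalDict:
    """Chained hash table that migrates one bucket per operation while resizing."""

    def __init__(self, size=4):
        self.tables = [[[] for _ in range(size)], None]  # [old, new]
        self.used = [0, 0]
        self.rehash_idx = -1          # -1 means no rehash is in progress

    def __len__(self):
        return self.used[0] + self.used[1]

    def _rehashing(self):
        return self.rehash_idx != -1

    def _step(self):
        """Move a single non-empty bucket from the old table to the new one."""
        if not self._rehashing():
            return
        old, new = self.tables
        while self.rehash_idx < len(old) and not old[self.rehash_idx]:
            self.rehash_idx += 1      # skip empty buckets
        if self.rehash_idx < len(old):
            for key, value in old[self.rehash_idx]:
                new[hash(key) % len(new)].append((key, value))
                self.used[0] -= 1
                self.used[1] += 1
            old[self.rehash_idx] = []
            self.rehash_idx += 1
        if self.used[0] == 0:         # old table drained: promote the new one
            self.tables = [new, None]
            self.used = [self.used[1], 0]
            self.rehash_idx = -1

    def _find(self, key):
        for table in self.tables:     # during a rehash, both tables are searched
            if table is None:
                continue
            bucket = table[hash(key) % len(table)]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    return bucket, i
        return None

    def set(self, key, value):
        self._step()
        hit = self._find(key)
        if hit:
            bucket, i = hit
            bucket[i] = (key, value)
            return
        if not self._rehashing() and self.used[0] >= len(self.tables[0]):
            self.tables[1] = [[] for _ in range(len(self.tables[0]) * 2)]
            self.rehash_idx = 0       # start a rehash; no entries are moved yet
        t = 1 if self._rehashing() else 0  # new keys go to the new table
        self.tables[t][hash(key) % len(self.tables[t])].append((key, value))
        self.used[t] += 1

    def get(self, key, default=None):
        self._step()
        hit = self._find(key)
        return hit[0][hit[1]][1] if hit else default
```

New keys go straight into the new table during a rehash, so the old table only ever shrinks and the migration is guaranteed to finish.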
[Figure: initializing a dictionary]
[Figure: adding a dictionary element]
[Figure: rehash execution flow]
Note: I may have misunderstood some parts; please point out any mistakes.
Redis Architecture Design