Redis learning Summary

Source: Internet
Author: User
Tags compact epoll rehash

Redis uses a single-threaded, event-driven model built on its own event-handling library, ae (aeEventLoop).

1. Startup Process

The Redis startup process is roughly as follows:

1. Initialize the global server struct, assign each member a default value, and create the command table used to find the handler function for each command: initServerConfig() ---> populateCommandTable()
2. If a configuration file is specified, read redis.conf and assign its values to the server struct: loadServerConfig()
3. Initialize the remaining data structures, create the dict for each database, then bind and listen to start the network service: initServer()
This step also creates the aeEventLoop structure. aeCreateTimeEvent adds a timer event (serverCron) that first fires after 1 ms, and aeCreateFileEvent adds an I/O event.
That I/O event watches for readability on the server's listening socket; its handler is acceptTcpHandler.
4. Load the database from the AOF file or the RDB file. If AOF is enabled, the AOF file is loaded and the RDB file is ignored; otherwise Redis loads the RDB file if it exists. In other words, AOF takes precedence.
5. Call aeMain to monitor the network sockets and respond to events.
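
The command table created in step 1 maps a command name to the function that handles it. The following is a minimal sketch of that idea, not the Redis implementation: Client, Command, command_table, lookup_command and the two handlers are hypothetical names invented for illustration, and the linear scan is a simplification of whatever lookup structure the server really uses.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical client: only holds the text of the last reply. */
typedef struct Client {
    const char *reply;
} Client;

/* A command handler receives the client that issued the command. */
typedef void CommandProc(Client *c);

/* Hypothetical table entry: command name plus its handler. */
typedef struct Command {
    const char *name;
    CommandProc *proc;
} Command;

void get_command(Client *c) { c->reply = "GET handler ran"; }
void set_command(Client *c) { c->reply = "SET handler ran"; }

/* The table is filled once at startup and only read afterwards. */
static Command command_table[] = {
    {"get", get_command},
    {"set", set_command},
};

/* Case-insensitive comparison of two NUL-terminated names. */
static int name_equals(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Return the entry for `name`, or NULL if the command is unknown. */
Command *lookup_command(const char *name)
{
    size_t n = sizeof(command_table) / sizeof(command_table[0]);
    for (size_t i = 0; i < n; i++) {
        if (name_equals(name, command_table[i].name))
            return &command_table[i];
    }
    return NULL;
}
```

A caller looks the command up by its first argument and then invokes the returned proc with the client, which is the same two-step shape the query process in section 2.2 follows.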

1.1 Time events

serverCron first runs 1 millisecond after startup. After each run it is rescheduled as a time event, so from then on it executes every 100 milliseconds.
Handler: serverCron. The timer event processing flow is as follows (adapted from http://www.w3ccollege.org/redis/redis-internal/understanding-redis-internal-the-main-structure-and-start-the-process.html):
1. Log some information about each non-empty DB.
2. If no background process is dumping data to a file, the hash tables can be resized according to their usage.
3. Log some information about client connections.
4. Check for idle client connections and close them.
5. Check whether a background process is dumping data to a file. If there is one, check whether it has finished; if there is none, decide whether a new dump process should be started.
6. Process expired keys.
7. vmSwapOut: if the VM option is enabled, check whether the VM memory limit is exceeded and, if so, swap values out.
8. If the server is a slave, connect to the master, send the SYNC command, and synchronize data with the master.

1.2 File events

There are two kinds of file event: one on the listening FD and one on each FD returned by accept. The first FD always exists, and its handler is acceptTcpHandler. This handler calls acceptCommonHandler, which calls createClient for the new connection and registers a file event (aeCreateFileEvent) on the FD returned by accept. The handler for this second kind of event is readQueryFromClient (I think the name would be clearer with a Handler suffix). Once both kinds of FD event are registered, their handlers are acceptTcpHandler and readQueryFromClient.
Note that the service has not actually started at this point. The underlying call so far is epoll_ctl, which only adds (registers) events. The real waiting happens in aeMain; once the server enters aeMain, it is truly serving.
Each iteration first traverses the aeTimeEvent linked list to find the nearest time event. If that event's due time is already in the past (it is overdue but not yet handled), aeApiPoll calls epoll_wait with a timeout of 0, so it returns immediately when no file event is ready. Otherwise the epoll_wait timeout is set to the nearest due time minus the current time. If no file event arrives within that period, epoll_wait returns anyway because the time event is due.
If file events do arrive in that period, their FDs and event masks are saved to the aeFiredEvent array (the epoll backend uses aeEventLoop->apidata (aeApiState)->events[AE_SETSIZE] to hold the ready event structures). After aeApiPoll returns, a for loop processes every ready event by calling its handler: aeFileEvent->rfileProc, such as acceptTcpHandler or readQueryFromClient, and aeFileEvent->wfileProc, such as sendReplyToClient. Finally, the loop checks whether any time events are due and processes them. (See Figure 1 and Figure 2 for the whole process.)
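
The timeout rule described above can be sketched in a few lines of C. This is an illustration under stated assumptions, not the ae library code: TimeEvent and poll_timeout_ms are hypothetical names, times are plain millisecond counters, and returning -1 when there is no timer at all (which makes epoll_wait block indefinitely) is an assumption the text does not cover.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified time event: only the due time matters here. */
typedef struct TimeEvent {
    int64_t when_ms;          /* absolute due time in milliseconds */
    struct TimeEvent *next;   /* singly linked list, as in the text */
} TimeEvent;

/*
 * Compute the timeout to pass to epoll_wait:
 *   0   if the nearest timer is already due (poll without waiting),
 *   N   the milliseconds left until the nearest timer otherwise,
 *   -1  if there are no timers (assumption: block until a file event).
 */
int poll_timeout_ms(const TimeEvent *head, int64_t now_ms)
{
    const TimeEvent *nearest = NULL;

    /* Walk the whole list; it is unsorted, so every node is checked. */
    for (const TimeEvent *te = head; te != NULL; te = te->next) {
        if (nearest == NULL || te->when_ms < nearest->when_ms)
            nearest = te;
    }

    if (nearest == NULL)
        return -1;
    if (nearest->when_ms <= now_ms)
        return 0;

    int64_t remaining = nearest->when_ms - now_ms;
    return remaining > INT_MAX ? INT_MAX : (int)remaining;
}
```

With timers due at 1200 ms and 1500 ms, a call at now_ms = 1000 yields a wait of 200 ms, while a call at 1300 yields 0 because the nearest timer is overdue.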

The above is the main processing flow of Redis. The figure below clearly describes the Redis main framework and event handling process (source: http://www.w3ccollege.org/redis/redis-internal/redis-redis-event-library-source-code-analysis.html).

Figure 1 redis main framework and event Process

2. Redis Main Types and Query Process

2.1 Type relationship diagram

Figure 2 Main types of Redis

From this figure we can see three major components of Redis:
The first part is the in-memory data structures (redisDb, dict, dictht, dictEntry, redisObject) used to query and store data, together with the storage types (string, list, set, zset, hash), which determine whether the corresponding compact encoding is used (ziplist for list, intset for set, zipmap for hash).
The second part is the event structures: aeEventLoop, aeFileEvent [...], aeTimeEvent [singly linked list], and aeFiredEvent [...] (which stores the ready file events).
The third part is redisClient [doubly linked list], which stores client information: the commands sent by the user, the redisObject arguments built from them, and the command handler function for each command.

 

2.2 Query process

From this figure we can roughly follow how Redis handles queries and updates. Client requests are watched through file events. When an FD becomes readable, readQueryFromClient is called and parses the input with processInputBuffer. Depending on the command protocol format used by the client (telnet or redis-cli), this function calls processInlineBuffer or processMultibulkBuffer respectively to split the data stored in redisClient->querybuf into individual arguments, builds a redisObject for each, and saves them in redisClient->argv[c->argc].
processCommand is then called to execute the command sent by the client. The first argument names the operation the client wants, so lookupCommand looks the command up (in the hash table built from readonlyCommandTable) to get the matching redisCommand and its handler (a redisCommandProc function pointer), and call(c) invokes that handler to perform the actual operation (add, delete, query).
Take the GET command as an example. The table lookup returns the getCommand function, which calls lookupKeyReadOrReply to perform the query and reply. This in turn calls lookupKeyRead(c->db, key), which at the bottom calls dictFind(db, key):

    dictEntry *dictFind(dict *d, const void *key)
    {
        dictEntry *he;
        unsigned int h, idx, table;

        if (d->ht[0].size == 0) return NULL; /* we don't have a table at all */
        if (dictIsRehashing(d)) _dictRehashStep(d); /* advance an in-progress rehash by one step */
        h = dictHashKey(d, key); /* the real hash calculation; the hash function comes from the dict's dictType */
        for (table = 0; table <= 1; table++) {
            idx = h & d->ht[table].sizemask;
            he = d->ht[table].table[idx]; /* head dictEntry* of the chain for this bucket */
            while (he) {
                if (dictCompareHashKeys(d, key, he->key)) /* compare the keys themselves */
                    return he;
                he = he->next;
            }
            if (!dictIsRehashing(d)) return NULL; /* ht[1] is only in use during a rehash */
        }
        return NULL;
    }

After the result is found, addReply(c, robj) is called to start the write-back. It first calls _installWriteEvent to register a writable event with epoll, with sendReplyToClient as the handler. It then checks the encoding of robj, decodes it with getDecodedObject, and copies the decoded content into redisClient->buf (_addReplyToBuffer). Finally, when the writable event fires, sendReplyToClient performs the actual write(fd). (Events are only waited for in aeMain, the outermost loop, so the content has already been copied into c->buf by the time the write event is handled.)
[Note that dict holds dictht[2], that is, two hash tables. This is how rehash is implemented: when the chains in ht[0] grow too long, a rehash is performed with ht[1].size = ht[0].size * 2.] Only the simple SDS query path is covered here; SDS is essentially a dynamic string (byte array) type. The following link describes the Redis in-memory storage structures in detail:
http://www.w3ccollege.org/redis/redis-internal/redis-memory-storage-structure-analysis-2.html
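
The two-table rehash mentioned in the note can be illustrated with a toy dictionary. This is a sketch under several assumptions, not the Redis dict: all names (Entry, Table, Dict, dict_add, dict_find, rehash_step) are hypothetical, keys are unsigned integers used directly as their own hash, the caller never adds the same key twice, growth starts when the entry count reaches the table size, and allocation failures are not handled. What it does share with the excerpt above is the shape of the lookup: advance the rehash by one step, search ht[0], and search ht[1] only while a rehash is in progress.

```c
#include <stdlib.h>

typedef struct Entry {
    unsigned key;
    int val;
    struct Entry *next;              /* chain for colliding keys */
} Entry;

typedef struct Table {
    Entry **buckets;
    unsigned size, sizemask, used;   /* size is always a power of two */
} Table;

typedef struct Dict {
    Table ht[2];                     /* ht[1] is only used while rehashing */
    long rehashidx;                  /* next ht[0] bucket to move, -1 if idle */
} Dict;

static void table_init(Table *t, unsigned size)
{
    t->buckets = calloc(size, sizeof(Entry *));
    t->size = size;
    t->sizemask = size - 1;
    t->used = 0;
}

Dict *dict_create(void)
{
    Dict *d = malloc(sizeof(*d));
    table_init(&d->ht[0], 4);
    d->ht[1] = (Table){0};
    d->rehashidx = -1;
    return d;
}

/* Move one non-empty bucket from ht[0] to ht[1]; swap tables when done. */
static void rehash_step(Dict *d)
{
    if (d->rehashidx < 0) return;
    while (d->ht[0].used > 0 && d->ht[0].buckets[d->rehashidx] == NULL)
        d->rehashidx++;              /* skip buckets that are already empty */
    if (d->ht[0].used > 0) {
        Entry *e = d->ht[0].buckets[d->rehashidx];
        while (e) {
            Entry *next = e->next;
            unsigned idx = e->key & d->ht[1].sizemask;
            e->next = d->ht[1].buckets[idx];
            d->ht[1].buckets[idx] = e;
            d->ht[0].used--;
            d->ht[1].used++;
            e = next;
        }
        d->ht[0].buckets[d->rehashidx++] = NULL;
    }
    if (d->ht[0].used == 0) {        /* everything moved: ht[1] becomes ht[0] */
        free(d->ht[0].buckets);
        d->ht[0] = d->ht[1];
        d->ht[1] = (Table){0};
        d->rehashidx = -1;
    }
}

Entry *dict_find(Dict *d, unsigned key)
{
    if (d->rehashidx >= 0) rehash_step(d);
    for (int t = 0; t <= 1; t++) {
        Entry *e = d->ht[t].buckets[key & d->ht[t].sizemask];
        for (; e != NULL; e = e->next)
            if (e->key == key) return e;
        if (d->rehashidx < 0) return NULL;   /* no rehash: ht[1] is unused */
    }
    return NULL;
}

void dict_add(Dict *d, unsigned key, int val)
{
    if (d->rehashidx >= 0) {
        rehash_step(d);
    } else if (d->ht[0].used >= d->ht[0].size) {
        table_init(&d->ht[1], d->ht[0].size * 2);   /* start a rehash */
        d->rehashidx = 0;
    }
    /* While rehashing, new entries go straight into the new table. */
    Table *t = d->rehashidx >= 0 ? &d->ht[1] : &d->ht[0];
    Entry *e = malloc(sizeof(*e));
    unsigned idx = key & t->sizemask;
    e->key = key;
    e->val = val;
    e->next = t->buckets[idx];
    t->buckets[idx] = e;
    t->used++;
}
```

Spreading the move over many calls means no single add or find has to pay for relocating the whole table, which matters when one thread serves every client.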

 

3. Persistence:

This chapter is reposted from

[Callback]

Redis has two persistence methods: snapshot and AOF.

3.1 Snapshot

When snapshots are taken is set by the save directives in the configuration file. For example, save 900 1 takes a snapshot if at least 1 key has changed within 900 seconds. A client can also trigger a snapshot by sending the SAVE or BGSAVE command. SAVE writes the snapshot in the main thread. Because Redis uses one main thread to process all client requests, this blocks every client, so it is not recommended. BGSAVE runs the dump in the background (it creates a child process to do the dump). The process is as follows:
1. When a dump condition is met (or the BGSAVE command is received), Redis calls the system fork to create a child process.
2. The parent process continues to process client requests. The child process writes the memory content to a temporary file.
3. Thanks to Linux's copy-on-write mechanism, the child's page table initially points to the same physical memory pages as the parent's, so the two processes share the same memory. When the parent handles a write request, the OS creates a copy of the page being modified for the parent instead of writing to the shared page. The data in the child's address space is therefore a snapshot of the entire memory at the moment of the fork. After the child finishes writing the snapshot to the temporary file, it replaces the original snapshot file with the temporary file and exits.
As this shows, every snapshot writes the entire memory content to the file, not just the modified content.
Main function: static int rdbSave(char *filename)
Note: SHUTDOWN and FLUSHALL also perform a dump.
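
The property BGSAVE relies on, that the child keeps seeing memory as it was at the moment of the fork, can be observed with a small POSIX program. This is a sketch for illustration only: snapshot_value_seen_by_child is a hypothetical function, a single int stands in for the dataset, and two pipes are used purely to make the child read its copy after the parent has written. It shows the visible effect of fork, not the page-level copy-on-write mechanics.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, change the "dataset" in the parent, then ask the child what it sees.
 * Returns the child's view of the dataset, or -1 on error.
 * *parent_value_out receives the parent's own (modified) value.
 */
int snapshot_value_seen_by_child(int *parent_value_out)
{
    int to_child[2], to_parent[2];
    int dataset = 1;                        /* state at fork time */

    if (pipe(to_child) == -1) return -1;
    if (pipe(to_parent) == -1) {
        close(to_child[0]); close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(to_child[0]); close(to_child[1]);
        close(to_parent[0]); close(to_parent[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: wait for the go-ahead, then report its copy of dataset. */
        char go;
        close(to_child[1]); close(to_parent[0]);
        if (read(to_child[0], &go, 1) != 1) _exit(1);
        if (write(to_parent[1], &dataset, sizeof dataset) != (ssize_t)sizeof dataset)
            _exit(1);
        _exit(0);
    }

    /* Parent: keep "serving writes" after the fork. */
    close(to_child[0]); close(to_parent[1]);
    dataset = 2;

    int seen = -1;
    int ok = write(to_child[1], "g", 1) == 1 &&
             read(to_parent[0], &seen, sizeof seen) == (ssize_t)sizeof seen;

    close(to_child[1]); close(to_parent[0]);
    waitpid(pid, NULL, 0);

    *parent_value_out = dataset;
    return ok ? seen : -1;
}
```

The child reports the value from fork time even though the parent changed its own copy before the child looked, which is why the dump file is consistent while clients keep writing.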

3.2 AOF (append-only file)

Each command that modifies the database is appended to a file with the write function. The default file name is appendonly.aof. The granularity is therefore as fine as possible, much like a log. It can be thought of as the same idea as the MySQL binlog: it records every write operation so that the database state can be restored after a power failure or other problem. Unlike a binary log, though, it is plain text written command by command, so we can also use it to restore data by hand. When Redis restarts, it re-executes the commands stored in the AOF file to rebuild the whole database in memory. (Note: the AOF file is loaded first at startup, and only one of it and dump.rdb is used. So if you create an empty .aof file before starting, the whole Redis instance will be empty even if dump.rdb holds data.) Data can still be lost in any case, because the whole process is asynchronous. To reduce the risk, we can set in the configuration file how often fsync() is called, which controls when the log actually reaches the disk.
This method has its own drawback: every write operation is persisted, so the AOF file keeps growing and loading it at restart takes a long time. In that case, run the BGREWRITEAOF [background rewrite append-only file] command to rebuild the file in the background.
On receiving this command, Redis saves the in-memory data to a temporary file in a way similar to a snapshot and finally replaces the original file. For example, if only 100 records remain after 10 million write operations (many of them touching the same record), the 10 million logged operations become 100 records. All earlier operations are effectively merged, and the AOF shrinks greatly. [This amounts to recording a single write per key, whose value is the key's current value.]
Note that the rewrite does not read the old AOF file. Instead, it writes the entire in-memory database content to a new AOF file as commands, which is similar to a snapshot (fork a child process, copy-on-write). The process is as follows:
1. Redis calls fork, so there are now two processes: parent and child.
2. The child writes commands that rebuild the database to a temporary file, based on the in-memory snapshot.
3. The parent continues to process client requests. Besides appending write commands to the original AOF file, it also caches the write commands it receives. This ensures nothing goes wrong if the child's rewrite fails.
4. When the child has written the snapshot content to the temporary file as commands, it signals the parent. The parent then writes the cached write commands to the temporary file.
5. The parent can now replace the old AOF file with the temporary file by renaming it. Subsequent write commands are appended to the new AOF file.
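
Why a rewrite shrinks the log can be shown with a toy model: replaying many writes leaves only the final value per key in memory, and the rewrite emits one command per key from that state without looking at the old log. The sketch below is an assumption-laden illustration, not the AOF code. Dataset, dataset_set and rewrite_log are hypothetical names, only SET is modelled, the store is a tiny fixed-size array, and the one-line "SET key value" output is a simplification of the real file format.

```c
#include <stdio.h>
#include <string.h>

#define MAX_KEYS 16

typedef struct {
    char key[32];
    char val[32];
} Pair;

/* Hypothetical in-memory dataset: a small array of key/value pairs. */
typedef struct {
    Pair pairs[MAX_KEYS];
    int count;
} Dataset;

/* Apply "SET key val". Returns 0 on success, -1 if it does not fit. */
int dataset_set(Dataset *ds, const char *key, const char *val)
{
    if (strlen(key) >= sizeof ds->pairs[0].key ||
        strlen(val) >= sizeof ds->pairs[0].val)
        return -1;
    for (int i = 0; i < ds->count; i++) {
        if (strcmp(ds->pairs[i].key, key) == 0) {
            strcpy(ds->pairs[i].val, val);   /* overwrite: old value is gone */
            return 0;
        }
    }
    if (ds->count == MAX_KEYS) return -1;
    strcpy(ds->pairs[ds->count].key, key);
    strcpy(ds->pairs[ds->count].val, val);
    ds->count++;
    return 0;
}

/*
 * Rewrite: emit one SET per key from the current in-memory state.
 * Returns the number of commands written, or -1 if `out` is too small.
 */
int rewrite_log(const Dataset *ds, char *out, size_t outsize)
{
    size_t used = 0;
    if (outsize == 0) return -1;
    out[0] = '\0';
    for (int i = 0; i < ds->count; i++) {
        int n = snprintf(out + used, outsize - used, "SET %s %s\n",
                         ds->pairs[i].key, ds->pairs[i].val);
        if (n < 0 || (size_t)n >= outsize - used) return -1;
        used += (size_t)n;
    }
    return ds->count;
}
```

After a thousand SETs on one key and one SET on another, the original log would hold 1001 commands while the rewritten one holds two.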

 

4. Transactions

Main reference for this chapter:
[http://www.w3ccollege.org/redis/redis-notes/redis-study-notes-of-the-matters.html]
Redis transactions are simple and do not offer all the ACID properties of traditional transactions. One of the main reasons is that commands in a Redis transaction are not executed immediately; they are queued until the EXEC command is issued and then executed together. Redis only guarantees that the commands in one client's transaction run consecutively, with no commands from other clients interleaved. Rollback is not supported either: some commands in a transaction may succeed while others fail. A failed command returns the same kind of error it would return outside a transaction context.
Transactions are implemented with the MULTI and EXEC commands. When a client issues MULTI on a connection, the connection enters a transaction context. Subsequent commands on that connection are not executed immediately but placed in a queue. Only when the connection receives EXEC does Redis execute all queued commands in order, package their results, and return them to the client. The connection then leaves the transaction context. [The WATCH command was added in Redis 2.1 to implement optimistic locking for shared resources. If a key is changed after being watched, the transaction on that connection fails.]
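
The queue-then-execute behaviour, including the lack of rollback, can be modelled in a few lines. The following sketch is only an illustration of the idea: Client, Op, multi, submit and exec_all are hypothetical names, the "database" is a single counter, and a command is just a function that returns 0 on success or -1 on failure.

```c
#include <stddef.h>

#define MAX_QUEUED 8

/* A command operates on the shared counter; 0 = success, -1 = failure. */
typedef int Op(int *counter);

/* Hypothetical per-connection state. */
typedef struct {
    int in_multi;               /* non-zero while in a transaction context */
    Op *queue[MAX_QUEUED];
    int queued;
} Client;

int op_incr(int *counter) { (*counter)++; return 0; }
int op_fail(int *counter) { (void)counter; return -1; }

/* MULTI: enter the transaction context with an empty queue. */
void multi(Client *c)
{
    c->in_multi = 1;
    c->queued = 0;
}

/*
 * Submit a command. Outside a transaction it runs at once (returns 0);
 * inside one it is only queued (returns 1), or rejected if the queue
 * is full (returns -1).
 */
int submit(Client *c, Op *op, int *counter)
{
    if (!c->in_multi) {
        op(counter);
        return 0;
    }
    if (c->queued == MAX_QUEUED) return -1;
    c->queue[c->queued++] = op;
    return 1;
}

/*
 * EXEC: run every queued command in order and record each result.
 * A failing command does not stop the rest and nothing is rolled back.
 * Returns the number of commands executed.
 */
int exec_all(Client *c, int *counter, int *results)
{
    int n = c->queued;
    for (int i = 0; i < n; i++)
        results[i] = c->queue[i](counter);
    c->in_multi = 0;
    c->queued = 0;
    return n;
}
```

Queueing an increment, a failing command and another increment leaves the counter untouched until exec_all runs; afterwards it has advanced by two and the results array records the one failure in the middle.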

 

5. Master-slave Replication

This chapter is reposted from
[Callback]
After the slave server is configured, the slave connects to the master and sends the SYNC command. Whether this is the first synchronization or a reconnection after a dropped connection, the master starts a background process to save a database snapshot to a file, and at the same time begins collecting new write commands and caching them. When the background process finishes writing the file, the master sends the file to the slave. The slave saves it to disk and then loads it into memory to restore the snapshot. The master then forwards the cached commands to the slave, and from then on every write command the master receives is sent to the slave over the same connection. The commands that synchronize data from master to slave use the same protocol format as commands sent by clients. When the connection between master and slave drops, the slave automatically reconnects. If the master receives concurrent SYNC commands from several slaves, it starts only one process to write the database image and sends it to all of them.
Configuring a slave server is easy. Just add the following line to the configuration file:
slaveof 192.168.1.1 6379 # specify the master's IP address and port

 

Remarks

Much of the above is summarized from material on the Internet. I have forgotten where some of it came from, so not everything is attributed. After reading those articles, I read the server startup code (Chapter 1) and the event library and query execution code (Chapter 2). For lack of time I did not go through the code for the other chapters, which are reproduced directly. I wrote this article mainly as a quick reference for later. It is also a summary of the past few days of study and, finally, a concrete result. Thanks to the Internet and to everyone who shares their knowledge.
[Note: the Redis code base is relatively small and its architecture is fairly clear, so the code is easy to read. I started with redis-benchmark.c because it is short yet includes the Redis event-driven mechanism, which helps when reading redis-server afterwards. The three reference links below cover nearly every Redis topic in clear detail and are strongly recommended for anyone interested in Redis.]

 

References

http://www.w3ccollege.org/category/redis

http://www.hoterran.info/

http://www.petermao.com/

 
