Redis interview questions
1. The difference between Redis and memcached
Redis executes commands on a single thread, while memcached is multi-threaded.
Redis allocates memory on demand as data is stored, and older versions could be configured to use virtual memory; memcached uses a pre-allocated memory pool.
Redis supports persistence and master-slave synchronization, so it recovers from failures better. memcached keeps data only in memory, so the data is lost when the server crashes or shuts down.
Redis supports five data types: string, list, hash, set, and zset (sorted set). memcached stores only simple key-value pairs.
Benefits of Redis: high-concurrency reads and writes, efficient storage of and access to large amounts of data, and scalability and high availability.
Redis application scenarios: fetching the latest N items, leaderboards (top N), data that needs an expiration time, counters, deduplicating all the values collected over a period of time, real-time messaging systems, queue systems, and caching.
2. Five data structures in Redis
A redisObject contains type, encoding, refcount, lru, and *ptr.
Data Structure | Encoding Type | Common Commands
String | int, embstr (39), raw | SET/SETNX name value
List | ziplist (64, 512), linkedlist | LPUSH, RPOP, LLEN
Hash | ziplist (64, 512), hashtable | HSET, HLEN, HGET
Set | intset (512), hashtable | SADD, SCARD, SMEMBERS, SREM
Zset | ziplist (64, 128), skiplist | ZADD, ZCARD, ZRANGEBYSCORE
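To make the String row concrete, here is a minimal sketch of how an encoding could be chosen from the value alone. It assumes the 39-byte embstr limit quoted in the table (other Redis versions use a different limit), and the function name `string_encoding` is made up for this example; it is a simplification, not the actual Redis code.

```python
EMBSTR_LIMIT = 39  # bytes; the limit quoted in the table (assumption for this sketch)


def string_encoding(value: bytes) -> str:
    """Guess which encoding a string value would get (simplified model)."""
    try:
        text = value.decode("ascii")
        number = int(text)
        # Only canonical integers that fit in a signed 64-bit long count as "int".
        if str(number) == text and -2**63 <= number < 2**63:
            return "int"
    except (UnicodeDecodeError, ValueError):
        pass
    # Short strings are stored as embstr, longer ones as raw.
    return "embstr" if len(value) <= EMBSTR_LIMIT else "raw"
```

For example, `string_encoding(b"12345")` falls into the int case, while a 40-byte value falls into the raw case.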
Note: the embstr encoding creates a string object with a single memory allocation and frees it with a single call, whereas raw needs two of each.
INCR and INCRBY can be used to generate unique serial-number IDs in a distributed system.
A doubly linked list makes operations at both ends easy, but its nodes are not contiguous in memory, so it is prone to memory fragmentation. A ziplist is one contiguous block of memory and is very space-efficient, but every change to the data triggers a realloc. The quicklist combines the advantages of both: it is an acyclic doubly linked list in which each node is a ziplist.
3. Progressive rehash process
Steps to Rehash
Allocate space for the dictionary's ht[1] hash table. For an expansion, the size of ht[1] is the first 2^n that is >= ht[0].used*2; for a contraction, it is the first 2^n that is >= ht[0].used.
Rehash all key-value pairs saved in ht[0] into ht[1]. Rehashing means recalculating each key's hash and index: the new hash comes from hashFunction(key), the new index is index = hash & sizemask, and the key-value pair is then placed at that position in the ht[1] hash table.
When all key-value pairs of ht[0] have been migrated to ht[1] (ht[0] is now an empty table), release ht[0], make ht[1] the new ht[0], and create a new blank hash table as ht[1] for the next rehash.
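The sizing rule and the index calculation in the steps above can be sketched as follows. The helper names are hypothetical, and the sketch ignores any minimum table size the real implementation may enforce.

```python
def next_power_of_two(n: int) -> int:
    """Smallest 2^k that is >= n."""
    size = 1
    while size < n:
        size *= 2
    return size


def rehash_target_size(used: int, expanding: bool) -> int:
    """Size of ht[1]: based on used*2 when expanding, on used when contracting."""
    return next_power_of_two(used * 2 if expanding else used)


def bucket_index(hash_value: int, table_size: int) -> int:
    """index = hash & sizemask, where sizemask = size - 1 for a power-of-two size."""
    sizemask = table_size - 1
    return hash_value & sizemask
```

Because the table size is always a power of two, `hash & sizemask` gives the same result as `hash % size` but with a cheaper operation.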
Conditions for expansion and contraction: the hash table is expanded when the server is not currently executing BGSAVE or BGREWRITEAOF and the load factor is >= 1, or when the server is executing BGSAVE or BGREWRITEAOF and the load factor is >= 5. When the load factor drops below 0.1, the program shrinks the hash table. Load factor = ht[0].used / ht[0].size.
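A small sketch of these conditions, assuming the expansion threshold is 1 when no BGSAVE/BGREWRITEAOF child is running and 5 when one is. The function name and the returned strings are made up for illustration.

```python
def resize_action(used: int, size: int, child_running: bool) -> str:
    """Decide whether a hash table should expand, shrink, or stay as it is."""
    load_factor = used / size
    # A running BGSAVE/BGREWRITEAOF child raises the expansion threshold.
    expand_threshold = 5 if child_running else 1
    if load_factor >= expand_threshold:
        return "expand"
    if load_factor < 0.1:
        return "shrink"
    return "none"
```

With 4 entries in 4 buckets the table expands when no child is running, but with a child running it waits until the load factor reaches 5.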
Progressive rehash: if the hash table holds a large number of key-value pairs, rehashing them all at once is likely to make the server stop responding for a while. So the rehash is done progressively, in multiple steps. Following a divide-and-conquer approach, the work of rehashing key-value pairs is spread across the add, delete, lookup, and update operations on the dictionary, which avoids the large computational cost of a centralized rehash.
The key is the index counter variable rehashidx. Each time an add, delete, lookup, or update is performed on the dictionary during a rehash, the key-value pairs at index rehashidx of ht[0] are moved to ht[1] and rehashidx is incremented by 1. When all key-value pairs in ht[0] have been rehashed to ht[1], the program sets rehashidx to -1, indicating that the rehash is complete.
4. Rehash source code
For performance, Redis splits rehashing into two kinds. Lazy rehashing rehashes one slot each time the dict is operated on. Active rehashing spends 1 ms out of every 100 ms on rehashing (in the serverCron function). A dictionary that has a safe iterator cannot be rehashed.
Lazy rehashing call chain: _dictRehashStep -> dictRehash. _dictRehashStep calls dictRehash and moves only one bucket from ht[0] to ht[1] at a time, but because _dictRehashStep is called by dictGetRandomKey, dictFind, dictGenericDelete, and dictAdd, it runs on every add, delete, lookup, and update on the dict, which speeds up the rehash. Each call to dictRehash incrementally rehashes n elements. Because the size of ht[1] was already set when the resize was triggered, the main work of rehashing is to traverse ht[0], take each key, and rehash it by the bucket count of ht[1]. Once the rehash is finished, ht[0] is pointed at ht[1] and ht[1] is then cleared. rehashidx is important in this process: it records the index in ht[0] that the previous rehash step reached.
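The lazy rehashing idea can be modelled in a few lines of Python. This is a toy sketch, not the Redis source: the class `ProgressiveDict` and its methods are invented for illustration, it only ever expands, and it starts a rehash as soon as the load factor reaches 1.

```python
class ProgressiveDict:
    def __init__(self, size=4):
        self.ht = [[[] for _ in range(size)], None]  # ht[0] and ht[1]
        self.used = [0, 0]
        self.rehashidx = -1  # -1 means no rehash is in progress

    def _index(self, key, table):
        return hash(key) & (len(self.ht[table]) - 1)  # hash & sizemask

    def _start_rehash(self):
        size = 1
        while size < self.used[0] * 2:
            size *= 2
        self.ht[1] = [[] for _ in range(size)]
        self.rehashidx = 0

    def _rehash_step(self):
        """Move one non-empty bucket from ht[0] to ht[1]."""
        if self.rehashidx == -1:
            return
        old = self.ht[0]
        while self.rehashidx < len(old) and not old[self.rehashidx]:
            self.rehashidx += 1  # skip empty buckets
        if self.rehashidx < len(old):
            bucket = old[self.rehashidx]
            for key, value in bucket:
                self.ht[1][self._index(key, 1)].append((key, value))
            self.used[0] -= len(bucket)
            self.used[1] += len(bucket)
            old[self.rehashidx] = []
            self.rehashidx += 1
        if self.used[0] == 0:  # finished: ht[1] becomes ht[0]
            self.ht = [self.ht[1], None]
            self.used = [self.used[1], 0]
            self.rehashidx = -1

    def _find(self, key):
        for table in (0, 1):
            if self.ht[table] is None:
                continue
            bucket = self.ht[table][self._index(key, table)]
            for pos, (k, _) in enumerate(bucket):
                if k == key:
                    return bucket, pos
        return None, -1

    def get(self, key, default=None):
        self._rehash_step()
        bucket, pos = self._find(key)
        return bucket[pos][1] if bucket is not None else default

    def set(self, key, value):
        self._rehash_step()
        bucket, pos = self._find(key)
        if bucket is not None:
            bucket[pos] = (key, value)
            return
        # During a rehash, new keys go straight into ht[1].
        target = 1 if self.rehashidx != -1 else 0
        self.ht[target][self._index(key, target)].append((key, value))
        self.used[target] += 1
        if self.rehashidx == -1 and self.used[0] >= len(self.ht[0]):
            self._start_rehash()
```

Every `get` and `set` pays for at most one bucket migration, so no single operation has to move the whole table.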
In general, the server will incrementally rehash the database when it executes read/write commands against the database, but if the server has not executed the command for a long time, the rehash of the database dictionary may not have been able to be completed, and in order to prevent this, we need to perform an active rehash on the database.
The call chain for active rehashing is as follows:
serverCron -> databasesCron -> incrementallyRehash -> dictRehashMilliseconds -> dictRehash. The longer incrementallyRehash runs, the more entries are rehashed. Each call rehashes for 1 millisecond, and if the rehash is not finished it continues in the next loop.
5. Persistence mechanism
RDB is the default persistence method: it writes the in-memory data to a binary file as a snapshot. The default file name is dump.rdb.
How rdbSave is triggered:
SAVE command: blocks the Redis server process until the RDB file has been created.
BGSAVE command: forks a child process that creates the RDB file while the parent process continues to handle command requests.
The master receives a SYNC command from a slave.
Timed save (configuration file: save 900 1, save 300 10, or save 60 10000).
BGSAVE and BGREWRITEAOF cannot run at the same time. If BGSAVE is running, BGREWRITEAOF is deferred until BGSAVE completes. If BGREWRITEAOF is running, the server refuses to execute BGSAVE.
Dirty counter: records how many times the server has modified the database state since the last SAVE or BGSAVE.
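The timed save conditions and the dirty counter fit together roughly like this. This is a sketch only: `should_bgsave` is a hypothetical helper, and the three (seconds, changes) pairs are the ones quoted above.

```python
SAVE_PARAMS = [(900, 1), (300, 10), (60, 10000)]  # (seconds, changes)


def should_bgsave(dirty: int, lastsave: float, now: float,
                  save_params=SAVE_PARAMS) -> bool:
    """True if any save condition is met.

    dirty    -- modifications since the last SAVE/BGSAVE (the dirty counter)
    lastsave -- time of the last successful save, in seconds
    now      -- current time, in seconds
    """
    elapsed = now - lastsave
    return any(dirty >= changes and elapsed > seconds
               for seconds, changes in save_params)
```

One modification is enough to trigger a save once more than 900 seconds have passed, while 10000 modifications trigger one after just over a minute.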
RDB file structure: REDIS, 5 bytes, used to check whether the file is an RDB file; db_version, 4 bytes, an integer stored as a string that records the version number of the RDB file; databases, which contains zero or more databases; EOF, 1 byte, which marks the end of the body of the RDB file; check_sum, 8 bytes, a checksum calculated from the previous four parts.
Database structure: SELECTDB, 1 byte, which signals that a database number follows; db_number, which holds the database number; key_value_pairs, which holds all key-value pairs in the database, each consisting of type, key, and value.
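As a rough illustration of the outer layout, the sketch below splits a byte string into the parts listed above. It does not decode the databases section or verify the checksum, the function name `split_rdb` is made up, and the value 0xFF for the EOF byte is an assumption that is not stated in these notes.

```python
RDB_MAGIC = b"REDIS"
EOF_OPCODE = 0xFF  # assumed value of the 1-byte EOF marker


def split_rdb(data: bytes):
    """Return (version, databases_bytes, checksum_bytes) for an RDB-like blob."""
    # 5 bytes magic + 4 bytes version + 1 byte EOF + 8 bytes checksum
    if len(data) < 5 + 4 + 1 + 8:
        raise ValueError("too short to be an RDB file")
    if data[:5] != RDB_MAGIC:
        raise ValueError("missing REDIS magic")
    version = int(data[5:9].decode("ascii"))  # integer stored as a string
    if data[-9] != EOF_OPCODE:
        raise ValueError("missing EOF marker")
    databases = data[9:-9]   # zero or more database sections
    checksum = data[-8:]     # last 8 bytes
    return version, databases, checksum
```

An empty file with version 6 would be `b"REDIS0006"` followed by the EOF byte and 8 checksum bytes.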
Advantages of RDB: the file is compact, it suits large-scale data recovery where the requirements on data integrity and consistency are not high, and recovery is fast. Disadvantages: if Redis goes down, all modifications made after the last snapshot are lost; and when the data set is large, fork is time-consuming, so the server may be unable to respond to requests within milliseconds.
AOF: Records the database state by saving the write commands that are executed by the Redis server.
Advantages and disadvantages of AOF. Advantage: the default policy is fsync every second, so at most one second of data is lost. Disadvantages: the AOF file is much larger than the RDB file, recovery is slower than with RDB, and it runs less efficiently. In short, AOF gives more up-to-date data and RDB gives faster recovery.
AOF repair: if the AOF file is corrupted, it can be repaired with the redis-check-aof --fix program. If a power failure or similar occurs during a write (in the flushAppendOnlyFile function), the write error is recorded in the log and then handled. When Redis restarts and reloads, it loads the AOF first, because the AOF holds a more complete data set than the RDB.
AOF write steps: command append (the command is appended to the AOF buffer), file write, file sync.
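The three write steps can be sketched as below. The commands are serialized in the Redis protocol format, which is how the AOF stores them; the buffer variable and the function names are invented for this sketch, and real Redis decides when to write and sync according to its fsync policy.

```python
import os

aof_buf = bytearray()  # stand-in for the server's AOF buffer


def encode_command(*args) -> bytes:
    """Serialize a command as a protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def append_command(*args) -> None:
    """Step 1, command append: add the encoded command to the buffer."""
    aof_buf.extend(encode_command(*args))


def flush_aof(fd: int) -> None:
    """Steps 2 and 3: write the buffer to the file, then sync it to disk."""
    os.write(fd, bytes(aof_buf))  # file write
    os.fsync(fd)                  # file sync
    aof_buf.clear()
```

Calling `append_command("SET", "name", "value")` leaves `*3\r\n$3\r\nSET\r\n$4\r\nname\r\n$5\r\nvalue\r\n` in the buffer until the next flush.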
AOF rewrite trigger: with the default configuration, a rewrite is triggered when the AOF file has grown to twice its size after the last rewrite, the file is larger than 64 MB, and no child process is running.
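The trigger condition reads naturally as a small predicate. In this sketch the function and parameter names are hypothetical; the defaults of 100 percent growth and 64 MB correspond to the default configuration described above.

```python
MB = 1024 * 1024


def should_rewrite_aof(current_size: int, base_size: int, child_running: bool,
                       growth_percent: int = 100, min_size: int = 64 * MB) -> bool:
    """True if an automatic AOF rewrite should start.

    current_size -- size of the AOF file now
    base_size    -- size of the AOF file right after the last rewrite
    """
    if child_running or current_size <= min_size:
        return False
    base = base_size or 1  # avoid dividing by zero before the first rewrite
    growth = current_size * 100 // base - 100
    return growth >= growth_percent
```

A file that was 100 MB after the last rewrite triggers a new rewrite once it reaches 200 MB, provided no child process is running.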
How AOF rewrite is implemented: fork a new process to rewrite the file (it writes a temporary file first and renames it at the end). The new process traverses the in-memory data of its copy of the database and writes a new AOF file that expresses the entire in-memory database contents as commands.
AOF background rewrite: the AOF rewrite runs in a child process. While it runs, each time Redis finishes executing a write command, it sends the command to both the AOF buffer and the AOF rewrite buffer, and the rewrite buffer is later written to the temporary file.
Virtual memory: temporarily swaps infrequently accessed data from memory to disk to increase database capacity. It made the code complex and made restarts and replication slow, so it has been abandoned.
Persistence optimization: discard the AOF rewrite mechanism and save RDB + AOF. pika: suitable for important data sets larger than 50 GB; multi-threaded; persists to SSD.
6. AOF rewrite source code
On the size of the cache blocks: the program has to append to the cache constantly, a single very large allocation is not always possible, and growing one would cause a lot of copying, so the commands are kept in multiple blocks of size AOF_RW_BUF_BLOCK_SIZE. The default size of each cache block is 10 MB.
When a client executes a command, the feedAppendOnlyFile function checks whether AOF is turned on. If it is, the command is put into aof_buf_blocks. The function then checks whether a child process is running; if so, an AOF rewrite is in progress, and the command is also placed in aof_rewrite_buf_blocks.
Once started, the server loops over file events and time events, and time events are handled by the serverCron() function. Every 100 ms the function checks whether there is an AOF rewrite or flush event.
If there is a flush event (by default once per second), it calls the flushAppendOnlyFile function to write aof_buf_blocks to disk. If there is an AOF rewrite event, it calls rewriteAppendOnlyFileBackground(), which executes fork(). The child process calls rewriteAppendOnlyFile, rewrites the AOF file into a temporary file, and notifies the parent process when it finishes.
The parent process catches the child's exit signal. If the child's exit status is OK, the parent calls backgroundRewriteDoneHandler to append aof_rewrite_buf_blocks to the temporary file and then uses rename(2) to rename the temporary file over the old AOF file. The write operation it calls blocks the main process, however.
At this point, the background AOF rewrite is complete.
7. Transactions and Events
Error handling in transactions: with a syntax error (an error at queuing time) none of the commands are executed; with a runtime error (an error at execution time) the other commands are still executed. MULTI starts a transaction, EXEC executes it and cancels the monitoring of all keys, UNWATCH can also be used to cancel the monitoring, and DISCARD cancels the transaction.
The WATCH command monitors one or more keys; once any of them is modified or deleted, the subsequent transaction is not executed. The EXEC, DISCARD, and UNWATCH commands all clear every watch on the connection. The watched_keys dictionary records which database keys are being monitored; if a monitored key is modified, the REDIS_DIRTY_CAS flag is turned on for the watching client. This is how WATCH implements a CAS (check-and-set) algorithm in Redis.
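The watched_keys dictionary and the dirty flag can be modelled with a small in-memory toy. This is not a Redis client and not the server code: the classes `MiniStore` and `Client` and the attribute `dirty_cas` are invented to mirror the description above, and only SET is supported inside a transaction.

```python
class MiniStore:
    def __init__(self):
        self.data = {}
        self.watched_keys = {}  # key -> set of clients watching that key

    def set(self, key, value):
        self.data[key] = value
        # Mark every client watching this key as dirty.
        for client in self.watched_keys.get(key, ()):
            client.dirty_cas = True


class Client:
    def __init__(self, store):
        self.store = store
        self.dirty_cas = False  # plays the role of the REDIS_DIRTY_CAS flag
        self.queue = None
        self.watching = set()

    def watch(self, *keys):
        for key in keys:
            self.store.watched_keys.setdefault(key, set()).add(self)
            self.watching.add(key)

    def unwatch(self):
        for key in self.watching:
            self.store.watched_keys[key].discard(self)
        self.watching.clear()
        self.dirty_cas = False

    def multi(self):
        self.queue = []

    def queue_set(self, key, value):
        self.queue.append((key, value))

    def exec(self):
        queued, self.queue = self.queue, None
        aborted = self.dirty_cas
        self.unwatch()  # EXEC always clears the watches
        if aborted:
            return None  # a watched key changed: nothing is executed
        for key, value in queued:
            self.store.set(key, value)
        return ["OK"] * len(queued)
```

If another writer changes a watched key between `watch` and `exec`, `exec` returns `None` and the caller is expected to retry, which is the check-and-set pattern.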
The file event handler consists of sockets, the I/O multiplexing program, the file event dispatcher, and event handlers. Time events are divided into timed events and periodic events. The serverCron function is a periodic event that regularly checks the server's own resources and status.
Closing a client. Hard limit: if the size of the output buffer exceeds the hard limit, the client is closed immediately. Soft limit: if the output buffer size exceeds the soft limit but not the hard limit, the server records the time at which the client reached the soft limit; if the buffer stays above the soft limit for longer than the duration configured on the server, the client is closed.
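The hard and soft limit logic can be sketched as a single check that is run whenever the output buffer changes. The names `ClientState` and `should_close` are hypothetical, and sizes and times are plain numbers here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientState:
    soft_limit_reached_at: Optional[float] = None  # when the soft limit was first hit


def should_close(client: ClientState, buffer_size: int, now: float,
                 hard_limit: int, soft_limit: int, soft_seconds: float) -> bool:
    """Return True if the client should be closed for exceeding its buffer limits."""
    if buffer_size > hard_limit:
        return True  # hard limit: close immediately
    if buffer_size > soft_limit:
        if client.soft_limit_reached_at is None:
            client.soft_limit_reached_at = now  # start the clock
            return False
        # Close only if the buffer has stayed over the soft limit for too long.
        return now - client.soft_limit_reached_at > soft_seconds
    client.soft_limit_reached_at = None  # back under the soft limit: reset
    return False
```

A client that dips back under the soft limit gets its timer reset, so only a sustained backlog leads to a disconnect.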
Pseudo-clients: the pseudo-client that executes Lua scripts is created at server initialization and lives until the server shuts down. The pseudo-client used for loading the AOF file is created when loading starts and closed after loading completes.
Steps from sending a command request to completion: the client sends the command request to the server; the server reads the request and parses the command arguments; the command executor looks up the command's implementation function (for example setCommand) from the arguments and executes it to produce the reply, which is returned to the client.
The serverCron function runs every 100 milliseconds by default. It does the following: updates the server time cache, updates the LRU clock, updates the commands-executed-per-second statistic, updates the server's peak memory record, processes client resources, manages database resources, executes a deferred BGREWRITEAOF, checks the state of running persistence operations, writes the contents of the AOF buffer to the AOF file, closes asynchronous clients, and increments the cronloops counter.
8. Master-slave replication
One server can be made to replicate another by executing the SLAVEOF command or setting the slaveof option. Multiple slave servers can hold the same database copy as the master server.
Legacy replication (before 2.8): SYNC plus command propagation. Disadvantage: resynchronizing after a disconnection is inefficient.
New replication: Full resynchronization is used for the initial replication; as with SYNC, the master creates and sends an RDB file and then sends the write commands saved in its buffer to the slave. Partial resynchronization (PSYNC) is used for replication after a disconnection: once the slave reconnects, the master sends it the write commands that were executed during the disconnection.
Partial resynchronization relies on three things: the replication offsets of the master and slave, the master's replication backlog buffer, and the server run ID.
Replication offset: each time the master propagates N bytes of data, or the slave receives N bytes, it adds N to its replication offset.
Replication backlog: a fixed-length FIFO queue maintained by the master, 1 MB by default. When a slave reconnects after a disconnection, it sends its replication offset to the master with the PSYNC command. If the data after that offset is still in the backlog buffer, a partial resynchronization is performed; otherwise a full resynchronization is performed.
Server run ID: generated at startup and made up of 40 random hexadecimal characters. When a slave reconnects, it sends the run ID it saved for the master it was replicating. If that matches the master's own run ID, a partial resynchronization is performed; otherwise a full resynchronization is performed.
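Put together, the master's decision between partial and full resynchronization looks roughly like this. The sketch is simplified: the function name and the returned strings are made up, offsets are counted as "bytes received so far", and the backlog is assumed to hold exactly the last `backlog_size` bytes that the master propagated.

```python
def resync_mode(master_run_id: str, master_offset: int, backlog_size: int,
                slave_saved_run_id: str, slave_offset: int) -> str:
    """Decide how the master answers a reconnecting slave."""
    # A different run ID means the slave was replicating some other master.
    if slave_saved_run_id != master_run_id:
        return "full"
    # The backlog covers offsets (master_offset - backlog_size + 1) .. master_offset,
    # and the slave needs everything after slave_offset.
    oldest_available = master_offset - backlog_size + 1
    if oldest_available <= slave_offset + 1 <= master_offset + 1:
        return "partial"
    return "full"
```

With a 100-byte backlog and the master at offset 1000, a slave at offset 900 can still catch up partially, while one at 899 needs a full resynchronization.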
Implementation of replication: set the master's address and port, establish the socket connection, send the PING command, authenticate (optional), send the port information, synchronize, propagate commands.
Heartbeat detection: during the command propagation phase, the slave sends REPLCONF ACK <replication_offset> to the master once per second by default. This is used to detect the state of the network connection between master and slave, to help implement the min-slaves-to-write and min-slaves-max-lag options, and to detect lost commands: comparing the replication offsets of the master and slave shows whether a command was lost.
Features of master-slave replication: a master can have multiple slaves; slaves can connect to the same master and can also connect to other slaves; replication does not block the master, which can continue to process client requests while synchronizing data; and replication improves the scalability of the system.
Sentinel is the Redis high-availability solution. A Sentinel system made up of one or more Sentinel instances can monitor any number of master servers.
When a Sentinel starts, it creates two network connections to each master. The command connection is dedicated to sending commands to the master and receiving the replies, while the subscription connection is dedicated to subscribing to the master's __sentinel__:hello channel.
Getting server information: after connecting to the master, the Sentinel does the following. Every 10 seconds it sends an INFO command to the master and the slaves. Every 2 seconds it publishes its own information to the __sentinel__:hello channel of the master and the slaves. Every second it sends a PING to the master, the slaves, and the other Sentinel nodes.
Electing the leader Sentinel: if the master goes offline, a leader Sentinel is elected to carry out failover for the master-slave system. The election uses the Raft algorithm. Rules for choosing the new master: discard slaves that are offline or disconnected, then prefer the slave with the highest priority, then the largest replication offset, then the smallest run ID.
9. Startup process
Initialize the server state structure. This is done by the initServerConfig function, which sets the server's run ID, default run frequency, default configuration file path, run architecture, default port number, and default RDB and AOF persistence conditions, initializes the server's LRU clock, and creates the command table.
Load configuration options: Specify configuration parameters or files
Initialize the server data structures. In the first step, initServerConfig only creates the command table, but the server state also contains other data structures, such as the server.clients list, the server.db array, the server.pubsub_channels dictionary, the server.lua environment, and the server.slowlog slow query log. The initServer function allocates memory for these data structures and initializes them. It also sets the process signal handlers for the server, creates the shared objects, opens the server's listening port, creates the time event for the serverCron function, opens the AOF file if one exists or creates a new one if not, and initializes the server's background I/O module (bio). The server then prints its "bread" logo.
Restore the database state: if the server has AOF persistence enabled, restore the database state from the AOF file; otherwise restore it from the RDB file.
Execute the server's event loop.
10. Cluster
Redis Cluster is the distributed database solution provided by Redis. The cluster shares data across nodes through sharding and provides replication and failover. It consists of multiple nodes, which are joined through a handshake (the CLUSTER MEET command).
Redis Cluster stores the database's key-value pairs by sharding: the entire database is divided into 16,384 slots, and each node can handle anywhere from 0 to 16,384 of them. The cluster is online when every slot has a node handling it. The command CLUSTER ADDSLOTS <slot> assigns one or more slots to a node. A node's slots property is a binary array: if slots[i] = 1, the node is responsible for slot i; if slots[i] = 0, it is not.
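The binary slots array is a bitmap with one bit per slot, so 16,384 slots fit in 2,048 bytes. A minimal sketch follows; the class `NodeSlots` and the helper `cluster_is_online` are names invented for this example.

```python
SLOT_COUNT = 16384


class NodeSlots:
    def __init__(self):
        self.bits = bytearray(SLOT_COUNT // 8)  # 2048 bytes, one bit per slot

    def add(self, slot: int) -> None:
        """Mark the slot as handled by this node (slots[i] = 1)."""
        self.bits[slot >> 3] |= 1 << (slot & 7)

    def owns(self, slot: int) -> bool:
        return bool(self.bits[slot >> 3] & (1 << (slot & 7)))


def cluster_is_online(nodes) -> bool:
    """The cluster is online only when every slot has a node handling it."""
    return all(any(node.owns(slot) for node in nodes)
               for slot in range(SLOT_COUNT))
```

Checking whether a node owns a slot is a constant-time bit test, which is why the bitmap form is convenient.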
Steps for a node to save a key-value pair: first calculate which slot the key belongs to, then check whether the slot is handled by the current node. If not, the node returns a MOVED error (the client hides this error in cluster mode but prints it in stand-alone mode), and the client redirects to the correct node using the information in the error. Also, a cluster node can only use database 0.
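The slot calculation and the MOVED check can be sketched as follows. Redis Cluster maps a key to a slot with CRC16(key) & 16383; the CRC16 below is the XMODEM variant written out bit by bit. Hash tags are left out, and `handle_key` with its `my_slots` and `slot_owner` arguments is a hypothetical helper, not a Redis API.

```python
def crc16(data: bytes) -> int:
    """CRC16 (XMODEM variant: polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: bytes) -> int:
    """Slot a key belongs to (hash tags are ignored in this sketch)."""
    return crc16(key) & 16383


def handle_key(key: bytes, my_slots: set, slot_owner: dict) -> str:
    """Serve the key locally or answer with a MOVED redirection."""
    slot = key_slot(key)
    if slot in my_slots:
        return "OK"
    host, port = slot_owner[slot]
    return f"MOVED {slot} {host}:{port}"
```

A client that receives the MOVED reply reconnects to the host and port in the message and sends the command again.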
Resharding: reassigns any number of slots from one node (the source node) to another node (the target node); the key-value pairs belonging to those slots are moved from the source node to the target node as well. It can be done online. An ASK error occurs during slot migration, when some of a slot's key-value pairs are still on the source node while the rest are already on the target node.
There are five kinds of messages in the cluster, each consisting of a message header and a message body.
MEET message: sent when the sender receives a CLUSTER MEET command from a client, asking the receiver to join the cluster.
PING message: each node picks five nodes at random and sends a PING to the one it has gone longest without pinging, to detect whether that node is online.
PONG message: sent to acknowledge that a MEET or PING message was received.
FAIL message: when node A determines that node B has entered the FAIL state, node A broadcasts a FAIL message about node B to the cluster.
PUBLISH message: when a node receives a PUBLISH command, it broadcasts a PUBLISH message to the cluster.
Advantages of clustering. Fault tolerance: solves the single-point-of-failure problem of a single Redis server. Scalability: a cluster can scale cache performance well, for example by hot-deploying additional nodes. Performance improvement: shown in the scaling process.