Preliminary Exploration of redis

Last Update:2018-10-29 Source: Internet

Author: User

Tags redis cluster


I. What is Redis?

Redis is an open-source, in-memory key-value database written in ANSI C and released under the BSD license. It is accessed over the network, can persist its data to disk (including as an append-only log), and offers client APIs for many languages.

1. Redis features

Redis is essentially an in-memory key-value database, similar to memcached. The entire data set is held in memory, and the data is periodically flushed to disk asynchronously. Because operations run purely in memory, Redis performs very well and can handle more than 100,000 read/write operations per second, which makes it one of the fastest key-value databases. Redis supports five data structures: string, list, set, sorted set (zset), and hash. The maximum size of a single string value is 512 MB.

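Although Redis is a NoSQL database, it has some properties that many other NoSQL databases lack: every individual Redis command is atomic, and several commands can be grouped into a MULTI/EXEC transaction that executes as a single isolated unit. Note that Redis does not provide full ACID guarantees and does not roll back a transaction when one of its commands fails. Redis can also persist its data to the local disk at regular intervals.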

Redis can be used as a cache and for messaging. An expiration time can be set per key, and the key is deleted automatically once it expires.

Compared with memcached, Redis can not only read and write data in memory but also persist that data to disk. To some extent this solves the problem of in-memory data being lost after a power failure.

The main disadvantage of Redis is that database capacity is limited by physical memory, so it cannot serve high-performance reads and writes over massive data sets. Redis is therefore best suited to high-performance operations on relatively small amounts of data.

2. Redis application scenarios

Because Redis is fast and supports multiple data structures, its main uses in the industry are caching and lightweight message queues (using a list, which is a doubly linked list, as a FIFO queue). It also handles data expiration with millisecond precision: set an expiration time on a stored key-value pair. It can therefore serve as an enhanced memcached.
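The FIFO use of a list mentioned above can be modelled in plain Java. This is only a sketch: the class `ListQueueModel` is hypothetical, it talks to no Redis server, and its two methods merely mirror the behaviour of the Redis commands LPUSH (insert at the head) and RPOP (remove from the tail).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical in-memory model of a Redis list used as a FIFO queue.
class ListQueueModel {
    private final Deque<String> list = new ArrayDeque<>();

    // Models LPUSH: insert at the head of the list and return the new length.
    int lpush(String value) {
        list.addFirst(value);
        return list.size();
    }

    // Models RPOP: remove from the tail; returns null when the list is empty.
    String rpop() {
        return list.pollLast();
    }
}
```

Producers push at one end and consumers pop at the other, so elements leave in the order they arrived.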

II. Using Redis

1. Serialization

Serialization is the process of converting an object's state into a form that can be stored or transmitted. During serialization, the object writes its current state to temporary or persistent storage. Later, the object can be re-created by reading, that is deserializing, its state from that storage.

For network transmission and local storage, a Java object cannot be used in the form it normally has on the Java heap. Non-serialized Java objects are hard to transmit across platforms, and their safety cannot be guaranteed.

Serialization: the process of converting a Java object to a byte sequence.

Deserialization: the process of restoring the byte sequence to a Java object.
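The two definitions above can be illustrated with the JDK's built-in serialization. This is a minimal sketch: `Student` and `SerializationDemo` are hypothetical names chosen for the example, and checked exceptions are wrapped so the helpers are easy to call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical value class; implementing Serializable opts in to JDK serialization.
class Student implements Serializable {
    private static final long serialVersionUID = 1L;
    final long id;
    final String name;

    Student(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class SerializationDemo {
    // Serialization: Java object -> byte sequence.
    static byte[] serialize(Serializable obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Deserialization: byte sequence -> Java object.
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The resulting byte array is the kind of value that can be stored under a key or sent over the network, and the receiver casts the deserialized object back to `Student`.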

2. protostuff serialization

Data transmitted over the network or stored on disk is serialized data (a binary stream). Data stored in Redis must therefore be serialized, and data read from Redis must be deserialized. When two processes communicate remotely, they can send each other many types of data, but whatever the type, it travels over the network as a binary sequence. The sender must convert the Java object into a byte sequence before it can be transmitted, and the receiver must restore the byte sequence to a Java object.

Jedis is the preferred Java client for Redis. Unlike clients for many other databases, however, neither Jedis nor Redis serializes objects internally, so the programmer must implement serialization. The basic approach is to have the class to be serialized implement the java.io.Serializable interface.

Among the many third-party serialization libraries, protostuff, which builds on Google's Protocol Buffers, is one of the most efficient in both time and space.

Protostuff demo

    public Seckill getSeckill(long seckillId) {
        // Redis operation logic
        try {
            Jedis jedis = jedisPool.getResource();
            try {
                String key = "seckill:" + seckillId;
                // Serialization is not implemented internally:
                // get -> byte[] -> deserialization -> Object (Seckill).
                // Custom serialization with protostuff is used; Seckill is a POJO.
                byte[] bytes = jedis.get(key.getBytes());
                if (bytes != null) {
                    // Create an empty object and merge the bytes into it
                    Seckill seckill = schema.newMessage();
                    ProtostuffIOUtil.mergeFrom(bytes, seckill, schema);
                    // seckill is now deserialized
                    return seckill;
                }
            } finally {
                jedis.close();
            }
        } catch (Exception e) {
            logger.error(e.getMessage(), e);
        }
        return null;
    }

    public String putSeckill(Seckill seckill) {
        // set: Object (Seckill) -> serialization -> byte[]
        try {
            Jedis jedis = jedisPool.getResource();
            try {
                String key = "seckill:" + seckill.getSeckillId();
                byte[] bytes = ProtostuffIOUtil.toByteArray(seckill, schema,
                        LinkedBuffer.allocate(LinkedBuffer.DEFAULT_BUFFER_SIZE));
                // Cache with a timeout
                int timeout = 60 * 60; // 1 hour
                String result = jedis.setex(key.getBytes(), timeout, bytes);
                return result;
            } finally {
                jedis.close();
            }
        } catch (Exception e) {
            logger.error(e.getMessage(), e);
        }
        return null;
    }

III. Redis persistence

Redis has a significant advantage over other in-memory caches: it supports persistence. Redis provides two persistence methods. RDB persistence periodically dumps the records of the in-memory database to an RDB file on disk. AOF (append-only file) persistence writes the Redis operation log to a file by appending.

RDB persistence writes a snapshot of the in-memory data set to disk at specified time intervals. In practice, Redis forks a child process that first writes the data set to a temporary file. Once the write succeeds, the temporary file replaces the previous one. The file is stored in a compressed binary format.

AOF persistence logs every write and delete operation processed by the server. Query operations are not recorded. The log is written as text, so the detailed operation history can be read by opening the file. Each time the server starts, it reads the log and replays it to rebuild the data set in memory.
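The "write to a temporary file, then replace the previous file" step described above can be sketched in Java. This is an illustrative model only: `SnapshotModel` is a hypothetical class, it does not fork a child process, it writes a plain tab-separated text file rather than the real RDB format, and it assumes keys and values contain no tabs or newlines.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of snapshot persistence; not the actual RDB file format.
class SnapshotModel {
    // Write the whole data set to a temp file, then swap it in for the old snapshot.
    static void save(Map<String, String> data, Path target) {
        try {
            Path dir = target.toAbsolutePath().getParent();
            Path tmp = Files.createTempFile(dir, "dump", ".tmp");
            List<String> lines = new ArrayList<>();
            for (Map.Entry<String, String> e : new TreeMap<>(data).entrySet()) {
                lines.add(e.getKey() + "\t" + e.getValue());
            }
            Files.write(tmp, lines, StandardCharsets.UTF_8);
            // The old snapshot is only replaced after the new one is fully written.
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Load a snapshot back into memory, as a server would on startup.
    static Map<String, String> load(Path source) {
        try {
            Map<String, String> data = new TreeMap<>();
            for (String line : Files.readAllLines(source, StandardCharsets.UTF_8)) {
                int tab = line.indexOf('\t');
                data.put(line.substring(0, tab), line.substring(tab + 1));
            }
            return data;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the previous snapshot is untouched until the new file is complete, a crash in the middle of a save leaves the last good snapshot in place.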
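The append-and-replay idea described above can be modelled in a few lines of Java. This is a sketch under stated assumptions: `AppendOnlyModel` is a hypothetical class, its one-command-per-line text format is invented for the example and is not the real AOF format, and keys are assumed to contain no spaces.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of append-only logging; the line format is made up for this sketch.
class AppendOnlyModel {
    private final Path log;
    private final Map<String, String> data = new HashMap<>();

    AppendOnlyModel(Path log) {
        this.log = log;
        replay();
    }

    void set(String key, String value) {
        append("SET " + key + " " + value);
        data.put(key, value);
    }

    void del(String key) {
        append("DEL " + key);
        data.remove(key);
    }

    // Reads are served from memory and are never logged.
    String get(String key) {
        return data.get(key);
    }

    private void append(String command) {
        try {
            Files.write(log, Collections.singletonList(command), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // On startup, re-execute every logged command to rebuild the in-memory state.
    private void replay() {
        if (!Files.exists(log)) {
            return;
        }
        try {
            for (String line : Files.readAllLines(log, StandardCharsets.UTF_8)) {
                String[] parts = line.split(" ", 3);
                if (parts[0].equals("SET") && parts.length == 3) {
                    data.put(parts[1], parts[2]);
                } else if (parts[0].equals("DEL") && parts.length == 2) {
                    data.remove(parts[1]);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Creating a second `AppendOnlyModel` on the same file plays the role of a server restart: the new instance ends up with the same keys as the old one.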

1. Advantages and disadvantages of RDB persistence

Advantages:

1). With this method, the entire Redis database is contained in a single file, which is ideal for backups. For example, you might archive a snapshot every hour for the last 24 hours and a snapshot every day for the last 30 days. With such a backup policy, recovery is easy if the system suffers a catastrophic failure.

2). RDB is a good choice for disaster recovery, because a single file is easy to compress and transfer to other storage media.

3). It maximizes performance. When persistence starts, the only thing the Redis service process has to do is fork a child process. The child then completes the persistence work, so the service process largely avoids performing I/O itself.

4). Compared with AOF, RDB restarts faster when the data set is large.

Disadvantages:

1). Because snapshots are taken at intervals, RDB cannot guarantee that no data is lost. If you need to minimize data loss, RDB is not a good choice: if the system goes down before the next scheduled snapshot, all data not yet written to disk is lost.

2). Because RDB forks a child process to perform persistence, a large data set may cause the whole server to stop serving requests for several hundred milliseconds, or even a second.

2. Advantages and disadvantages of AOF persistence

Advantages:

1). This mechanism provides higher data safety, that is, durability. Redis offers three sync policies: sync every second, sync on every modification, and no sync. Syncing every second is done asynchronously and is very efficient. The cost is that if the system goes down, the modifications made within the last second are lost. Syncing on every modification can be regarded as synchronous persistence: each data change is recorded on disk immediately. Predictably, this is the least efficient policy. With no sync, Redis leaves it to the operating system to decide when to flush the data to disk.

2). Because the log file is written in append mode, existing content in the log is not damaged even if the server goes down while writing. If the system crashes when only half of an operation has been written, there is no need to worry: before the next Redis startup, the redis-check-aof tool can be used to fix the data consistency problem.

3). If the log grows too large, Redis can automatically start the rewrite mechanism. Redis keeps appending modifications to the old file while it builds a new, compact file, in which the modification commands executed during this period are also recorded. Data safety is therefore well protected during the rewrite switchover.

4). AOF produces a clear, easy-to-understand log file that records all modification operations. This file can in fact be used to rebuild the data.

Disadvantages:

1). For the same data set, the AOF file is usually larger than the RDB file, and RDB is faster than AOF at restoring large data sets.

2). Depending on the sync policy, AOF is often slower than RDB at runtime. In short, syncing every second is relatively efficient, and with syncing disabled AOF is as efficient as RDB.

The choice between the two depends on whether the system is willing to sacrifice some performance in exchange for higher cache consistency (AOF), or is willing to disable backups while writes are frequent in exchange for higher performance, backing up data only when save is run manually (RDB).

3. No persistence

Of course, Redis can also disable persistence entirely and be used only as a cache.

IV. FAQs

1. Maintaining data consistency with the database when Redis is used as a cache

This problem can be discussed at a shallow or a deep level. Once Redis clusters are involved, the solutions become more complicated. Here we only give a preliminary analysis of a simple case.

When using Redis, following common online materials, we usually read the cache first and, if the entry does not exist, read the database. The pseudocode is as follows:

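    Object stuObj = new Object();

    public Stu getStuFromCache(String key) {
        Stu stu = (Stu) redis.get(key);
        if (stu == null) {
            synchronized (stuObj) {
                // Check again: another thread may have filled the cache meanwhile
                stu = (Stu) redis.get(key);
                if (stu == null) {
                    // Assign to stu so the value read from the database is returned
                    stu = db.query();
                    redis.set(key, stu);
                }
            }
        }
        return stu;
    }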

The above lock is used to prevent excessive queries from going to the database layer.

Write database pseudocode:

    public void setStu() {
        redis.del(key);
        db.write(obj);
    }

Data inconsistency may occur whether you write the database first and then delete the cache, or delete the cache first and then write the database.
Because reads and writes run concurrently, their order cannot be guaranteed. If the cache has been deleted but the database has not yet been written, another thread may read, find the cache empty, read the old value from the database, and write it into the cache, which is now dirty data. If the database is written first and the writing thread crashes before it deletes the cache, the stale cache entry also causes inconsistency.

In redis cluster or master-slave mode, data inconsistency may occur due to a certain delay in redis replication.

Here we can use double deletion + timeout to solve the problem.

redis.del(key) is executed both before and after the database write, and a reasonable timeout is set on cache entries. In the worst case, the data is inconsistent only for the duration of the timeout. This is rare and most likely means the service went down. The approach meets the vast majority of requirements.

Of course, this policy should also account for the time taken by master-slave synchronization in Redis and in the database, so it is best to sleep for a short period, such as 500 milliseconds, before the second deletion. This undoubtedly increases the latency of write requests.
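The double-deletion write path described above can be sketched as follows. This is a model, not production code: `DoubleDeleteModel` is a hypothetical class in which two in-memory maps stand in for the Redis cache and the database, the delay before the second deletion is a constructor parameter, and the cache-entry timeout is left out for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins: two maps play the roles of the Redis cache and the database.
class DoubleDeleteModel {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> db = new ConcurrentHashMap<>();
    private final long delayMillis;

    DoubleDeleteModel(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    void write(String key, String value) {
        cache.remove(key);          // first deletion, before the database write
        db.put(key, value);
        sleepQuietly(delayMillis);  // let in-flight readers and replication catch up
        cache.remove(key);          // second deletion drops any stale value cached meanwhile
    }

    // Cache-aside read: try the cache first, then fall back to the database.
    String read(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = db.get(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A reader that re-populated the cache with the old value between the first deletion and the database write is corrected by the second deletion, at the cost of the extra delay on every write.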

2. What is cache penetration?

Generally, a cache system looks up entries by key. If the key has no cached value, the query goes to the backend system (such as the DB). If the value for a key does not exist anywhere and many concurrent requests ask for that key, every request reaches the backend and puts heavy pressure on it. This is called cache penetration. How can it be avoided?

1. If the query result is empty, cache the empty result anyway, with a shorter expiration time, or clear the cached entry once data for the key is inserted.

2. Filter out keys that cannot exist. Put all possible keys in a large bitmap and use the bitmap to filter queries.
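The bitmap filter in item 2 above can be sketched with `java.util.BitSet`. This is one possible interpretation, namely a small Bloom-style filter: `KeyFilter`, its two hash functions, and the bitmap size are all assumptions made for the example.

```java
import java.util.BitSet;

// Hypothetical bitmap filter (a small Bloom-style filter) holding every existing key.
class KeyFilter {
    private final BitSet bits;
    private final int size;

    KeyFilter(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    // Register a key that really exists in the backend.
    void add(String key) {
        bits.set(index1(key));
        bits.set(index2(key));
    }

    // false: the key definitely does not exist, so skip both cache and database.
    // true: the key may exist (false positives are possible).
    boolean mightContain(String key) {
        return bits.get(index1(key)) && bits.get(index2(key));
    }

    private int index1(String key) {
        return Math.floorMod(key.hashCode(), size);
    }

    private int index2(String key) {
        return Math.floorMod(key.hashCode() * 31 + key.length(), size);
    }
}
```

A request handler would call `mightContain` first and return immediately on `false`, so lookups for nonexistent keys never reach the database.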

3. What is a cache avalanche?

When the cache server restarts, or a large number of cache entries expire within the same short period, the backend system (such as the DB) also comes under heavy pressure. How can it be avoided?

1. After a cache entry expires, use a lock or a queue to limit the number of threads that read the database and write the cache. For example, allow only one thread per key to query the data and write the cache while the other threads wait.

2. Set different expiration times for different keys so that expirations are spread as evenly as possible.

3. Use a second-level cache: A1 is the original cache and A2 is a copy. When A1 has expired, A2 can still be accessed. Give A1 a short expiration time and A2 a long one. (This point is a supplement.)
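Item 2 above, spreading expiration times, is often done by adding a random offset to a base timeout. The helper below is a minimal sketch: `TtlJitter` and its parameter names are hypothetical, and the returned number of seconds would be passed as the timeout when the entry is written to the cache.

```java
import java.util.Random;

// Hypothetical helper that spreads cache expirations over a time range.
class TtlJitter {
    // Returns baseSeconds plus a random extra in [0, maxJitterSeconds].
    // maxJitterSeconds must be zero or positive.
    static int ttlWithJitter(int baseSeconds, int maxJitterSeconds, Random random) {
        return baseSeconds + random.nextInt(maxJitterSeconds + 1);
    }
}
```

With a base of one hour and up to five minutes of jitter, keys written at the same moment expire at different times instead of all at once.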

4. Use Redis and any language to implement protection against malicious logins: each user ID may log in at most five times within one hour.

A queue data structure is used. The key is the user, each value is a login time, every record expires after 1 hour, and the queue length is capped at 5. When a user makes a new login request and the queue length is already 5, the request is rejected.
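The queue-based limiter described above can be modelled in plain Java. This is a sketch, not a Redis implementation: `LoginLimiter` is a hypothetical class that keeps the per-user queues in process memory, and the current time is passed in as a parameter so the one-hour window can be exercised without waiting.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the per-user login queue.
class LoginLimiter {
    private static final int MAX_LOGINS = 5;
    private static final long WINDOW_MILLIS = 60L * 60L * 1000L; // 1 hour

    private final Map<String, Deque<Long>> loginTimes = new HashMap<>();

    // Returns true if the login is allowed and records it; false if it is rejected.
    synchronized boolean tryLogin(String userId, long nowMillis) {
        Deque<Long> times = loginTimes.computeIfAbsent(userId, id -> new ArrayDeque<>());
        // Drop records older than one hour, mimicking their expiration.
        while (!times.isEmpty() && nowMillis - times.peekFirst() >= WINDOW_MILLIS) {
            times.pollFirst();
        }
        if (times.size() >= MAX_LOGINS) {
            return false;
        }
        times.addLast(nowMillis);
        return true;
    }
}
```

In a Redis-backed version the queue would live in a list stored under a per-user key. That mapping is left out here because it depends on the client library in use.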



This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
