Key-Value Pair design in Redis and Redis key-Value Design
The rich data structure makes redis design very interesting. Unlike relational databases, DEV and DBA need to communicate in depth, and review does not require DBA participation for each row of SQL statements like memcached. Apsaradb for redis DBAs must be familiar with the data structure and understand the application scenarios.
Below are some common examples suitable for kv databases to talk about key-value design, and compare them with relational databases to find out the shortcomings of relational databases.
User Logon System
A system that records user login information. After business is simplified, only one table is left.
Relational Database Design
Mysql> select * from login; + --------- + ---------------- + ------------- + --------------------- + | User_id | name | login_times | last_login_time | + --------- + ---------------- + ------------- + --------------------- + | 1 | ken thompson | 5 | 00:00:00 | | 2 | dennis ritchie | 1 | 00:00:00 | | 3 | Joe Armstrong | 2 | 00:00:00 | + --------- + ---------------- + ------------- + --------------------- + |
The primary key of the user_id table. name indicates the user name. login_times indicates the number of Logon times of the user. login_times increases automatically after each logon, while last_login_time indicates the current time.
Redis Design
The method for converting relational data into KV databases is as follows:
Key table name: primary key value: column name
Value column value
Generally, colons are used as separators, which are unwritten rules. For example, in the php-admin for redis system, keys are separated by colons by default, So keys such as user: 1 user: 2 are grouped into one group. After the above relational data is converted into kv data, the record is as follows:
Set login: 1: login_times 5 Set login: 2: login_times 1 Set login: 3: login_times 2 Set login: 1: last_login_time 2011-1-1 Set login: 2: last_login_time 2011-2-1 Set login: 3: last_login_time 2011-3-1 Set login: 1: name "ken thompson" Set login: 2: name "dennis ritchie" Set login: 3: name "Joe Armstrong" |
In this way, get and set can be used to obtain or modify the user's logon times and last logon time and name when the primary key is known.
Generally, you cannot know your id, but your username. Therefore, you must have a ing from name to id. The design here is different from the above.
Set "login: ken thompson: id" 1 Set "login: dennis ritchie: id" 2 Set "login: Joe Armstrong: id" 3 |
In this way, the business logic for each user login is as follows (python version). r is the redis object, and name is the known user name.
# Obtain the user id Uid = r. get ("login: % s: id" % name) # Number of Self-incrementing user logins Ret = r. incr ("login: % s: login_times" % uid) # Update the Last Logon Time of the user Ret = r. set ("login: % s: last_login_time" % uid, datetime. datetime. now ()) |
If you only need to know the id, update or obtain the last logon time and number of logins of a user, there is no difference between a relational database and a kv database. One is through the btree pk and the other is through the hash, and the effect is very good.
If you have the following requirements, search for the N users that have logged on recently. It is relatively simple for developers to look at. It can be done with an SQL statement.
Select * from login order by last_login_time desc limit N |
After the DBA understands the requirements, the DBA will create an index on last_login_time considering that the tables in the future are relatively large. The execution plan accesses N records from the rightmost of the index leafblock and returns the table N times.
After two days, you need to know who is the most frequent logon. How to deal with the same relational database? DEV is simple
Select * from login order by login_times desc limit N |
At the DBA's glance, an index must be created on login_time. Is there a problem? Every field in the table has a reference.
The inflexible data storage of relational databases is the source of the problem. There is only one storage method for data, that is, the heap table arranged by row. A unified data structure means that you must use indexes to change the SQL access path to quickly access a column. The addition of access paths means that you must use statistical information to assist, so there were a lot of problems.
There is no index, no statistical plan, and no execution plan. This is the kv database.
How does redis meet the above requirements? For the demand for the latest N pieces of data, the characteristics of the chain table are very suitable. After the preceding logon code, we add a piece of code to maintain a logon linked list and control its length so that the last N logon users are always saved.
# Add the current logon user to the linked list Ret = r. lpush ("login: last_login_times", uid) # Keep the linked list with only N digits Ret = redis. ltrim ("login: last_login_times", 0, N-1) |
In this way, you need to obtain the id of the latest logon user, as shown in the following code.
Last_login_list = r. lrange ("login: last_login_times", 0, N-1) |
In addition, sorted set is very suitable for those who have the most logon times. We store both users and logon times in a sorted set.
Zadd login: login_times 5 1 Zadd login: login_times 1 2 Zadd login: login_times 2 3 |
In this way, if a user logs on, an additional sorted set is maintained, and the code is as follows:
# The number of logins to this user increases by 1 Ret = r. zincrby ("login: login_times", 1, uid) |
So how can we obtain the most frequently logged-on users? Just sort the Top N users in the reverse order.
Ret = r. zrevrange ("login: login_times", 0, N-1) |
As you can see, DEV requires two lines of code, and DBA does not need to consider indexing or anything.
TAG System
Tags are especially common in Internet applications. If they are designed with traditional relational databases. Let's look at the advantages of redis in this regard using the example of searching books.
Relational Database Design
Two tables, one book details, and one tag table, indicate each tag. A book contains multiple tags.
Mysql> select * from book; + ------ + ------------------------------- + ---------------- + | Id | name | author | + ------ + ------------------------------- + ---------------- + | 1 | The Ruby Programming Language | Mark Pilgrim | | 1 | Ruby on rail | David Flanagan | | 1 | Programming Erlang | Joe Armstrong | + ------ + ------------------------------- + ---------------- +
Mysql> select * from tag; + --------- + | Tagname | book_id | + --------- + | Ruby | 1 | | Ruby | 2 | | Web | 2 | | Erlang | 3 | + --------- + |
If there is such a demand, search is ruby and web-related books. What if I use a relational database?
Select B. name, B. author from tag t1, tag t2, book B Where t1.tagname = 'web' and t2.tagname = 'Ruby 'and t1.book _ id = t2.book _ id and B. id = t1.book _ id |
The tag table is associated with the book twice. This SQL statement is still complicated. What if ruby is required, but not a web book?
Relational Data is not very suitable for these set operations.
Redis Design
First, the data in the book must be stored.
Set book: 1: name "The Ruby Programming Language" Set book: 2: name "Ruby on rail" Set book: 3: name "Programming Erlang" Set book: 1: author "Mark Pilgrim" Set book: 2: author "David Flanagan" Set book: 3: author "Joe Armstrong" |
The tag table uses a set to store data, because the set is good at intersection and Union
Sadd tag: ruby 1 Sadd tag: ruby 2 Sadd tag: web 2 Sadd tag: erlang 3 |
So, is ruby a web book?
Inter_list = redis. sinter ("tag. web", "tag: ruby ") |
Is it a ruby book, but not a web book?
Inter_list = redis. sdiff ("tag. ruby", "tag: web ") |
A collection of ruby and web books?
Inter_list = redis. sunion ("tag. ruby", "tag: web ") |
It's easy.
From the above two examples, we can see that in some scenarios, relational databases are not suitable. You may be able to design a system that meets your needs, but it is always strange, there is a sense of simplicity.
In particular, this example of logging on to the system frequently creates indexes for the business. In a complex system, ddl (index creation) may change the execution plan. As a result, other SQL statements adopt different execution plans, and the old system with complicated services is difficult to predict. SQL statements are strange. It is too difficult for DBA to understand all the SQL statements in this system. This problem is especially serious in oracle, and every DBA may have encountered it. Ddl is inconvenient for systems such as MySQL (although the online ddl method is available now ). When it comes to big tables, DBAs climbed up early in the morning and operated on low-income businesses. I did not do much about this. This kind of requirement is well handled in redis, and DBA only needs to estimate the capacity.
In the future, OLTP systems should be closely integrated with key-value pairs and relational databases.
Copyright Disclaimer: This article is an original article by the blogger and cannot be reproduced without the permission of the blogger.