Original address: Http://www.hoterran.info/redis_kv_design
The rich data structure makes the design of Redis very interesting. Unlike relational databases, Dev and DBAs need deep communication, review per row of SQL statements, and not like memcached, without the need for DBA involvement. A Redis DBA needs to be familiar with data structures and understand usage scenarios.
Here are some examples of common suitable kv database to talk about the design of key values, and the relational database to make a comparison to find out the shortcomings of the relationship type.
User Login System
A system that records user login information, and we only leave a table after simplifying the business.
Design of relational database
Mysql> SELECT * from login;+---------+----------------+-------------+---------------------+| user_id | name | Login_times | Last_login_time |+---------+----------------+-------------+---------------------+| 1 | Ken Thompson | 5 | 2011-01-01 00:00:00 | | 2 | Dennis Ritchie | 1 | 2011-02-01 00:00:00 | | 3 | Joe Armstrong | 2 | 2011-03-01 00:00:00 |+---------+----------------+-------------+---------------------+
USER_ID the primary key of the table, name represents the user name, Login_times represents the number of logins for that user, and each time the user logs in, Login_times is increased and last_login_time is updated to the current time.
The design of Redis
The relational data into the KV database, my method is as follows:
Key table name: Primary KEY value: Column name
Value column values
It is an unwritten rule to use a colon as a separator. For example, in the Php-admin for Redis system, is the default colon division, so user:1 User:2 and other key will be divided into a group. The above relational data is then converted into KV data and recorded as follows:
Set login:1:login_times 5Set login:2:login_times 1Set login:3:login_times 2Set login:1:last_login_time 2011-1-1Set Login:2:last_login_time 2011-2-1set login:3:last_login_time 2011-3-1set login:1:name "Ken Thompson" set Login:2:name " Dennis Ritchie "Set Login:3:name" Joe Armstrong "
This allows you to obtain or modify the user's logon times and last logon time and name with a known primary key in the case of get, set.
The average user is unable to know their own ID, only know their own user name, so there must also be a mapping from name to ID, where the design is different from the above.
Set "Login:ken thompson:id" 1set "Login:dennis ritchie:id" 2set "Login:joe armstrong:id" 3
So each time the user logs on the business logic is as follows (Python version), R is the Redis object, name is the user name that has been learned.
#获得用户的iduid = R.get ("login:%s:id"% name) #自增用户的登录次数ret = R.INCR ("login:%s:login_times"% uid) #更新该用户的最后登录时间ret = R.set (" Login:%s:last_login_time "% uid, Datetime.datetime.now ())
If the requirement is just a known ID, update or get the last login time of a user, number of logins, relationship type and KV database no difference. One through Btree PK, one through the hash, the effect is very good.
Suppose you have the following requirements to find the N users who have recently logged in. Developers to see, or relatively simple, a SQL is done.
SELECT * FROM Login ORDER BY last_login_time desc limit N
After the DBA understands the requirements, consider the future table if it is larger, so build an index on the last_login_time. The execution plan accesses N records from the far right of index Leafblock, then returns n times, which works well.
After two days, another need to know who is the most logged-in number of people. How to deal with the same relationship type? Dev says simple
SELECT * FROM Login ORDER BY login_times desc limit N
The DBA looks at it and builds an index on the login_time. Do you feel a bit of a problem, on each of the fields on the table is a fine primer.
The inflexible data storage of relational database is the source of the problem, the data only has one storage method, that is the heap table arranged by row. A unified data structure means you have to use an index to change the access path of the SQL to quickly access a column, and the increase in access paths means you have to use statistics to assist, so a whole bunch of problems arise.
No index, no statistical plan, no execution plan, this is the KV database.
How does Redis meet the above requirements? For the requirements of the latest n data, the post-LIFO feature of the list is well suited. We add a piece of code after the login code above, maintain a linked list of logins, control his length, so that the most recent n logged users are saved.
#把当前登录人添加到链表里ret = R.lpush ("Login:last_login_times", uid) #保持链表只有N位ret = Redis.ltrim ("Login:last_login_times", 0, N-1)
This requires the ID of the latest login, as in the following code
Last_login_list = R.lrange ("Login:last_login_times", 0, N-1)
In addition, sorted set is very suitable for those who want the most login times, such as sorting, standings, and we store the number of users and logins in a sorted set.
Zadd login:login_times 5 1zadd login:login_times 1 2zadd login:login_times 2 3
This way, if a user logs in, an additional sorted set is maintained, and the code is
#对该用户的登录次数自增1ret = R.zincrby ("Login:login_times", 1, UID)
So how to get the most logged-in users, in reverse order ranked in the nth user can
ret = R.zrevrange ("Login:login_times", 0, N-1)
As you can see, Dev needs to add 2 lines of code, and DBAs don't have to think about indexes or anything.
Tag system
tag in the Internet application is particularly common, if the traditional relational database to design a bit nondescript. Let's look at the advantages of Redis in the case of finding a book.
Design of relational database
Two tables, a book of details, a tag table, indicating the tag of each, there is more than one tag.
Mysql> SELECT * FROM book;+------+-------------------------------+----------------+| ID | name | author |+------+-------------------------------+----------------+| 1 | The Ruby Programming Language | Mark Pilgrim | | 1 | Ruby on Rail | David Flanagan | | 1 | Programming Erlang | Joe Armstrong |+------+-------------------------------+----------------+mysql> select * from tag;+---------+ ---------+| tagname | book_id |+---------+---------+| Ruby | 1 | | Ruby | 2 | | Web | 2 | | Erlang | 3 |+---------+---------+ If there is such a demand, looking for is Ruby is also a web-based book, what if the relational database will be treated?
Select B.name, b.author from tag t1, Tag t2, book Bwhere t1.tagname = ' web ' and t2.tagname = ' ruby ' and t1.book_id = t2.b ook_id and b.id = t1.book_id
Tag table since the Association 2 times and then associated with the book, this SQL is still more complex, if the requirement is Ruby, but not a web-related books?
Relational data is not really suitable for these collection operations.
The design of Redis
First the book data must be stored, as above.
Set Book:1:name "The Ruby programming Language" Set Book:2:name "Ruby on Rail" set Book:3:name "programming Erlan G "Set Book:1:author" Mark Pilgrim "set Book:2:author" David Flanagan "set Book:3:author" Joe Armstrong "
Tag table we use collections to store data, because the collection is good at intersection, set
Sadd Tag:ruby 1sadd tag:ruby 2sadd tag:web 2sadd Tag:erlang 3
So, a book that belongs to Ruby and the Web?
Inter_list = Redis.sinter ("Tag.web", "Tag:ruby")
A book that belongs to Ruby but does not belong to the Web?
Inter_list = Redis.sdiff ("Tag.ruby", "Tag:web")
A collection of books that belong to Ruby and belong to the Web?
Inter_list = Redis.sunion ("Tag.ruby", "Tag:web")
Simple enough.
From the above 2 examples can be seen in some scenarios, relational database is not very suitable, you may be able to design a system to meet the needs, but always feel strange, a kind of mechanically feeling.
In particular, log in to the system as an example, frequently indexing the business. In a complex system, the DDL (creating an index) has the potential to change the execution plan. The problem is that it is difficult to estimate the complexity of the old systems that lead to other SQL using different execution plans and complex business. It's too hard to ask the DBA to understand all of the SQL in the system. This problem is particularly serious in Oracle, and every DBA is expected to meet it. For MySQL systems such as this, the DDL is inconvenient (although there is now a method for online DDL). When I hit the big table, the DBA got up early in the morning to operate in the low peak period. This requirement is well handled in Redis, where the DBA estimates only the capacity.
The future OLTP system should be a close combination of KV and relational type.
Discussion on key value design of Redis database