Reprinted from Http://blog.nosqlfan.com/html/3202.html?ref=rediszt
Redis is a distributed NoSQL database system for key/value data, characterized by high performance, persistent storage, and suitability for high-concurrency scenarios. Although it started late, it has developed rapidly and is used by many large organizations, such as GitHub (see "who is using it").
This article is a translation of the official Redis document "A fifteen minute introduction to Redis data types".
It is a quick introduction to Redis data types for anyone who is interested.
The original is presented in English and Chinese. If anything has been omitted, please leave a comment. Some keywords are left untranslated to make reading easier.
—————————————————————————————————————
You may already know that Redis is not a plain key-value store; it is actually a data structure server that supports different kinds of values. In other words, the value a key points to does not have to be a plain string. The following data types are available as values:
- Binary-safe strings.
- Lists of binary-safe strings.
- Sets of binary-safe strings, that is, collections of non-repeating, unordered elements. You can think of a set as a Ruby hash in which every key is an element and every value is 'true'.
- Sorted sets of binary-safe strings, similar to sets, but with every element associated with a floating-point number called the score. Elements are ordered by score. You can think of a sorted set as a Ruby hash in which the key is the element and the value is the score, except that the elements are always kept in score order without any extra sorting operation.
Redis keys
Redis keys are binary safe, which means that any binary sequence can be used as a key, from a simple string such as "foo" to the contents of a JPEG file. The empty string is also a valid key.
A few rules about keys:
- Very long keys are not a good idea. A 1024-byte key, for instance, is a bad choice not only because of the memory it consumes, but also because looking the key up in the dataset may require expensive comparisons.
- Very short keys are often not a good idea either. Nothing stops you from writing "u:1000:pwd" instead of "User:1000:password", but the latter is easier to read, and the extra space is small compared with the space used by the key object and the value object themselves. Of course, you are free to use shorter keys if you really want to save a few bytes.
- Try to stick to a schema. For example, "Object-type:id:field" is a good convention, as in "User:1000:password". I like to use dots for multi-word field names, as in "Comment:1234:reply.to".
String type
This is the simplest Redis type. If you use only this type, Redis behaves like a memcached server with persistence (note: memcached keeps data only in memory, so the data is lost when the server restarts).
Let's play with the string type a little:
$ redis-cli set mykey "my binary safe value"
OK
$ redis-cli get mykey
my binary safe value
As you can see, string values are set and retrieved with the SET and GET commands.
The value can be any kind of string (including binary data), for example, you can save a JPEG image under a key. The value cannot exceed 1GB in length.
Although the string is the basic value type of Redis, you can still do some interesting things with it. For example: Atomic increment:
$ redis-cli set counter 100
OK
$ redis-cli incr counter
(integer) 101
$ redis-cli incr counter
(integer) 102
$ redis-cli incrby counter 10
(integer) 112
The INCR command parses the string value as an integer, increments it by one, and stores the result as the new string value. There are similar commands: INCRBY, DECR, and DECRBY. Internally they are all the same command, acting in slightly different ways.
What does it mean that INCR is atomic? It means that even if multiple clients issue INCR against the same key, they will never run into a race condition. For example, it can never happen that client 1 and client 2 both read "10" at the same time, both increment it to 11, and both set the new value to 11. The final value will always be 12, because while the read-increment-set operation runs, no other client executes a command.
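To make the atomicity guarantee concrete, here is a minimal Python sketch. It does not talk to Redis: `TinyStore` and `hammer` are hypothetical names, and the lock only stands in for the fact that no other client runs a command during the read-increment-set step.

```python
import threading


class TinyStore:
    """Toy in-memory model of Redis string keys (not a Redis client)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # stands in for "one command at a time"

    def set(self, key, value):
        with self._lock:
            self._data[key] = str(value)

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def incrby(self, key, amount=1):
        # Read, increment and write back as one indivisible step.
        with self._lock:
            new_value = int(self._data.get(key, "0")) + amount
            self._data[key] = str(new_value)
            return new_value

    def incr(self, key):
        return self.incrby(key, 1)


def hammer(store, key, times):
    """Simulated client that increments the same key many times."""
    for _ in range(times):
        store.incr(key)


store = TinyStore()
store.set("counter", 10)
clients = [threading.Thread(target=hammer, args=(store, "counter", 1000))
           for _ in range(2)]
for client in clients:
    client.start()
for client in clients:
    client.join()
final_value = store.get("counter")  # every one of the 2000 increments is kept
```

Without the lock around the whole read-increment-set step, two clients could read the same value and one of the increments would be lost.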
Another interesting operation on strings is the GETSET command, which does what its name suggests: it sets a key to a new value and returns the old value. Why is this useful? Suppose your system increments a Redis key with INCR every time a new user visits, and you want to collect this information once an hour. You can GETSET the key, assigning it the new value 0 and reading back the original value.
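The hourly-counter pattern can be sketched in a few lines of Python. This models the commands on a plain dict rather than calling Redis, and the key name `visits` is an assumption made for the example. In Redis the GETSET step is a single atomic command; this single-threaded model has no concurrency at all.

```python
def incr(store, key):
    """Model of INCR on a plain dict: parse, add one, store back as a string."""
    value = int(store.get(key, "0")) + 1
    store[key] = str(value)
    return value


def getset(store, key, new_value):
    """Model of GETSET: store the new value and return the previous one."""
    old_value = store.get(key)
    store[key] = str(new_value)
    return old_value


store = {}
for _ in range(3):          # three visits during the hour
    incr(store, "visits")

# Hourly job: read the total and reset the counter in one step.
hourly_total = getset(store, "visits", 0)
```

Because reading and resetting happen in one command, no visit counted between a separate GET and SET can be lost.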
List type
To explain the list data type it helps to start with a little theory, because the term "list" is often used improperly by information technology people. For instance, "Python lists" are not what the name may suggest (linked lists); they are actually arrays (the same data type is called Array in Ruby).
In general, a list is just a sequence of ordered elements: 10,20,1,2,3 is a list. But the properties of a list implemented with an array are very different from those of a list implemented with a linked list.
Redis lists are implemented with linked lists. This means that even if a list holds millions of elements, adding an element at the head or the tail takes constant time. Adding a new element with LPUSH to the head of a ten-element list is as fast as adding one to the head of a list with ten million elements.
So what is the bad news? Accessing an element by index is extremely fast in a list implemented with an array, and not so fast in a list implemented with a linked list.
Redis lists are implemented with linked lists because, for a database system, it is crucial to be able to add elements to a very long list very quickly. Another important advantage, as you will see in a moment, is that Redis lists can be kept at a constant length in constant time.
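The trade-off can be illustrated in Python, where `collections.deque` offers constant-time insertion at both ends while the built-in list is an array. This is only an analogy: the `lpush` and `rpush` helpers are hypothetical stand-ins, and a deque is not the structure Redis uses internally.

```python
from collections import deque


def lpush(items, value):
    """Model of LPUSH: add at the head (left) in constant time."""
    items.appendleft(value)


def rpush(items, value):
    """Model of RPUSH: add at the tail (right) in constant time."""
    items.append(value)


messages = deque()
rpush(messages, "first")
rpush(messages, "second")
lpush(messages, "zeroth")
snapshot = list(messages)

# The array-backed alternative gives the same result, but inserting at
# index 0 has to shift every existing element, so the cost grows with size.
array_backed = ["first", "second"]
array_backed.insert(0, "zeroth")
```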
Getting Started with Redis lists
The LPUSH command adds a new element to the left (head) of the list, while the RPUSH command adds a new element to the right (tail). Finally, the LRANGE command extracts a range of elements from the list:
$ redis-cli rpush messages "Hello how are you?"
OK
$ redis-cli rpush messages "Fine thanks. I'm having fun with Redis"
OK
$ redis-cli rpush messages "I should look into this NOSQL thing ASAP"
OK
$ redis-cli lrange messages 0 2
1. Hello how are you?
2. Fine thanks. I'm having fun with Redis
3. I should look into this NOSQL thing ASAP
Note that LRANGE takes two indexes, the first and the last element of the range to return. Both indexes can be negative to tell Redis to count from the tail: -1 is the last element, -2 is the penultimate element of the list, and so on.
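The index rules are easy to get wrong, because both ends of the range are inclusive. The following Python function is a sketch of those rules applied to an ordinary list; `lrange` here is a hypothetical helper that mimics the command and is not part of any Redis library.

```python
def lrange(items, start, stop):
    """Model of LRANGE: both indexes are inclusive; negatives count from the tail."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    if start > stop or start >= n:
        return []
    stop = min(stop, n - 1)
    return items[start:stop + 1]


messages = [
    "Hello how are you?",
    "Fine thanks. I'm having fun with Redis",
    "I should look into this NOSQL thing ASAP",
]
first_three = lrange(messages, 0, 2)    # the whole list
last_only = lrange(messages, -1, -1)    # just the last element
last_two = lrange(messages, -2, -1)     # the final two elements
```

Note the difference from Python slicing, where the end index is exclusive.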
As you can guess from the example above, lists can be used to implement a chat system, for instance. Another use is as a queue for passing messages between different processes. The key point is that you can use Redis lists whenever you need to access data in the same order in which it was added. This does not require any SQL ORDER BY operation, is very fast, and scales easily to millions of elements.
For example, in a rating system such as the social news site reddit.com, you can add every newly submitted link to a list and use LRANGE to page through the results easily.
In a blog engine, you can keep a list for each post and push the post's comments onto it, and so on.
Pushing IDs instead of the actual data onto Redis lists
In the example above we pushed our "objects" (simple messages, in this case) directly onto the Redis list. This is usually not the way to go, because an object may be referenced more than once: in a list that preserves chronological order, in a set that records its category, in yet another list if needed, and so on.
Let's go back to the reddit.com example. A more reliable way to add user-submitted links (news) to the list is the following:
$ redis-cli incr next.news.id
(integer) 1
$ redis-cli set news:1:title "Redis is simple"
OK
$ redis-cli set news:1:url "http://code.google.com/p/redis"
OK
$ redis-cli lpush submitted.news 1
OK
We obtained a unique, incrementing ID for our news object simply by incrementing a key, then created the object from this ID by setting one key per field of the object. Finally, the ID of the new object was pushed onto the submitted.news list.
This is just the start. All the list-related commands are described in the command reference. You can remove elements, rotate lists, get and set elements by index, and of course retrieve the length of a list with LLEN.
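The ID-based submission steps shown earlier (INCR, two SETs, then LPUSH) can be wrapped into one function. The sketch below models them with two plain dictionaries instead of a Redis connection; `submit_news` and the second news item are assumptions made for illustration.

```python
store = {}   # string keys -> string values
lists = {}   # list keys -> Python lists, head at index 0


def incr(key):
    """Model of INCR: a missing key counts as 0."""
    value = int(store.get(key, "0")) + 1
    store[key] = str(value)
    return value


def submit_news(title, url):
    """Create one key per field, then push only the ID onto the list."""
    news_id = incr("next.news.id")
    store[f"news:{news_id}:title"] = title
    store[f"news:{news_id}:url"] = url
    lists.setdefault("submitted.news", []).insert(0, str(news_id))  # LPUSH
    return news_id


first_id = submit_news("Redis is simple", "http://code.google.com/p/redis")
second_id = submit_news("A second story", "http://example.com/second")  # hypothetical item
```

Since the list holds only IDs, the same news item can appear in other lists or sets without duplicating its fields.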
Redis sets
A Redis set is an unordered collection of binary-safe strings. The SADD command adds a new element to a set. Many other operations are available on sets, such as testing whether an element exists and computing the intersection, union, or difference of several sets. An example is worth a thousand words:
$ redis-cli sadd myset 1
(integer) 1
$ redis-cli sadd myset 2
(integer) 1
$ redis-cli sadd myset 3
(integer) 1
$ redis-cli smembers myset
1. 3
2. 1
3. 2
We added three elements to the set and asked Redis to return all of them. As you can see, they are not ordered.
Now let's check to see if an element exists:
$ redis-cli sismember myset 3
(integer) 1
$ redis-cli sismember myset 30
(integer) 0
"3" is a member of the set, while "30" is not. Sets are especially well suited to expressing relationships between objects. For example, Redis sets make it easy to implement tags.
A simple way to model this is to keep, for every object you want to tag, a set of the tag IDs associated with it, and for every existing tag, a set of the IDs of the objects that carry it.
For example, if our news with ID 1000 is tagged with the four tags 1, 2, 5 and 77, we can set up the following sets:
$ redis-cli sadd news:1000:tags 1
(integer) 1
$ redis-cli sadd news:1000:tags 2
(integer) 1
$ redis-cli sadd news:1000:tags 5
(integer) 1
$ redis-cli sadd news:1000:tags 77
(integer) 1
$ redis-cli sadd tag:1:objects 1000
(integer) 1
$ redis-cli sadd tag:2:objects 1000
(integer) 1
$ redis-cli sadd tag:5:objects 1000
(integer) 1
$ redis-cli sadd tag:77:objects 1000
(integer) 1
Getting all the tags of an object is simple:
$ redis-cli smembers news:1000:tags
1. 5
2. 1
3. 77
4. 2
Other, less trivial operations are still easy to implement with the right Redis commands. For example, we may want the list of all objects that have tags 1, 2, 10, and 27 at the same time. This can be done with the SINTER command, which computes the intersection of several sets. So for this purpose all we need is:
...
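As a rough illustration of what the intersection computes, the tag sets from this section can be modeled with Python sets. The `sadd` and `sinter` helpers below are hypothetical stand-ins for the commands and run entirely in memory.

```python
sets = {}  # set keys -> Python sets of string members


def sadd(key, member):
    """Model of SADD: returns 1 if the member was added, 0 if already present."""
    members = sets.setdefault(key, set())
    if member in members:
        return 0
    members.add(member)
    return 1


def sinter(*keys):
    """Model of SINTER: a missing key behaves like an empty set."""
    groups = [sets.get(key, set()) for key in keys]
    return set.intersection(*groups) if groups else set()


# News 1000 carries tags 1, 2, 5 and 77, as in the transcript above.
for tag in ("1", "2", "5", "77"):
    sadd("news:1000:tags", tag)
    sadd(f"tag:{tag}:objects", "1000")

with_1_and_2 = sinter("tag:1:objects", "tag:2:objects")
with_all_four = sinter("tag:1:objects", "tag:2:objects",
                       "tag:10:objects", "tag:27:objects")
```

In this small dataset no object has tags 10 and 27, so the four-way intersection is empty, while the intersection of tags 1 and 2 contains news 1000.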
Other set-related commands can be found in the command reference, and quite a few of them are interesting. Be sure to look at the SORT command as well, since both Redis sets and lists are sortable.
A digression: how to get unique identifiers for strings
In the tag example we used tag IDs without saying where those IDs come from. Basically, you have to assign a unique identifier to every tag that enters the system. You also want to avoid race conditions when multiple clients try to add the same tag at the same time. And if the tag already exists you want its ID returned; otherwise a new unique ID should be created and associated with the tag.
Redis 1.4 will add the hash type. With it, associating strings with unique IDs will be trivial, but how can we solve the problem reliably today with the existing Redis commands?
Our first attempt (which is broken) might look like the following. Suppose we want a unique ID for the tag "Redis":
- To make the algorithm binary safe (these are just tags, but think of UTF-8, spaces, and so on), we take the SHA1 digest of the tag. SHA1(Redis) = b840fc02d524045429941cc15f59e41cb7be6c52.
- Check whether this tag is already associated with a unique ID, using the command GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id.
- If the GET above returns an ID, return it to the user. The tag already exists.
- Otherwise, generate a new unique ID with the command INCR next.tag.id (assume it returns 123456).
- Finally, associate the tag with the new ID using SET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456, and return the new ID to the caller.
Nice. Or rather... broken! What happens if two clients run these commands at the same time, both trying to get a unique ID for the tag "Redis"? If the timing is right, both get nil from the GET, both increment the next.tag.id key, and the tag key is set twice. One of the two clients returns the wrong ID to the caller. Fortunately, fixing the algorithm is not hard. This is the sane version:
- To make the algorithm binary safe (these are just tags, but think of UTF-8, spaces, and so on), we take the SHA1 digest of the tag. SHA1(Redis) = b840fc02d524045429941cc15f59e41cb7be6c52.
- Check whether this tag is already associated with a unique ID, using the command GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id.
- If the GET above returns an ID, return it to the user. The tag already exists.
- Otherwise, generate a new unique ID with the command INCR next.tag.id (assume it returns 123456).
- Next, associate the tag with the new ID (note the new command) using SETNX tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456. If another client was faster than this one, SETNX will not set the key. Also, SETNX returns 1 if the key was set and 0 otherwise. So there is one last step.
- If SETNX returned 1 (we set the key), return 123456 to the caller: it is our tag ID. Otherwise, run GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id and return its result to the caller.
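The corrected steps above can be sketched as a single Python function. This is a model under stated assumptions: `TinyStore` is a hypothetical in-memory stand-in that implements only GET, INCR, and SETNX, and `tag_id` is an illustrative name, not an existing API.

```python
import hashlib


class TinyStore:
    """Toy in-memory model of the three commands the algorithm needs."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)  # None plays the role of nil

    def incr(self, key):
        value = int(self.data.get(key, "0")) + 1
        self.data[key] = str(value)
        return value

    def setnx(self, key, value):
        """Set the key only if it does not exist; 1 on success, 0 otherwise."""
        if key in self.data:
            return 0
        self.data[key] = str(value)
        return 1


def tag_id(store, tag):
    """Return the unique ID for a tag, creating it if needed."""
    digest = hashlib.sha1(tag.encode("utf-8")).hexdigest()
    key = f"tag:{digest}:id"
    existing = store.get(key)
    if existing is not None:
        return existing                   # the tag already exists
    candidate = str(store.incr("next.tag.id"))
    if store.setnx(key, candidate) == 1:
        return candidate                  # we won: our ID is the tag's ID
    return store.get(key)                 # another client won: use its ID


store = TinyStore()
first = tag_id(store, "redis")
again = tag_id(store, "redis")
other = tag_id(store, "nosql")
```

When a client loses the SETNX race, the ID it obtained from INCR is simply never used. The resulting gap in the ID sequence is harmless, because the IDs only need to be unique.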
Sorted sets
Sets are a very frequently used data type, but they are a bit too unordered for many problems, so Redis 1.2 introduced sorted sets. A sorted set is very similar to a set, also a collection of binary-safe strings, but this time every element has an associated score, and an operation similar to LRANGE returns elements in order. That operation works only on sorted sets: it is the ZRANGE command.
In a sense, sorted sets are the Redis equivalent of indexes in the SQL world. In the reddit.com example above, for instance, we never said how to generate the front page, where news is ranked by a combination of user votes and time. We will see how sorted sets solve this problem, but it is better to start with something simpler that shows how this advanced data type works. Let's add a few hackers, using their year of birth as the score.
$ redis-cli zadd hackers 1940 "Alan Kay"
(integer) 1
$ redis-cli zadd hackers 1953 "Richard Stallman"
(integer) 1
$ redis-cli zadd hackers 1965 "Yukihiro Matsumoto"
(integer) 1
$ redis-cli zadd hackers 1916 "Claude Shannon"
(integer) 1
$ redis-cli zadd hackers 1969 "Linus Torvalds"
(integer) 1
$ redis-cli zadd hackers 1912 "Alan Turing"
(integer) 1
With sorted sets, returning these hackers ordered by year of birth is trivial, because they are already ordered. Sorted sets are implemented with a dual-ported data structure containing both a skip list and a hash table, so adding an element has time complexity O(log(N)). That is fine, and when we ask for the elements in order Redis has no work left to do, because they are already sorted:
$ redis-cli zrange hackers 0 -1
1. Alan Turing
2. Claude Shannon
3. Alan Kay
4. Richard Stallman
5. Yukihiro Matsumoto
6. Linus Torvalds
Did you know that Linus is younger than Yukihiro?
Anyway, suppose we want these elements in the opposite order. This time we use ZREVRANGE instead of ZRANGE:
$ redis-cli zrevrange hackers 0 -1
1. Linus Torvalds
2. Yukihiro Matsumoto
3. Richard Stallman
4. Alan Kay
5. Claude Shannon
6. Alan Turing
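The idea of pairing a hash table with an always-ordered structure can be sketched in Python. In this toy model a dict maps each member to its score and a list is kept sorted with `bisect`. The class name `TinySortedSet` is an assumption, and a Python list makes insertion O(N), so this shows the behavior only, not the O(log(N)) cost.

```python
import bisect


class TinySortedSet:
    """Toy model: dict for member -> score, plus a list sorted by (score, member)."""

    def __init__(self):
        self.scores = {}
        self.ordered = []

    def zadd(self, score, member):
        """Add or re-score a member; returns 1 if new, 0 if updated."""
        is_new = member not in self.scores
        if not is_new:
            old_entry = (self.scores[member], member)
            self.ordered.pop(bisect.bisect_left(self.ordered, old_entry))
        self.scores[member] = score
        bisect.insort(self.ordered, (score, member))
        return 1 if is_new else 0

    @staticmethod
    def _window(entries, start, stop):
        # Inclusive indexes; negative values count from the tail.
        n = len(entries)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        if start > stop or start >= n:
            return []
        return entries[start:min(stop, n - 1) + 1]

    def zrange(self, start, stop):
        return [member for _, member in self._window(self.ordered, start, stop)]

    def zrevrange(self, start, stop):
        return [member for _, member in self._window(self.ordered[::-1], start, stop)]


hackers = TinySortedSet()
for year, name in [(1940, "Alan Kay"), (1953, "Richard Stallman"),
                   (1965, "Yukihiro Matsumoto"), (1916, "Claude Shannon"),
                   (1969, "Linus Torvalds"), (1912, "Alan Turing")]:
    hackers.zadd(year, name)

oldest_first = hackers.zrange(0, -1)
youngest_first = hackers.zrevrange(0, -1)
```

Reads never sort anything: the order is maintained at insertion time, which is the property the text describes.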
An important note: sorted sets have only one "default" ordering, but you can still use the SORT command to order a sorted set differently (although this time the server spends CPU on it). An alternative, when several orderings are needed, is to add every element to several sorted sets at the same time.
Range operations
Sorted sets can do more than this: they can operate on ranges. For example, let's get everyone born up to 1950. We use the ZRANGEBYSCORE command:
$ redis-cli zrangebyscore hackers -inf 1950
1. Alan Turing
2. Claude Shannon
3. Alan Kay
We asked Redis to return all elements with a score between negative infinity and 1950 (both extremes are included).
You can also remove ranges of elements. For example, let's remove from the sorted set all hackers born between 1940 and 1960.
$ redis-cli zremrangebyscore hackers 1940 1960
(integer) 2
ZREMRANGEBYSCORE is not the prettiest command name, but it is very useful, and it returns the number of removed elements.
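The range operations in this section can be sketched with binary search over a list that is already sorted by score. The two functions below are hypothetical helpers that work on a plain Python list of (score, member) pairs, with both bounds inclusive; they model the commands and do not call Redis.

```python
import bisect


def score_bounds(entries, low, high):
    """Return the slice indexes covering every score in [low, high]."""
    scores = [score for score, _ in entries]
    return bisect.bisect_left(scores, low), bisect.bisect_right(scores, high)


def zrangebyscore(entries, low, high):
    """Model of ZRANGEBYSCORE on a list sorted by score."""
    first, last = score_bounds(entries, low, high)
    return [member for _, member in entries[first:last]]


def zremrangebyscore(entries, low, high):
    """Model of ZREMRANGEBYSCORE: delete the range, return how many were removed."""
    first, last = score_bounds(entries, low, high)
    del entries[first:last]
    return last - first


hackers = [(1912, "Alan Turing"), (1916, "Claude Shannon"), (1940, "Alan Kay"),
           (1953, "Richard Stallman"), (1965, "Yukihiro Matsumoto"),
           (1969, "Linus Torvalds")]

born_up_to_1950 = zrangebyscore(hackers, float("-inf"), 1950)
removed = zremrangebyscore(hackers, 1940, 1960)
remaining = [member for _, member in hackers]
```

Because the data is ordered by score, both operations only need to locate the two ends of the range.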
Back to the Reddit example
Finally, back to the Reddit example. We now have a decent plan, based on sorted sets, for generating the home page. One sorted set holds the news of the last few days (old news is removed from time to time with ZREMRANGEBYSCORE). A background job reads all the elements of this sorted set, computes a score for each one from user votes and the age of the news, and then fills a sorted set named reddit.home.page with the news IDs and their scores. To show the home page we only need a lightning-fast call to ZRANGE.
From time to time we also remove old news from the reddit.home.page sorted set, so that the system always works against a limited set of news.
Updating the scores of a sorted set
One last note before ending this guide. The scores of a sorted set can be updated at any time. Calling ZADD on an element that already belongs to the sorted set updates its score (and position) in O(log(N)) time, so sorted sets remain suitable even when there are many updates.
This guide is far from complete. It covers only the basics for getting started with Redis; read the command reference for further details.
[Reprint] 15 minutes introduction of REDIS data structure