A 15-Minute Introduction to Redis Data Types


This is a translation of the official Redis document "A fifteen minute introduction to Redis data types". As the title says, its purpose is to give a beginner a working understanding of the Redis data types in about 15 minutes.

Redis is a key/value NoSQL database system characterized by high performance and persistent storage, and it is well suited to highly concurrent applications. It started late but has developed rapidly and has been adopted by many large organizations, such as GitHub.

The translation is offered for interested readers who want a quick introduction to the Redis data types. The original page was bilingual (English and Chinese); if you notice omissions, please leave a message. Some keywords are left untranslated to make the text easier to read.

—————————————————————————————————————

You may already know that Redis is not a plain key-value store. It is actually a data structure server that supports different kinds of values. In other words, the value a key points to does not have to be a string. The following data types are available as values:

1. Binary-safe strings.

2. Lists of binary-safe strings.

3. Sets of binary-safe strings, that is, collections of unique, unordered elements. You can think of a set as a Ruby hash in which every key is an element and every value is 'true'.

4. Sorted sets of binary-safe strings, similar to sets, but with every element associated with a floating-point number called the score. The elements are kept sorted by score. You can think of a sorted set as a Ruby hash in which the key is the element and the value is the score, except that the elements are always held in score order, with no extra sorting step.

Redis Keys

Redis keys are binary safe, which means that any binary sequence can be used as a key, from a simple string like "foo" to the contents of a JPEG file. The empty string is also a valid key.

A few rules about keys:

1. Very long keys are not a good idea. A 1024-byte key, for instance, is a bad choice not only because of the memory it uses, but also because looking the key up in the dataset may require several costly key comparisons.

2. Very short keys are often not a good idea either. There is little point in writing "u:1000:pwd" instead of "user:1000:password": the latter is easier to read, and the extra space is small compared with the space used by the key object and the value object themselves. Of course, nobody stops you from using shorter keys if you really need to save a little space.

3. Try to stick to a schema. For instance, "object-type:id:field" is a good idea, as in "user:1000:password". I like to use dots for multi-word fields, as in "comment:1234:reply.to".
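The naming schema from rule 3 can be captured in a small helper. This is only a minimal Python sketch: the function name make_key is hypothetical and is not part of Redis or of any client library.

```python
def make_key(object_type: str, object_id: int, field: str) -> str:
    """Build a key that follows the "object-type:id:field" schema."""
    return f"{object_type}:{object_id}:{field}"


print(make_key("user", 1000, "password"))      # user:1000:password
print(make_key("comment", 1234, "reply.to"))   # comment:1234:reply.to
```

Building every key through one function keeps the schema consistent across an application.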

String Type

This is the simplest Redis type. If you use only this type, Redis behaves like a memcached server with persistence (note: memcached keeps data only in memory, so the data is lost when the server restarts).

Let's play a bit with the string type:

$ redis-cli set mykey "my binary safe value"
OK
$ redis-cli get mykey
my binary safe value

As you can see, the SET and GET commands are the usual way to set and retrieve a string value.

The value can be any kind of string (including binary data); for instance, you can store a JPEG image under a key. According to the original document, a value cannot be larger than 1 GB; current Redis versions document a limit of 512 MB.

Although strings are the basic value type of Redis, you can still do some interesting things with them. For example, atomic increments:

$ redis-cli set counter 100
OK
$ redis-cli incr counter
(integer) 101
$ redis-cli incr counter
(integer) 102
$ redis-cli incrby counter 10
(integer) 112

The INCR command parses the string value as an integer, increments it by one, and stores the result as the new string value. There are similar commands: INCRBY, DECR, and DECRBY. Internally they are all the same command, acting in slightly different ways.

What does it mean that INCR is atomic? It means that even when multiple clients issue INCR against the same key, they never end up in a race condition. For instance, the following can never happen: client 1 and client 2 both read "10" at the same time, both increment it to 11, and both set the new value to 11. The final value will always be 12, because the read-increment-set operation is performed while no other client is executing a command.
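The difference between the racy sequence and an atomic increment can be illustrated without a Redis server. The following is a pure-Python sketch under stated assumptions: the dictionaries store and racy stand in for the Redis keyspace, and a threading.Lock stands in for the fact that Redis runs one command at a time. The function incr is a hypothetical model, not the Redis implementation.

```python
import threading

store = {"counter": "10"}   # stand-in for the Redis keyspace
lock = threading.Lock()     # stand-in for "one command at a time"


def incr(key: str) -> int:
    """Read, increment and write back as one indivisible step."""
    with lock:  # no other caller can run between the read and the write
        value = int(store.get(key, "0")) + 1
        store[key] = str(value)
        return value


# The race that INCR rules out, replayed by hand: both clients read 10
# before either one writes, so one increment is lost.
racy = {"counter": "10"}
seen_by_client_1 = int(racy["counter"])
seen_by_client_2 = int(racy["counter"])
racy["counter"] = str(seen_by_client_1 + 1)
racy["counter"] = str(seen_by_client_2 + 1)
print(racy["counter"])   # 11, although two increments were issued

# With the atomic version, two concurrent clients always end up at 12.
threads = [threading.Thread(target=incr, args=("counter",)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(store["counter"])  # 12
```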

Another interesting operation on strings is the GETSET command, which does just what its name says: it sets a new value for a key and returns the old value. Why is this useful? Suppose your system increments a Redis key with INCR every time a new visitor arrives, and you want to collect this figure once every hour. You can GETSET the key, assigning it the new value 0 and reading the old value back.
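The hourly collection pattern can be sketched with a small in-memory model. Everything here is an assumption made for illustration: the dictionary store replaces the Redis keyspace, the functions incr and getset only mimic the commands of the same name, and the key name "visits" is hypothetical. The model is single-threaded, so it shows the semantics but not the atomicity.

```python
store = {}   # stand-in for the Redis keyspace


def incr(key):
    """Model of INCR: a missing key counts as 0."""
    store[key] = str(int(store.get(key, "0")) + 1)
    return int(store[key])


def getset(key, new_value):
    """Model of GETSET: store new_value, return the previous value (None if unset)."""
    old = store.get(key)
    store[key] = str(new_value)
    return old


for _ in range(42):            # 42 visitors arrive during the hour
    incr("visits")

hourly_total = getset("visits", 0)   # read the total and reset it in one step
print(hourly_total)      # 42
print(store["visits"])   # 0
```

Because the read and the reset happen in one command, no visit counted between them can be lost.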

List Type

To explain the list data type properly, it helps to start with a little theory, because the term "list" is often used improperly in the information technology community. For instance, "Python lists" are not what the name suggests (linked lists); they are actually arrays (the same data type is called Array in Ruby).

In general terms, a list is just a sequence of ordered elements: 10,20,1,2,3 is a list. However, a list implemented with an array has very different properties from a list implemented with a linked list.

Redis lists are implemented as linked lists. This means that even if a list holds millions of elements, adding a new element at the head or at the tail takes constant time. Adding a new element with the LPUSH command to the head of a ten-element list is as fast as adding it to the head of a list with ten million elements.

So what is the downside? Accessing an element by index is very fast in a list implemented with an array, and not so fast in a list implemented with a linked list.

Redis lists are implemented with linked lists because, for a database system, it is crucial to be able to add elements to a very long list very quickly. Another strong advantage, as you will see in a moment, is that a Redis list can be kept at a constant length in constant time.
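The trade-off can be tried out in Python, where collections.deque has the properties described here: appends at either end take roughly constant time, while indexed access in the middle slows down as the container grows. The deque is used only as a stand-in and says nothing about how Redis is implemented internally.

```python
from collections import deque

small = deque(range(10))
large = deque(range(1_000_000))

# Adding at the head costs about the same whatever the length (like LPUSH).
small.appendleft("new")
large.appendleft("new")

# Adding at the tail is cheap as well (like RPUSH).
large.append("last")

print(len(small), len(large))   # 11 1000002

# Reaching an element in the middle by index means walking in from one end,
# which is where an array-backed list would be faster.
middle = large[len(large) // 2]

# A capped deque stays at a fixed length: pushing at the head of a full
# deque silently drops the element at the tail.
recent = deque(maxlen=3)
for item in ("a", "b", "c", "d"):
    recent.appendleft(item)
print(list(recent))   # ['d', 'c', 'b']
```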

Getting Started with Redis Lists

The LPUSH command adds a new element to the left (the head) of a list, while the RPUSH command adds a new element to the right (the tail). Finally, the LRANGE command extracts a range of elements from a list:

$ redis-cli rpush messages "Hello how are you?"
OK
$ redis-cli rpush messages "Fine. I'm having fun with Redis"
OK
$ redis-cli rpush messages "I should look into this NOSQL thing ASAP"
OK
$ redis-cli lrange messages 0 2
1. Hello how are you?
2. Fine. I'm having fun with Redis
3. I should look into this NOSQL thing ASAP

Note that LRANGE takes two indexes, the first and the last element of the range to return. Both indexes can be negative, which tells Redis to count from the end of the list: -1 is the last element, -2 is the penultimate element, and so on.
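The index rules are easy to get wrong, because both ends are included and negative values count from the tail. Here is a small Python model of those rules, assuming a plain Python list in place of a Redis list; the function lrange is a hypothetical helper that only mimics the indexing behaviour described above.

```python
def lrange(items, start, stop):
    """Return items[start..stop] with both ends included, LRANGE-style."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)   # count from the tail, clamp at the head
    if stop < 0:
        stop = n + stop
    if stop < 0 or start > stop:
        return []
    return items[start:stop + 1]    # +1 because the last index is included


messages = [
    "Hello how are you?",
    "Fine. I'm having fun with Redis",
    "I should look into this NOSQL thing ASAP",
]

print(lrange(messages, 0, 2) == messages)   # True: the whole list
print(lrange(messages, -1, -1))             # only the last message
print(lrange(messages, -2, -1))             # the last two messages
```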

As you can guess from the example above, lists can be used, for instance, to implement a chat system. Another use is as queues for routing messages between different processes. The key point is that you can use Redis lists whenever you need to access data in the same order in which it was added. This requires no SQL ORDER BY operation, it is very fast, and it scales to millions of elements even on a toy Linux box.

For instance, in a ranking system like the one used by the social news site reddit.com, you can add every newly submitted link to a list and use LRANGE to paginate the results in a trivial way.

In a blog engine, you can keep one list per post and push the blog comments onto it, and so on.

Pushing IDs Instead of the Actual Data into Redis Lists

In the example above we pushed our "objects" (simple messages, in this case) directly into the Redis list. This is often not the way to go, because objects can be referenced multiple times: in a list that preserves their chronological order, in a set that records their category, in yet another list if needed, and so on.

Let's go back to the reddit.com example. A more reliable way to add user-submitted links (news) to the list is the following:

$ redis-cli incr next.news.id
(integer) 1
$ redis-cli set news:1:title "Redis is simple"
OK
$ redis-cli set news:1:url "http://code.google.com/p/redis"
OK
$ redis-cli lpush submitted.news 1
OK

We obtained a unique incremental ID for our news object simply by incrementing a key, then used this ID to create the object by setting one key for every field of the object. Finally, the ID of the new object was pushed onto the submitted.news list.
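The same sequence can be wrapped in one function. This is a sketch against an in-memory model rather than a real server: the dictionaries strings and lists stand in for Redis string keys and list keys, submit_news is a hypothetical helper, and the second news item and its URL are made up for the example.

```python
strings = {}   # stand-in for Redis string keys
lists = {}     # stand-in for Redis list keys


def incr(key):
    """Model of INCR: a missing key counts as 0."""
    strings[key] = str(int(strings.get(key, "0")) + 1)
    return int(strings[key])


def submit_news(title, url):
    """Store one news object field by field and push only its ID to the list."""
    news_id = incr("next.news.id")                   # unique incremental ID
    strings[f"news:{news_id}:title"] = title         # one key per field
    strings[f"news:{news_id}:url"] = url
    lists.setdefault("submitted.news", []).insert(0, str(news_id))  # like LPUSH
    return news_id


first = submit_news("Redis is simple", "http://code.google.com/p/redis")
second = submit_news("Another story", "http://example.com/story")

print(lists["submitted.news"])   # ['2', '1']: newest first

# The list holds IDs only; the fields are looked up by key when needed.
titles = [strings[f"news:{i}:title"] for i in lists["submitted.news"]]
print(titles)                    # ['Another story', 'Redis is simple']
```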

This is just the beginning. You can read about all the list-related commands in the command reference. You can remove elements, rotate lists, get and set elements by index, and of course retrieve the length of a list with LLEN.

Redis Sets

Redis sets are unordered collections of binary-safe strings. The SADD command adds a new element to a set. Many other set operations are available as well, such as testing whether an element exists and computing the intersection, union, or difference of multiple sets. An example is worth a thousand words:

$ redis-cli sadd myset 1
(integer) 1
$ redis-cli sadd myset 2
(integer) 1
$ redis-cli sadd myset 3
(integer) 1
$ redis-cli smembers myset
1. 3
2. 1
3. 2

I added three elements to the set and asked Redis to return all of them. As you can see, they come back in no particular order.

Now let's check whether a given element exists:

$ redis-cli sismember myset 3
(integer) 1
$ redis-cli sismember myset 30
(integer) 0

"3" is a member of the set, while "30" is not. Sets are very good at expressing relations between objects. For instance, tagging is easy to implement with Redis sets.

A simple way to model this is to keep, for every object you want to tag, a set containing the IDs of its tags, and, for every existing tag, a set containing the IDs of the objects that carry it.

For instance, if our news item with ID 1000 is tagged with tags 1, 2, 5 and 77, we can create the following sets:

$ redis-cli sadd news:1000:tags 1
(integer) 1
$ redis-cli sadd news:1000:tags 2
(integer) 1
$ redis-cli sadd news:1000:tags 5
(integer) 1
$ redis-cli sadd news:1000:tags 77
(integer) 1
$ redis-cli sadd tag:1:objects 1000
(integer) 1
$ redis-cli sadd tag:2:objects 1000
(integer) 1
$ redis-cli sadd tag:5:objects 1000
(integer) 1
$ redis-cli sadd tag:77:objects 1000
(integer) 1

Getting all the tags of a given object is trivial:

$ redis-cli smembers news:1000:tags
1. 5
2. 1
3. 77
4. 2

Some operations that look less trivial are still easy to implement with the right Redis commands. For instance, we may want the list of all objects that carry tags 1, 2, 10, and 27 at the same time. This can be done with the SINTER command, which computes the intersection of several sets. So all we need is:

$ redis-cli sinter tag:1:objects tag:2:objects tag:10:objects tag:27:objects
... no result in our dataset composed of just one object ...
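The two-way tagging model and the intersection query can be reproduced with Python sets. This is an in-memory sketch, not a Redis client: the dictionary sets replaces the keyspace, sadd and sinter only mimic the commands of the same name, and tag_object is a hypothetical helper.

```python
sets = {}   # stand-in for Redis set keys


def sadd(key, member):
    """Model of SADD: return 1 if the member was added, 0 if already present."""
    members = sets.setdefault(key, set())
    if member in members:
        return 0
    members.add(member)
    return 1


def sinter(*keys):
    """Model of SINTER: a missing key behaves like an empty set."""
    groups = [sets.get(key, set()) for key in keys]
    return set.intersection(*groups) if groups else set()


def tag_object(object_id, tag_id):
    """Record the relation in both directions, as described in the text."""
    sadd(f"news:{object_id}:tags", tag_id)
    sadd(f"tag:{tag_id}:objects", object_id)


for tag in (1, 2, 5, 77):
    tag_object(1000, tag)

print(sorted(sets["news:1000:tags"]))               # [1, 2, 5, 77]
print(sinter("tag:1:objects", "tag:2:objects"))     # {1000}
print(sinter("tag:1:objects", "tag:2:objects",
             "tag:10:objects", "tag:27:objects"))   # set()
```

The last query is empty for the same reason as in the transcript: no object carries tags 10 and 27.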

Other set-related commands are described in the command reference, and there are quite a few interesting ones. Also be sure to look at the SORT command: both Redis sets and lists are sortable.

A Digression: How to Get Unique Identifiers for Strings

In the tagging example we used tag IDs without saying where those IDs come from. Basically, every tag you add to the system needs a unique identifier. You also want to avoid race conditions when multiple clients try to add the same tag at the same time. Moreover, if the tag already exists you want its existing ID returned; otherwise a new unique ID should be created and associated with the tag.

Redis 1.4 will add the hash type. With it, associating strings with unique IDs will be trivial, but how can we do this reliably today with the existing Redis commands?

Our first attempt (which is broken) might look like the following. Suppose we want a unique ID for the tag "redis":

1. To make the algorithm binary safe (they are just tags, but think of UTF-8, spaces, and so on) we take the SHA1 digest of the tag. SHA1(redis) = b840fc02d524045429941cc15f59e41cb7be6c52.

2. Check whether this tag is already associated with a unique ID, using the command GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id.

3. If the GET above returns an ID, return it to the user. The tag already exists.

4. Otherwise, generate a new unique ID with the INCR next.tag.id command (assume it returns 123456).

5. Finally, associate the tag with the new ID using SET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456, and return the new ID to the caller.

Nice. Or better, broken! What happens if two clients run these steps at the same time, both trying to get the unique ID for the tag "redis"? If the timing is right, both get nil from the GET operation, both increment the next.tag.id key, and both set the tag key, so the key is set twice. One of the two clients returns the wrong ID to its caller. Fortunately this algorithm is not hard to fix. Here is the sane version:

1. To make the algorithm binary safe (they are just tags, but think of UTF-8, spaces, and so on) we take the SHA1 digest of the tag. SHA1(redis) = b840fc02d524045429941cc15f59e41cb7be6c52.

2. Check whether this tag is already associated with a unique ID, using the command GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id.

3. If the GET above returns an ID, return it to the user. The tag already exists.

4. Otherwise, generate a new unique ID with the INCR next.tag.id command (assume it returns 123456).

5. Now associate the tag with the new ID (note that a new command is used): SETNX tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456. If another client was faster than this one, SETNX will not set the key. SETNX also returns 1 if the key was set and 0 otherwise. So there is one last step.

6. If SETNX returned 1 (the key was set), return 123456 to the caller: it is our tag ID. Otherwise, run GET tag:b840fc02d524045429941cc15f59e41cb7be6c52:id and return its result to the caller.
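The six steps above can be sketched in Python against an in-memory model. Several things here are assumptions made for illustration: the dictionary store replaces the Redis keyspace, a threading.Lock around each modelled command stands in for Redis executing one command at a time, and get, incr, setnx and tag_id are hypothetical functions rather than a client API.

```python
import hashlib
import threading

store = {}                  # stand-in for the Redis keyspace
_lock = threading.Lock()    # each modelled command runs as one atomic step


def get(key):
    with _lock:
        return store.get(key)


def incr(key):
    with _lock:
        store[key] = str(int(store.get(key, "0")) + 1)
        return int(store[key])


def setnx(key, value):
    """Set key only if it does not exist yet; return 1 on success, 0 otherwise."""
    with _lock:
        if key in store:
            return 0
        store[key] = str(value)
        return 1


def tag_id(tag):
    digest = hashlib.sha1(tag.encode("utf-8")).hexdigest()   # step 1
    key = f"tag:{digest}:id"
    existing = get(key)                                      # step 2
    if existing is not None:                                 # step 3
        return int(existing)
    new_id = incr("next.tag.id")                             # step 4
    if setnx(key, new_id) == 1:                              # steps 5 and 6
        return new_id
    return int(get(key))    # another client won the race: use its ID


results = []


def worker():
    results.append(tag_id("redis"))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(set(results)))   # 1: every caller received the same ID
```

A losing client may still have consumed a value of next.tag.id, so IDs can have gaps, but no two callers ever disagree about the ID of a tag.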

Sorted Sets

Sets are a very handy data type, but they are a bit too unordered to fit many kinds of problems. For this reason Redis 1.2 introduced sorted sets. They are very similar to sets, being collections of binary-safe strings, but this time every element has an associated score, and an operation similar to LRANGE returns the elements in order. That operation works only on sorted sets, and it is the ZRANGE command.

In a sense, sorted sets are the Redis equivalent of indexes in the SQL world. For instance, in the reddit.com example above we never said how to generate the actual home page, with news ranked by user votes and time. We will see how sorted sets solve this problem, but it is better to start with something simpler that shows how this advanced data type works. Let's add a few hackers, using their year of birth as the score:

$ redis-cli zadd hackers 1940 "Alan Kay"
(integer) 1
$ redis-cli zadd hackers 1953 "Richard Stallman"
(integer) 1
$ redis-cli zadd hackers 1965 "Yukihiro Matsumoto"
(integer) 1
$ redis-cli zadd hackers 1916 "Claude Shannon"
(integer) 1
$ redis-cli zadd hackers 1969 "Linus Torvalds"
(integer) 1
$ redis-cli zadd hackers 1912 "Alan Turing"
(integer) 1

With a sorted set, returning these hackers ordered by year of birth is trivial, because they are already sorted. Sorted sets are implemented with a dual-ported data structure containing both a skip list and a hash table, so adding an element is an O(log(N)) operation. That is fine, and when we ask for the elements in order Redis has no work left to do, since they are already sorted:

$ redis-cli zrange hackers 0 -1
1. Alan Turing
2. Claude Shannon
3. Alan Kay
4. Richard Stallman
5. Yukihiro Matsumoto
6. Linus Torvalds

Did you know that Linus is younger than Yukihiro?

Now suppose I want the elements in the opposite order. This time I use ZREVRANGE instead of ZRANGE:

$ redis-cli zrevrange hackers 0 -1
1. Linus Torvalds
2. Yukihiro Matsumoto
3. Richard Stallman
4. Alan Kay
5. Claude Shannon
6. Alan Turing
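The idea of pairing a lookup table with an always-sorted sequence can be modelled in a few lines of Python. This toy class is an assumption-laden sketch: it uses a dict plus a sorted list maintained with bisect instead of a skip list, so insertion here is O(N) rather than O(log(N)), and the class and helper names are hypothetical.

```python
import bisect


def inclusive_slice(seq, start, stop):
    """Slice with both ends included; negative indexes count from the end."""
    n = len(seq)
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    if stop < 0 or start > stop:
        return []
    return seq[start:stop + 1]


class SortedSet:
    def __init__(self):
        self.scores = {}     # member -> score, for fast lookup
        self.ordered = []    # (score, member) pairs kept in ascending order

    def zadd(self, score, member):
        """Insert a member or update its score; return 1 if new, else 0."""
        is_new = member not in self.scores
        if not is_new:
            self.ordered.remove((self.scores[member], member))
        self.scores[member] = score
        bisect.insort(self.ordered, (score, member))
        return 1 if is_new else 0

    def zrange(self, start, stop):
        return [m for _, m in inclusive_slice(self.ordered, start, stop)]

    def zrevrange(self, start, stop):
        return [m for _, m in inclusive_slice(self.ordered[::-1], start, stop)]


hackers = SortedSet()
for year, name in [(1940, "Alan Kay"), (1953, "Richard Stallman"),
                   (1965, "Yukihiro Matsumoto"), (1916, "Claude Shannon"),
                   (1969, "Linus Torvalds"), (1912, "Alan Turing")]:
    hackers.zadd(year, name)

print(hackers.zrange(0, -1))      # oldest first, no sorting at read time
print(hackers.zrevrange(0, 1))    # ['Linus Torvalds', 'Yukihiro Matsumoto']
```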

An important note: a sorted set has just one "default" ordering, but you can still use the SORT command against a sorted set to obtain a different ordering (at the cost of server CPU time this time). An alternative way to get multiple orderings is to add every element to several sorted sets at once.

Operating on Ranges

Sorted sets can do more than this: they can operate on ranges. For instance, let's get all the people born up to the year 1950. We use the ZRANGEBYSCORE command to do it:

$ redis-cli zrangebyscore hackers -inf 1950
1. Alan Turing
2. Claude Shannon
3. Alan Kay

We asked Redis to return all the elements with a score between negative infinity and 1950 (both extremes included).

It is also possible to remove ranges of elements. For instance, let's remove from the sorted set all the hackers born between 1940 and 1960:

$ redis-cli zremrangebyscore hackers 1940 1960
(integer) 2

ZREMRANGEBYSCORE is not the best command name, but it is very useful, and it returns the number of removed elements.
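Both range operations reduce to finding two positions in a sequence that is already sorted by score. The following Python sketch shows this with binary search from the bisect module. It models only the inclusive-range behaviour described above; the list hackers and the three functions are hypothetical stand-ins, not Redis code.

```python
import bisect
import math

# (score, member) pairs kept in ascending score order
hackers = sorted([
    (1940, "Alan Kay"), (1953, "Richard Stallman"),
    (1965, "Yukihiro Matsumoto"), (1916, "Claude Shannon"),
    (1969, "Linus Torvalds"), (1912, "Alan Turing"),
])


def bounds(entries, low, high):
    """Index range [lo, hi) of the entries with low <= score <= high."""
    scores = [score for score, _ in entries]
    return bisect.bisect_left(scores, low), bisect.bisect_right(scores, high)


def zrangebyscore(entries, low, high):
    lo, hi = bounds(entries, low, high)
    return [member for _, member in entries[lo:hi]]


def zremrangebyscore(entries, low, high):
    """Remove the matching entries and return how many were removed."""
    lo, hi = bounds(entries, low, high)
    del entries[lo:hi]
    return hi - lo


print(zrangebyscore(hackers, -math.inf, 1950))
# ['Alan Turing', 'Claude Shannon', 'Alan Kay']
print(zremrangebyscore(hackers, 1940, 1960))   # 2
print([member for _, member in hackers])
# ['Alan Turing', 'Claude Shannon', 'Yukihiro Matsumoto', 'Linus Torvalds']
```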

Back to the Reddit Example

Finally, back to the Reddit example. We now have a decent plan for generating the home page with a sorted set. One sorted set holds all the news from the last few days (old entries are removed from time to time with ZREMRANGEBYSCORE). A background job reads all the elements from this sorted set, computes a score for each one from its user votes and the age of the news, and then populates the reddit.home.page sorted set with the news IDs and their scores. To show the home page, all we need is a lightning-fast call to ZRANGE.

From time to time we also remove the oldest news from the reddit.home.page sorted set, so that the system always works against a limited set of news.

Updating the Scores of a Sorted Set

One last note before closing this guide. The scores of a sorted set can be updated at any time. Calling ZADD for an element that is already in the sorted set updates its score (and its position) in O(log(N)) time, so sorted sets remain suitable even when there are many updates.
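The effect of re-adding an existing element can be seen in a tiny Python model. It is a sketch only: the dictionary scores stands in for one sorted set, the member names and score values are made up, and the ordering is computed on read, whereas Redis keeps the elements ordered on every write.

```python
scores = {"news:1": 10.0, "news:2": 7.5, "news:3": 3.0}   # member -> score


def zadd(member, score):
    """Model of ZADD: an existing member keeps one entry, with a new score."""
    is_new = member not in scores
    scores[member] = score
    return 1 if is_new else 0


def highest_first():
    """All members from highest to lowest score (ties broken by name)."""
    return sorted(scores, key=lambda m: (scores[m], m), reverse=True)


print(highest_first())         # ['news:1', 'news:2', 'news:3']
print(zadd("news:3", 12.0))    # 0: nothing was added, only the score changed
print(highest_first())         # ['news:3', 'news:1', 'news:2']
```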

This guide is far from complete; it covers only the basics needed to get started with Redis. To go further, read the command reference.

(Responsible editor: Lu Guang)
