A 15-minute introduction to Redis data structures: learning notes


The following is a translation of the official Redis document "A fifteen minute introduction to Redis data types". Its purpose is to give a beginner a basic understanding of the Redis data structures in about 15 minutes.

Redis is a NoSQL database system built around key/value pairs. It offers high performance and persistent storage, and is well suited to high-concurrency applications. Although it started late, it has developed rapidly and has been adopted by many large organizations, such as GitHub (see the "who is using it" page).

If you notice any omissions, please leave a message. Some keywords are left untranslated for easier reading.

-------------------------------------

You may already know that Redis is not a plain key-value store. It is actually a data structure server that supports different types of values. In other words, the value a key points to does not have to be a string. The following data types can all be used as values:

  • String: a binary-safe string.
  • List: a list of binary-safe strings.
  • Set: a set of binary-safe strings, in other words a collection of unique, unordered elements. You can think of it as a Ruby hash where every key is an element and every value is 'true'.
  • Sorted set: similar to a set, but each element is associated with a floating-point score, and elements are kept sorted by score. You can think of it as a Ruby hash where the key is the element and the value is the score, except that the elements are always kept in score order without any extra sorting operation.

Redis keys

Redis keys are binary safe, which means any binary sequence can be used as a key, from a simple string like "foo" to the content of a JPEG file. The empty string is also a valid key.

A few rules about keys:

  • Very long keys are not a good idea. A 1024-byte key, for example, is bad not only because it consumes memory but also because looking it up in the dataset requires costly key comparisons.
  • Very short keys are usually not a good idea either. There is little point in writing "u:1000:pwd" when you can write "user:1000:password": the latter is easier to read, and the extra space is small compared with the space used by the key object and the value object. Of course, nobody stops you from using shorter keys to save a little memory.
  • Try to stick to a schema. For example, "object-type:id:field" is a good convention, as in "user:1000:password". I like to use dots for multi-word field names, as in "comment:1234:reply.to".
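
The naming convention above can be captured in a tiny helper. This is only an illustrative sketch in Python; `make_key` is a hypothetical function, not part of Redis or of any client library.

```python
def make_key(object_type, object_id, field):
    """Build a key that follows the "object-type:id:field" schema."""
    return f"{object_type}:{object_id}:{field}"


# Dots separate the words of a multi-word field name.
password_key = make_key("user", 1000, "password")
reply_key = make_key("comment", 1234, "reply.to")
```

Building keys in one place makes it harder for different parts of an application to drift apart in how they name things.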

String type

This is the simplest Redis type. If you use only this type, Redis behaves like a memcached server with persistence (note: memcached keeps data only in memory, so the data is lost when the server restarts).

Let's take a look at the string type:

$ redis-cli set mykey "my binary safe value"
OK
$ redis-cli get mykey
my binary safe value

As you can see, the set and get commands are used to set and retrieve a string value.

The value can be any kind of string (including binary data); for example, you can store a JPEG image under a key. In the Redis version this guide describes, a value cannot exceed 1 GB (current releases limit a string value to 512 MB).

Although the string is the basic Redis value type, you can still perform some interesting operations on it. One example is the atomic increment:

$ redis-cli set counter 100
OK
$ redis-cli incr counter
(integer) 101
$ redis-cli incr counter
(integer) 102
$ redis-cli incrby counter 10
(integer) 112

The incr command parses the string value as an integer, adds one to it, and stores the result as the new string value. Similar commands are incrby, decr and decrby. Internally they are all the same command, acting in a slightly different way.

What does it mean that incr is atomic? It means that even when multiple clients issue incr against the same key, there is never a race condition. For example, the following can never happen: client 1 and client 2 both read "10" at the same time, both increment it to 11, and both set the new value to 11. The final value will always be 12, because while one client's read-increment-set operation is running, no other client can execute a command.
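
To make the idea of atomicity concrete, here is a minimal sketch in Python. It does not talk to Redis: `TinyStore` is a hypothetical in-memory stand-in, and its lock plays the role of the server executing one command at a time.

```python
import threading


class TinyStore:
    """In-memory stand-in for a Redis server (illustration only)."""

    def __init__(self):
        self._data = {}
        # The lock models the server running a single command at a time.
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = str(value)

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def incrby(self, key, amount=1):
        # Read, increment and write back as one indivisible step.
        with self._lock:
            new_value = int(self._data.get(key, "0")) + amount
            self._data[key] = str(new_value)
            return new_value


store = TinyStore()
store.set("counter", 10)


def worker():
    for _ in range(1000):
        store.incrby("counter")


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_value = store.get("counter")  # "4010": no increment was lost
```

If each worker instead called `get`, added one in its own code, and then called `set`, two workers could read the same value and one of the increments would be lost.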

Another interesting operation on strings is the getset command. As its name suggests, it sets a new value for the key and returns the old value. Why is this useful? Suppose your system increments a Redis key with incr every time a new visitor arrives, and you want to collect this figure once an hour. You can getset the key, assigning it 0, and read the old value back.
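
The hourly-counter idea can be sketched in a few lines of Python. This uses a plain dict rather than a Redis connection, and the `getset` function and the "visits" key are hypothetical names for illustration. Unlike the real command, this sketch is not atomic across clients.

```python
def getset(store, key, new_value):
    """Set key to new_value and return the previous value (None if absent)."""
    old_value = store.get(key)
    store[key] = new_value
    return old_value


store = {"visits": "42"}  # pretend incr has run 42 times this hour
visits_last_hour = getset(store, "visits", "0")
```

After the call, `visits_last_hour` holds the count for the hour that just ended and the counter starts again from zero.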

List type

To explain the list data type, it is better to start with a little theory, because the word "list" is often used improperly in information technology. For example, "Python lists" are not what the name may suggest (linked lists); they are actually arrays (the same data type is called Array in Ruby).

In general, a list is just a sequence of ordered elements: 10, 20, 1, 2, 3 is a list. However, a list implemented with an array and a list implemented with a linked list have very different properties.

Redis lists are implemented with linked lists. This means that even if a list holds millions of elements, adding an element at the head or at the tail takes constant time. Using lpush to add a new element at the head of a ten-element list is as fast as adding one at the head of a list with ten million elements.

What is the bad news? Accessing an element by index is extremely fast in a list implemented with an array, and not so fast in a list implemented with a linked list.

Redis lists are implemented with linked lists because, for a database system, it is crucial to be able to add elements to a very long list very quickly. Another strong advantage, as you will see in a moment, is that Redis lists can be capped at a constant length in constant time.
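
As a rough analogy in Python (not the Redis implementation), `collections.deque` also offers constant-time pushes at both ends, and its `maxlen` option gives a feel for a list held at a constant length. The variable names below are illustrative only.

```python
from collections import deque

# Constant-time pushes at both ends, like lpush and rpush.
messages = deque()
messages.append("first")        # like rpush: add at the tail
messages.append("second")
messages.appendleft("zeroth")   # like lpush: add at the head

# With maxlen set, the deque keeps a constant length: pushing at the
# head drops the element at the tail once the cap is reached.
latest = deque(maxlen=3)
for item in ["a", "b", "c", "d"]:
    latest.appendleft(item)
```

After the loop, `latest` holds only the three most recently pushed items, newest first.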

Getting started with Redis lists

The lpush command adds a new element on the left (the head) of a list, while the rpush command adds a new element on the right (the tail). Finally, the lrange command extracts a range of elements from a list.

$ redis-cli rpush messages "Hello how are you?"
OK
$ redis-cli rpush messages "Fine thanks. I'm having fun with Redis"
OK
$ redis-cli rpush messages "I should look into this NOSQL thing ASAP"
OK
$ redis-cli lrange messages 0 2
1. Hello how are you?
2. Fine thanks. I'm having fun with Redis
3. I should look into this NOSQL thing ASAP

Note that lrange takes two indexes: the first and the last element of the range to return. Either index can be negative, which tells Redis to count from the tail: -1 is the last element, -2 is the second-to-last element of the list, and so on.
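
The index rules can be sketched as a small Python function over an ordinary list. This `lrange` is a hypothetical helper written for illustration, mirroring the inclusive, negative-aware behaviour described above rather than calling Redis.

```python
def lrange(items, start, stop):
    """Return items[start..stop], both ends included; negative indexes count from the tail."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    if start > stop or start >= n:
        return []
    stop = min(stop, n - 1)  # an out-of-range stop is clamped to the last element
    return list(items[start:stop + 1])


messages = [
    "Hello how are you?",
    "Fine thanks. I'm having fun with Redis",
    "I should look into this NOSQL thing ASAP",
]
everything = lrange(messages, 0, -1)
last_two = lrange(messages, -2, -1)
```

Note the difference from Python slices: the stop index is included in the result.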

As you can guess from the example above, lists can be used to implement a chat system. Another use is as queues that route messages between different processes. The key point is that you can use Redis lists whenever you need to access data in the same order in which it was added. This requires no SQL ORDER BY operation, is very fast, and scales to millions of elements even on a toy Linux box.

For example, in a ranking system such as the social news site Reddit.com, you can add every newly submitted link to a list and use lrange to paginate the results easily.

In a blog engine, you can keep a list for every post and push the blog comments onto it, and so on.

Pushing IDs instead of the actual data onto Redis lists

In the example above we pushed the "objects" themselves (simple messages in that case) directly onto the Redis list. This is often not the best approach, because an object may be referenced several times: for example, in a list that preserves chronological order, in a set that records its category, and in other lists if needed.

Let's go back to the Reddit.com example. A more reliable way to add the links (news) submitted by users to the list is the following:

$ redis-cli incr next.news.id
(integer) 1
$ redis-cli set news:1:title "Redis is simple"
OK
$ redis-cli set news:1:url "http://code.google.com/p/redis"
OK
$ redis-cli lpush submitted.news 1
OK

By incrementing a key we easily obtain a unique auto-increment ID, and then we use this ID to create the object, setting one key for each field of the object. Finally, the ID of the new object is pushed onto the submitted.news list.
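
The same sequence of steps can be sketched in Python against a small in-memory stand-in. `TinyStore` and `submit_news` are hypothetical names, and the second URL is a placeholder; nothing here contacts a Redis server.

```python
from collections import defaultdict, deque


class TinyStore:
    """In-memory stand-in offering just incr, set and lpush (illustration only)."""

    def __init__(self):
        self.strings = {}
        self.lists = defaultdict(deque)

    def incr(self, key):
        value = int(self.strings.get(key, "0")) + 1
        self.strings[key] = str(value)
        return value

    def set(self, key, value):
        self.strings[key] = value

    def lpush(self, key, value):
        self.lists[key].appendleft(str(value))


def submit_news(store, title, url):
    # 1. Get a unique ID, 2. store one key per field, 3. push only the ID.
    news_id = store.incr("next.news.id")
    store.set(f"news:{news_id}:title", title)
    store.set(f"news:{news_id}:url", url)
    store.lpush("submitted.news", news_id)
    return news_id


store = TinyStore()
first_id = submit_news(store, "Redis is simple", "http://code.google.com/p/redis")
second_id = submit_news(store, "Another story", "http://example.com/story")
```

Because the list holds only IDs, the same news item can appear in any number of other lists or sets without its fields being copied.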

This is just the beginning. You can read about all the list-related commands in the command reference. You can remove elements, rotate lists, get and set elements by index, and use llen to get the length of a list.

Redis sets

A Redis set is an unordered collection whose elements are binary-safe strings. The sadd command adds a new element to a set. Many other set operations are available, such as testing whether an element exists and computing the intersection, union, or difference of several sets. An example:

$ redis-cli sadd myset 1
(integer) 1
$ redis-cli sadd myset 2
(integer) 1
$ redis-cli sadd myset 3
(integer) 1
$ redis-cli smembers myset
1. 3
2. 1
3. 2

I added three elements to the set and asked Redis to return all of them. As you can see, they come back in no particular order.

Now let's check whether an element exists:

$ redis-cli sismember myset 3
(integer) 1
$ redis-cli sismember myset 30
(integer) 0

"3" is a member of the set, and "30" is not. Sets are especially good for expressing relations between objects. For example, tagging is easy to implement with Redis sets.

A simple way to model it: for every object you want to tag, keep a set with the IDs of its tags, and for every existing tag, keep a set with the IDs of all the objects carrying that tag.

For example, if the news item with ID 1000 is tagged with the four tags 1, 2, 5, and 77, we can create the following sets:

$ redis-cli sadd news:1000:tags 1
(integer) 1
$ redis-cli sadd news:1000:tags 2
(integer) 1
$ redis-cli sadd news:1000:tags 5
(integer) 1
$ redis-cli sadd news:1000:tags 77
(integer) 1
$ redis-cli sadd tag:1:objects 1000
(integer) 1
$ redis-cli sadd tag:2:objects 1000
(integer) 1
$ redis-cli sadd tag:5:objects 1000
(integer) 1
$ redis-cli sadd tag:77:objects 1000
(integer) 1

Getting all the tags of an object is simple:

$ redis-cli smembers news:1000:tags
1. 5
2. 1
3. 77
4. 2

Some operations that look less trivial are still easy to implement with the right Redis command. For example, we may want the list of all objects that carry tags 1, 2, 10, and 27 at the same time. This can be done with the sinter command, which computes the intersection of several sets. So all we need is:

$ redis-cli sinter tag:1:objects tag:2:objects tag:10:objects tag:27:objects
... no result in our dataset composed of just one object ...
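
The tagging model and the intersection query can be sketched with Python's built-in sets. The `sets` dictionary stands in for the Redis keyspace, and `tag_object` and `objects_with_all_tags` are hypothetical helper names.

```python
from collections import defaultdict

# key -> set of members, standing in for Redis sets
sets = defaultdict(set)


def tag_object(news_id, tag_id):
    """Record the relation in both directions, as the sadd calls above do."""
    sets[f"news:{news_id}:tags"].add(tag_id)
    sets[f"tag:{tag_id}:objects"].add(news_id)


def objects_with_all_tags(tag_ids):
    """Intersect the per-tag object sets, like sinter; a missing tag is an empty set."""
    member_sets = [sets.get(f"tag:{t}:objects", set()) for t in tag_ids]
    return set.intersection(*member_sets) if member_sets else set()


for tag in (1, 2, 5, 77):
    tag_object(1000, tag)

tagged_1_and_2 = objects_with_all_tags([1, 2])
tagged_1_2_10_27 = objects_with_all_tags([1, 2, 10, 27])
```

The second query comes back empty for the same reason as the sinter call: no object in this small dataset carries tags 10 and 27.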

You can find the other set-related commands in the command reference. Be sure to look at the sort command as well: both Redis sets and lists are sortable.

Digression: how to obtain a unique identifier for a string

In the tag example we used tag IDs without saying where they come from. Basically, you have to assign a unique identifier to every tag that enters the system. You also want to avoid race conditions when multiple clients try to add the same tag at the same time. In addition, if a tag already exists you want its existing ID returned; otherwise a new unique identifier should be created and associated with the tag.

Redis 1.4 will add the hash type. With it, associating strings with unique IDs will be trivial, but how can we solve the problem reliably today with the existing Redis commands?

A first attempt (which is broken) might look like this. Suppose we want a unique ID for the tag "redis":

  • To make the algorithm binary safe (these are only tags, but think of UTF-8, spaces, and so on), we start by computing the SHA1 sum of the tag. SHA1(redis) = b840fc02d524045429941cc15f59e41cb7be6c52.
  • Check whether the tag is already associated with a unique ID,
    using the command get tag:b840fc02d524045429941cc15f59e41cb7be6c52:id
  • If the get above returns an ID, return it to the user. The tag already exists.
  • Otherwise, use the incr next.tag.id command to generate a new unique ID (assume it returns 123456).
  • Finally, associate the tag with the new ID,
    set tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456
    and return the new ID to the caller.

Very nice. Or rather... broken! What happens if two clients run these commands at the same time to obtain the unique ID for the tag "redis"? If the timing is right, both get nil from the get operation, both increment the next.tag.id key, and the key is incremented twice. One of the clients ends up returning the wrong ID to the caller. Fortunately, fixing the algorithm is not hard. Here is the sane version:

  • To make the algorithm binary safe (these are only tags, but think of UTF-8, spaces, and so on), we start by computing the SHA1 sum of the tag. SHA1(redis) = b840fc02d524045429941cc15f59e41cb7be6c52.
  • Check whether the tag is already associated with a unique ID,
    using the command get tag:b840fc02d524045429941cc15f59e41cb7be6c52:id
  • If the get above returns an ID, return it to the user. The tag already exists.
  • Otherwise, use the incr next.tag.id command to generate a new unique ID (assume it returns 123456).
  • Associate the tag with the new ID (note that a different command is used this time):
    setnx tag:b840fc02d524045429941cc15f59e41cb7be6c52:id 123456
    If another client was faster than this one, setnx will not set the key. In addition, setnx returns 1 if the key was set and 0 otherwise. So there is one last step.
  • If setnx returned 1 (the key was set), return 123456 to the caller: this is our tag ID. Otherwise, run get tag:b840fc02d524045429941cc15f59e41cb7be6c52:id and return the result to the caller.
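
The race-free version of the steps can be sketched in Python as follows. This is a simulation under stated assumptions: `TinyStore` is a hypothetical in-memory stand-in whose lock makes each method behave like one atomic command, and `get_tag_id` is an illustrative name, not an existing API.

```python
import hashlib
import threading


class TinyStore:
    """In-memory stand-in: each method acts as one atomic command."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def incr(self, key):
        with self._lock:
            value = int(self._data.get(key, "0")) + 1
            self._data[key] = str(value)
            return value

    def setnx(self, key, value):
        # Set only if the key does not exist yet; report 1 on success, 0 otherwise.
        with self._lock:
            if key in self._data:
                return 0
            self._data[key] = str(value)
            return 1


def get_tag_id(store, tag):
    digest = hashlib.sha1(tag.encode("utf-8")).hexdigest()
    key = f"tag:{digest}:id"
    existing = store.get(key)
    if existing is not None:
        return int(existing)           # the tag already exists
    new_id = store.incr("next.tag.id")
    if store.setnx(key, new_id) == 1:
        return new_id                  # we won: our ID is the tag's ID
    return int(store.get(key))         # another client won: use its ID


store = TinyStore()
results = []
results_lock = threading.Lock()


def worker():
    tag_id = get_tag_id(store, "redis")
    with results_lock:
        results.append(tag_id)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

However the threads interleave, every entry in `results` is the same ID. The counter may skip some values when several clients race, which is harmless because the IDs only need to be unique.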

Sorted sets

Sets are a very handy data type, but for many problems they are a bit too unsorted ;) That is why Redis 1.2 introduced sorted sets. They are very similar to sets, being collections of binary-safe strings, but this time each element has an associated score, and an operation similar to lrange returns the elements in order. That operation works only on sorted sets: it is the zrange command.

Sorted sets are, in some ways, the Redis equivalent of indexes in the SQL world. For example, in the Reddit.com example above we never said how to generate the home page from news ranked by user votes and time. We will see how sorted sets solve that problem, but it is better to start with something simpler that shows how this advanced data type works. Let's add a few hackers, using their year of birth as the "score".

$ redis-cli zadd hackers 1940 "Alan Kay"
(integer) 1
$ redis-cli zadd hackers 1953 "Richard Stallman"
(integer) 1
$ redis-cli zadd hackers 1965 "Yukihiro Matsumoto"
(integer) 1
$ redis-cli zadd hackers 1916 "Claude Shannon"
(integer) 1
$ redis-cli zadd hackers 1969 "Linus Torvalds"
(integer) 1
$ redis-cli zadd hackers 1912 "Alan Turing"
(integer) 1

With sorted sets, returning these hackers ordered by year of birth is trivial because they are already sorted. A sorted set is implemented with a dual-ported data structure containing both a skip list and a hash table, so adding an element costs O(log(N)). That is fine, and when we ask for the elements in order Redis does not have to do any work at all, since they are already sorted:

$ redis-cli zrange hackers 0 -1
1. Alan Turing
2. Claude Shannon
3. Alan Kay
4. Richard Stallman
5. Yukihiro Matsumoto
6. Linus Torvalds

Did you know that Linus is younger than Yukihiro?

Anyway, suppose I want these elements in the opposite order. This time I use zrevrange instead of zrange:

$ redis-cli zrevrange hackers 0 -1
1. Linus Torvalds
2. Yukihiro Matsumoto
3. Richard Stallman
4. Alan Kay
5. Claude Shannon
6. Alan Turing
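
To show the "kept in order on every write" idea, here is a small Python sketch. `TinySortedSet` is a hypothetical class that pairs a dict (member to score) with a list kept sorted by `bisect`. It is a swapped-in, simpler technique than a skip list: inserting into a Python list costs O(N), so only the behaviour is modelled, not the performance.

```python
import bisect


class TinySortedSet:
    """A dict (member -> score) plus a list of (score, member) pairs kept sorted."""

    def __init__(self):
        self._scores = {}
        self._ordered = []

    def zadd(self, score, member):
        is_new = member not in self._scores
        if not is_new:
            # Updating a score: remove the old pair before inserting the new one.
            old_pair = (self._scores[member], member)
            del self._ordered[bisect.bisect_left(self._ordered, old_pair)]
        self._scores[member] = score
        bisect.insort(self._ordered, (score, member))
        return 1 if is_new else 0

    @staticmethod
    def _slice(members, start, stop):
        # Inclusive range with negative indexes counted from the end.
        n = len(members)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        if start > stop or start >= n:
            return []
        return members[start:min(stop, n - 1) + 1]

    def zrange(self, start, stop):
        return self._slice([m for _, m in self._ordered], start, stop)

    def zrevrange(self, start, stop):
        return self._slice([m for _, m in reversed(self._ordered)], start, stop)


hackers = TinySortedSet()
for year, name in [
    (1940, "Alan Kay"), (1953, "Richard Stallman"), (1965, "Yukihiro Matsumoto"),
    (1916, "Claude Shannon"), (1969, "Linus Torvalds"), (1912, "Alan Turing"),
]:
    hackers.zadd(year, name)

oldest_first = hackers.zrange(0, -1)
youngest_first = hackers.zrevrange(0, -1)
```

Reading the range does no sorting at all; the work was done when each element was added.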

A very important tip: a sorted set has only one "default" ordering, but you can still call the sort command against it to get a different ordering (this time the server will spend CPU on it). An alternative way to have multiple orderings is to add every element to several sorted sets at the same time.

Operating on ranges

Sorted sets can do more than this: they can operate on ranges. For example, let's get everyone born up to and including 1950. We use the zrangebyscore command:

$ redis-cli zrangebyscore hackers -inf 1950
1. Alan Turing
2. Claude Shannon
3. Alan Kay

We asked Redis to return all the elements with a score between negative infinity and 1950 (both extremes included).

Elements in a range can also be removed. For example, let's remove from the sorted set all the hackers born between 1940 and 1960:

$ redis-cli zremrangebyscore hackers 1940 1960
(integer) 2

zremrangebyscore is not the best command name, but it can be very useful, and it returns the number of removed elements.
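
Range queries over data that is already sorted come down to two binary searches. The Python sketch below illustrates this with the standard `bisect` module on a plain list of (score, member) pairs. The function names mirror the Redis commands for readability, but they are hypothetical helpers, not client calls.

```python
import bisect

# (score, member) pairs kept in score order, as a sorted set would keep them.
hackers = sorted([
    (1940, "Alan Kay"), (1953, "Richard Stallman"), (1965, "Yukihiro Matsumoto"),
    (1916, "Claude Shannon"), (1969, "Linus Torvalds"), (1912, "Alan Turing"),
])


def _bounds(pairs, low, high):
    """Indexes delimiting the pairs whose score lies in [low, high]."""
    scores = [score for score, _ in pairs]
    return bisect.bisect_left(scores, low), bisect.bisect_right(scores, high)


def zrangebyscore(pairs, low, high):
    lo, hi = _bounds(pairs, low, high)
    return [member for _, member in pairs[lo:hi]]


def zremrangebyscore(pairs, low, high):
    lo, hi = _bounds(pairs, low, high)
    del pairs[lo:hi]
    return hi - lo  # number of removed elements


born_up_to_1950 = zrangebyscore(hackers, float("-inf"), 1950)
removed = zremrangebyscore(hackers, 1940, 1960)
remaining = [member for _, member in hackers]
```

Using `bisect_left` for the lower bound and `bisect_right` for the upper bound is what makes both extremes inclusive.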

Back to the Reddit example

Finally, let's return to the Reddit example. We now have a decent plan for generating the home page with sorted sets. Use one sorted set to hold the news from the last few days (using zremrangebyscore from time to time to delete old news). A background job fetches all the elements from that sorted set, computes a score for each based on user votes and the age of the news, and then populates the reddit.home.page sorted set with the news IDs and their associated scores. To show the home page we only need a lightning-fast call to zrange.

From time to time we also remove old news from the reddit.home.page sorted set, so that the system always works against a limited set of news.

Updating the scores of a sorted set

One last tip before ending this guide. The scores of a sorted set can be updated at any time. Calling zadd on an element that is already in the sorted set updates its score (and its position), with O(log(N)) time complexity. Sorted sets are therefore suitable even when there are a great many updates.

This guide is far from complete. It covers only the basics of Redis. For more information, see the command reference.

Thank you for reading. Salvatore.
