3 ways to implement unique counting with Redis

Source: Internet
Author: User

Reproduced in: http://www.itxuexiwang.com/a/shujukujishu/redis/2016/0216/121.html?1455855118

Unique counting is a very common feature in website systems; for example, a site needs to count the number of unique visitors (UV) it receives each day. Counting is a common problem, but it can be quite complex to solve. First, the amount to be counted may be large: a big site has millions of visitors per day, which is a considerable amount of data. Second, you usually want to extend the count along other dimensions, such as a weekly or monthly UV in addition to the daily UV, which makes the calculation more complex.

In a system backed by a relational database, the usual way to implement a unique count is SELECT COUNT(DISTINCT <item_id>). This is very simple, but the statement is slow to execute when the amount of data is large. Another problem with relational databases is that insert performance is not high.
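To make the relational baseline concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `visits` table, its columns, and the `daily_uv_sql` helper are hypothetical names chosen for illustration; a real site would have its own schema.

```python
import sqlite3

# Hypothetical schema: one row per page view, so a user can appear many times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (day TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO visits (day, user_id) VALUES (?, ?)",
    [("2016-02-16", 1), ("2016-02-16", 2), ("2016-02-16", 1), ("2016-02-17", 3)],
)


def daily_uv_sql(conn, day):
    """Count distinct visitors for one day; the database must examine every matching row."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT user_id) FROM visits WHERE day = ?", (day,)
    ).fetchone()
    return row[0]


print(daily_uv_sql(conn, "2016-02-16"))  # user 1 visited twice but is counted once
```

Every page view costs an insert, and every count scans the day's rows, which is where the cost described above comes from.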

Redis is well suited to this kind of counting problem: it is faster than a relational database, consumes fewer resources, and even offers 3 different approaches.

1. Based on set

A Redis set holds a collection of unique elements. It lets you quickly determine whether an element exists in the set, quickly get the number of elements in the set, and merge sets into a new set. The following commands are involved:

SISMEMBER key member    # check whether member exists in the set
SADD key member         # add member to the set
SCARD key               # get the number of elements in the set

The set-based method is simple and effective: it counts exactly, applies widely, and is easy to understand. Its disadvantage is relatively high resource consumption (though still much less than a relational database); if the number of elements is very large (for example, counts in the hundreds of millions or more), the memory consumption is alarming.
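The set-based approach might be sketched as follows, assuming the redis-py client. The `uv:<day>` key scheme and the helper names are hypothetical, and `range_uv_set` uses SUNIONSTORE, which is not in the command list above but is one way to do the set merge mentioned earlier.

```python
def record_visit(client, day, user_id):
    """Add the user to the day's set; True means this was their first visit that day."""
    # SADD returns the number of members actually added (0 if already present).
    return client.sadd(f"uv:{day}", user_id) == 1


def has_visited(client, day, user_id):
    """Check whether the user is already in the day's set (SISMEMBER)."""
    return bool(client.sismember(f"uv:{day}", user_id))


def daily_uv_set(client, day):
    """Exact number of unique visitors for one day (SCARD)."""
    return client.scard(f"uv:{day}")


def range_uv_set(client, days, dest="uv:tmp:range"):
    """Merge several daily sets into `dest` and return the size of the union."""
    # SUNIONSTORE stores the union and returns its cardinality.
    return client.sunionstore(dest, [f"uv:{d}" for d in days])
```

With a running server, `client` would typically be a `redis.Redis(...)` instance; any object exposing the same four methods works, which keeps the helpers easy to test.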

2. Bit-based

Redis bitmaps can implement a count that is far more memory-compact than a set, because each element's state is stored as a single bit, 1 or 0. For a site's unique visitor count, for example, you can use user_id as the bit offset and set the bit to 1 when that user visits; 1 MB of space is enough to record one day's visits for more than 8 million users. The following commands are involved:

SETBIT key offset value                   # set the bit at offset
GETBIT key offset                         # get the bit at offset
BITCOUNT key [start end]                  # count the bits set to 1
BITOP operation destkey key [key ...]     # merge bitmaps
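A minimal sketch of the bitmap approach, again assuming the redis-py client. The `uvbits:<day>` key scheme and helper names are hypothetical, and user_id is assumed to be a small non-negative integer that can serve directly as the bit offset.

```python
def mark_visit_bit(client, day, user_id):
    """Set the user's bit for the day; True means the bit was previously 0."""
    # SETBIT returns the bit's previous value.
    return client.setbit(f"uvbits:{day}", user_id, 1) == 0


def visited_bit(client, day, user_id):
    """Check the user's bit for the day (GETBIT)."""
    return client.getbit(f"uvbits:{day}", user_id) == 1


def daily_uv_bits(client, day):
    """Number of bits set to 1, i.e. unique visitors for the day (BITCOUNT)."""
    return client.bitcount(f"uvbits:{day}")


def range_uv_bits(client, days, dest="uvbits:tmp:range"):
    """OR the daily bitmaps together and count users seen on any of the days."""
    client.bitop("OR", dest, *[f"uvbits:{d}" for d in days])
    return client.bitcount(dest)
```

Using "AND" instead of "OR" in BITOP would count users who visited on every one of the given days, which is a common variation.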

The bit-based method uses far less space than a set, but it requires that each element map simply to a bit offset, so it applies to a much narrower range of cases. The space it consumes also depends on the largest offset rather than on the count itself, so memory consumption is still considerable if the largest offset is large.
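The dependence on the largest offset can be checked with a little arithmetic. The helper below is a hypothetical illustration that only computes the size of the bitmap string itself (one byte per 8 bits, up to the byte containing the highest offset) and ignores any per-key overhead.

```python
def bitmap_bytes(max_offset):
    """Bytes needed to hold a bit at max_offset (0-based): the byte index plus one."""
    return max_offset // 8 + 1


# 1 MB holds 8 bits per byte, so offsets 0 .. 8,388,607 fit exactly.
print(1024 * 1024 * 8)              # 8388608
print(bitmap_bytes(8_388_607))      # 1048576

# A single visitor with a very large id forces a large bitmap,
# no matter how few users were actually counted.
print(bitmap_bytes(4_000_000_000))  # 500000001
```

So a day with one visitor whose id is 4 billion costs roughly 500 MB, while a day with 8 million visitors whose ids are densely packed costs 1 MB.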

3. Based on HyperLogLog

Exact unique counts over very large data volumes are hard to achieve, but if an approximate count is enough, computer science offers many efficient algorithms. HyperLogLog counting is one of the best known: it can count billions of unique elements using only about 3 KB of memory, with the error kept to around 1%. (Redis's own implementation uses at most 12 KB per key, with a standard error of 0.81%.) The following commands are involved:

PFADD key element [element ...]    # add elements
PFCOUNT key [key ...]              # get the approximate count
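A minimal sketch of the HyperLogLog approach, assuming the redis-py client. The `uvhll:<day>` key scheme and helper names are hypothetical. Unlike the set and bitmap versions, the counts returned here are approximate, and there is no way to ask whether a particular user was counted.

```python
def record_visit_hll(client, day, user_id):
    """Add the user to the day's HyperLogLog (PFADD)."""
    # PFADD returns 1 if the internal registers changed, otherwise 0.
    return client.pfadd(f"uvhll:{day}", user_id)


def daily_uv_hll(client, day):
    """Approximate number of unique visitors for one day (PFCOUNT)."""
    return client.pfcount(f"uvhll:{day}")


def range_uv_hll(client, days):
    """Approximate unique visitors across several days."""
    # PFCOUNT with several keys returns the approximate cardinality of their union.
    return client.pfcount(*[f"uvhll:{d}" for d in days])
```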

This counting method is quite remarkable. I have not fully understood it myself; interested readers can study the related articles in depth.
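For readers who want a feel for the idea, here is a simplified, textbook-style HyperLogLog in pure Python. It is a sketch for illustration only and not Redis's actual implementation, which uses more registers, compact encodings, and additional bias corrections. The class name `TinyHyperLogLog`, the choice of SHA-1 as the hash, and the default of 4096 registers are all assumptions made here. The core trick: hash each item, use the first bits of the hash to pick a register, and remember in that register the longest run of leading zeros seen in the remaining bits. Long runs are rare, so the registers reveal roughly how many distinct hashes have passed through.

```python
import hashlib
import math


class TinyHyperLogLog:
    def __init__(self, p=12):
        # p index bits give m = 2**p registers; the alpha constant below assumes m >= 128.
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from the item's string form.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                  # first p bits select a register
        rest = h & ((1 << width) - 1)     # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in `rest`
        # (width + 1 if all remaining bits are zero).
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        # Raw estimate: bias constant times a harmonic mean of 2**register.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        est = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            est = self.m * math.log(self.m / zeros)
        return round(est)


hll = TinyHyperLogLog()
for user_id in range(100_000):
    hll.add(user_id)
    hll.add(user_id)  # duplicates never change the registers
print(hll.count())    # an estimate close to 100000, not an exact count
```

With 4096 registers of 6 bits each, a packed version of this structure would need about 3 KB (the Python list used here takes more), and the expected standard error is about 1.04 / sqrt(4096), or roughly 1.6%.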

Each of these three unique counting methods provided by Redis has its own strengths and weaknesses, and together they cover the counting needs of different situations.
