Three unique counting methods for Redis

Source: Internet
Author: User
This article introduces three ways to implement unique counting in Redis, based on sets, bits, and HyperLogLog.

Unique counting is a very common feature in website systems: for example, counting the number of unique visitors (UV) to a site per day. The problem is common, but it can be hard to solve. First, the amount of data to count may be large; a big site can be visited by millions of people every day. Second, we usually want to extend the counting dimensions: in addition to daily UV, we may also want weekly or monthly UV, which complicates the computation.

In a relational database, unique counting is done with SELECT COUNT(DISTINCT ...). This is very simple, but when the data set is large the statement runs very slowly. Another problem with relational databases is that insert performance is not high.
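As a baseline for comparison, the relational approach can be sketched with Python's built-in sqlite3 module. The table name visits and the columns day and user_id are assumptions made for this illustration; the original does not specify a schema.

```python
import sqlite3

# In-memory database so the sketch runs without any setup.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per page visit.
conn.execute("CREATE TABLE visits (day TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO visits (day, user_id) VALUES (?, ?)",
    [
        ("2024-01-01", 1),
        ("2024-01-01", 1),  # repeat visit by the same user
        ("2024-01-01", 2),
        ("2024-01-02", 2),
        ("2024-01-02", 3),
    ],
)

def unique_visitors(connection, day):
    """Daily UV: distinct user_id values recorded for one day."""
    row = connection.execute(
        "SELECT COUNT(DISTINCT user_id) FROM visits WHERE day = ?", (day,)
    ).fetchone()
    return row[0]

def unique_visitors_between(connection, first_day, last_day):
    """UV over a date range: users are deduplicated across days as well."""
    row = connection.execute(
        "SELECT COUNT(DISTINCT user_id) FROM visits WHERE day BETWEEN ? AND ?",
        (first_day, last_day),
    ).fetchone()
    return row[0]
```

The query itself is short, but the database has to deduplicate every matching row at query time, which is where the cost grows with data size.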

Redis handles this type of counting problem easily: it is faster than a relational database and consumes fewer resources. It even provides three different methods.

1. set-based

A Redis set stores a collection of unique elements. It can quickly determine whether an element exists in the set, quickly return the number of elements in the set, and be merged with other sets into a new set. The commands involved are as follows:

SISMEMBER key member    # check whether member exists in the set
SADD key member         # add member to the set
SCARD key               # get the number of elements in the set

The set-based method is simple, effective, exact, widely applicable, and easy to understand. Its disadvantage is that it consumes a lot of resources (though far less than a relational database). If the number of elements is large (for example, hundreds of millions of records), the memory consumption becomes prohibitive.
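A minimal sketch of the set-based method follows. The helper functions expect a client object exposing sadd, scard and sunionstore, which is how the redis-py client names these commands; the key layout uv:<day> and the InMemorySets class are assumptions for this illustration. InMemorySets is only a stand-in so the sketch runs without a Redis server, not a real client.

```python
class InMemorySets:
    """Hypothetical stand-in that mimics a few Redis set commands in memory."""

    def __init__(self):
        self._sets = {}

    def sadd(self, name, *values):
        # Like SADD: returns how many of the values were newly added.
        members = self._sets.setdefault(name, set())
        before = len(members)
        members.update(values)
        return len(members) - before

    def scard(self, name):
        # Like SCARD: a missing key counts as an empty set.
        return len(self._sets.get(name, ()))

    def sismember(self, name, value):
        return value in self._sets.get(name, ())

    def sunionstore(self, dest, keys):
        # Like SUNIONSTORE: store the union under dest, return its size.
        union = set().union(*(self._sets.get(k, set()) for k in keys))
        self._sets[dest] = union
        return len(union)


def uv_key(day):
    """Assumed key layout: one set per day."""
    return f"uv:{day}"

def record_visit(client, day, user_id):
    """Add the visitor to the day's set; True if this is their first visit that day."""
    return client.sadd(uv_key(day), user_id) == 1

def daily_uv(client, day):
    return client.scard(uv_key(day))

def range_uv(client, dest, days):
    """Weekly or monthly UV: merge the daily sets into dest and count the result."""
    return client.sunionstore(dest, [uv_key(d) for d in days])
```

Because every member is stored in full, the count is exact, and memory use grows with the number of distinct visitors.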

2. bit-based

Redis bit operations offer a far more memory-compact alternative to sets: the presence of an element is stored as a single bit, 1 or 0. For example, to count a website's unique visitors, use user_id as the bit offset and set that bit to 1 to mark a visit. 1 MB of space is then enough to record one day's visits for more than 8 million users (1 MB is 8,388,608 bits). The commands involved are as follows:

SETBIT key offset value                  # set the bit at offset
GETBIT key offset                        # get the bit at offset
BITCOUNT key [start end]                 # count the bits set to 1
BITOP operation destkey key [key ...]    # merge bitmaps

The bit-based method consumes far less space than the set-based method. However, it requires that each element can be mapped to a bit offset, so it applies in far fewer cases. In addition, the space consumed depends on the largest offset, not on the number of elements counted; if the largest offset is large, the memory consumption is also considerable.
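The bit-based method and its memory behaviour can be illustrated with a small in-memory model. The Bitmap class and bitop_or function below are hypothetical names for this sketch; they imitate the semantics of SETBIT, GETBIT, BITCOUNT and BITOP OR on a byte array and do not talk to Redis.

```python
class Bitmap:
    """In-memory model of a Redis string used as a bitmap (not a Redis client)."""

    def __init__(self):
        self.data = bytearray()

    @staticmethod
    def _locate(offset):
        # Redis numbers bits from the most significant bit of the first byte.
        return offset >> 3, 7 - (offset & 7)

    def setbit(self, offset, value):
        """Set the bit and return its previous value, as SETBIT does."""
        byte_index, bit = self._locate(offset)
        if byte_index >= len(self.data):
            # The string grows with zero bytes up to the byte holding the offset.
            self.data.extend(bytes(byte_index + 1 - len(self.data)))
        old = (self.data[byte_index] >> bit) & 1
        if value:
            self.data[byte_index] |= 1 << bit
        else:
            self.data[byte_index] &= ~(1 << bit) & 0xFF
        return old

    def getbit(self, offset):
        byte_index, bit = self._locate(offset)
        if byte_index >= len(self.data):
            return 0
        return (self.data[byte_index] >> bit) & 1

    def bitcount(self):
        """Number of bits set to 1, i.e. the unique count."""
        return sum(bin(byte).count("1") for byte in self.data)


def bitop_or(*bitmaps):
    """Merge daily bitmaps, e.g. into a weekly one, like BITOP OR."""
    merged = Bitmap()
    merged.data = bytearray(max((len(b.data) for b in bitmaps), default=0))
    for b in bitmaps:
        for i, byte in enumerate(b.data):
            merged.data[i] |= byte
    return merged


# A single visitor whose user_id is 8,388,607 already costs a full 1 MB.
sparse = Bitmap()
sparse.setbit(8 * 1024 * 1024 - 1, 1)
sparse_size_in_bytes = len(sparse.data)
```

The last lines show the caveat from the text: the size is set by the largest offset, so sparse or very large IDs waste space even when only a few bits are set.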

3. HyperLogLog-based

Exact unique counting over a very large amount of data is hard, but if an approximation is enough, computer science offers many efficient algorithms. HyperLogLog is one of the best known. In Redis it needs only about 12 KB of memory per key to count hundreds of millions of unique elements, with the error kept at around 1% (the Redis documentation gives a standard error of 0.81%). The commands involved are as follows:

PFADD key element [element ...]    # add elements
PFCOUNT key [key ...]              # get the approximate count

This counting method is remarkable, and I have not fully understood it myself. If you are interested, the related articles are worth studying in depth.
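For readers who want a feel for the idea, here is a simplified textbook HyperLogLog in Python. It is a sketch, not the Redis implementation: the TinyHLL class, the SHA-1 based hash, and the one-byte-per-register layout are choices made for this illustration. It does use 16384 registers, the same number as Redis; Redis packs each register into 6 bits, which is where the 12 KB figure comes from (16384 x 6 bits = 12288 bytes).

```python
import hashlib
import math

P = 14                 # bits of the hash used to pick a register
M = 1 << P             # 16384 registers, the same number Redis uses
REST_BITS = 64 - P     # remaining hash bits used for the rank

def _hash64(item):
    """64-bit hash of a string (SHA-1 prefix; chosen here for simplicity)."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

class TinyHLL:
    """Simplified HyperLogLog; one byte per register, so 16 KB instead of 12 KB."""

    def __init__(self):
        self.registers = bytearray(M)

    def add(self, item):
        h = _hash64(item)
        index = h >> REST_BITS                 # top P bits select the register
        rest = h & ((1 << REST_BITS) - 1)      # the other 50 bits
        # Rank: 1-based position of the leftmost 1 bit in the remainder.
        rank = REST_BITS - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / M)
        harmonic_sum = sum(2.0 ** -r for r in self.registers)
        estimate = alpha * M * M / harmonic_sum
        zeros = self.registers.count(0)
        if estimate <= 2.5 * M and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = M * math.log(M / zeros)
        return round(estimate)

if __name__ == "__main__":
    hll = TinyHLL()
    for i in range(1000):
        hll.add(f"user:{i}")
        hll.add(f"user:{i}")   # duplicates do not change the registers
    print(hll.count())         # approximate; expected to be close to 1000
```

Each register only remembers the largest rank it has seen, so adding the same element again changes nothing, and the memory used stays fixed no matter how many elements are added.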

Each of the three unique counting methods provided by Redis has its own advantages and disadvantages, and together they cover the counting requirements of different scenarios.
